It’s just data

IRI support

I didn’t realize that James Holderness had stealthily launched a blog.  That’s sufficient reason in my book to add IRI support to Planet and UFP (note: this support is only available if Python 2.3 or later is used).
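
As a rough illustration of what IRI support involves (a sketch in modern Python 3, not the code that went into Planet or UFP): the host name is converted to its ASCII form with Python's built-in `idna` codec, and any other non-ASCII characters are percent-encoded as UTF-8. The helper name `iri_to_uri` and the `SAFE` constant are made up for this example, and IPv6 literals are not handled.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left unescaped; "%" is kept so that existing
# percent-escapes are not encoded a second time.
SAFE = "/?:@!$&'()*+,;=%~-._"


def iri_to_uri(iri):
    """Hypothetical helper: map an IRI to an equivalent ASCII-only URI."""
    parts = urlsplit(iri)
    netloc = parts.netloc
    if parts.hostname:
        # The idna codec punycodes each non-ASCII label of the host name.
        netloc = parts.hostname.encode("idna").decode("ascii")
        if parts.username is not None:
            userinfo = quote(parts.username, safe="")
            if parts.password is not None:
                userinfo += ":" + quote(parts.password, safe="")
            netloc = userinfo + "@" + netloc
        if parts.port is not None:
            netloc += ":" + str(parts.port)
    # quote() percent-encodes non-ASCII characters as UTF-8 bytes.
    return urlunsplit((
        parts.scheme,
        netloc,
        quote(parts.path, safe=SAFE),
        quote(parts.query, safe=SAFE),
        quote(parts.fragment, safe=SAFE),
    ))


print(iri_to_uri("http://www.詹姆斯.com/"))
```

For the address linked above, this should print the same punycode form that James Snell quotes in the comments.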

Oh, and James, cute use of internal entities.


Interesting. I followed the link in your entry from Vienna (my OS X aggregator), and it worked fine. In Safari, it displays as an IRI, and that worked fine too.

BUT, when I go to subscribe to the feed in Vienna, I have a problem. If I follow or cut-n-paste the link from Safari, it’ll be an IRI, and contain non-ASCII characters. Feeding that to Vienna results in an error (I assume because it doesn’t yet understand IRIs as input).

When I look at this blog entry in Vienna, it displays the hint as a URI (with punycode), but when I cut it, it shows up on the clipboard as an IRI. Which Vienna doesn’t understand.

Viewing source on your blog entry and cutting from there doesn’t do any good, because the domain name is entity-encoded (which neither a URI nor an IRI parser will understand).
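
To make that concrete, here is a small Python sketch of the extra decoding step. The numeric character references in `raw` are an assumption about how the markup might be written (the entry's actual source may use a different form). `html.unescape` from the standard library turns them back into characters, after which the result is an ordinary IRI.

```python
from html import unescape
from urllib.parse import urlsplit

# Assumed form of the entity-encoded link as it might appear in page source.
raw = "http://www.&#35449;&#22982;&#26031;.com/"

# A URI/IRI parser treats the references as literal text; the "#" even
# starts a fragment, so the host name comes out truncated.
print(urlsplit(raw).hostname)

# Decoding the character references first yields the real IRI.
iri = unescape(raw)
print(iri)
print(urlsplit(iri).hostname)
```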

It would help if James' site used an absolute URI to point to its feed — then anyone could subscribe. Until IRI-aware tools are more prevalent, it’s probably the safest thing to do.

Posted by Mark Nottingham at

P.S. I was eventually able to subscribe by feeding his domain name into the online IDN conversion tool.

Posted by Mark Nottingham at

Yeah, NNW barfed on the link to his blog actually.  Instead of a link to James' blog, when it got to <a href="http://www.詹姆斯.com/">James Holderness</a>, it gave a link to http://www.intertwingly.net/blog/ instead for whatever reason.

Posted by Bob Aman at

Even more fascinating, it resolves the link from James' first entry (I just subscribed) to http://ranchero.com/blog/2006/05/launch.  Wild.

Posted by Bob Aman at

I did some more testing, and it seems like Vienna does the right thing when you have Safari dispatch the subscription directly to it; the only problem is when you cut-and-paste the URI.

Considering that cut-and-pasting URIs is one of the big things that makes them cool, it seems like IRI-aware software should convert IRIs to URIs when cutting...

Posted by Mark Nottingham at

Meanwhile, James' feed totally kicks FeedTools' butt.  Mainly because Ruby’s URI class kinda sucks.  Is there a good C library for parsing/joining/creating URIs that I can wrap instead that won’t totally barf on things like:

../feeds/atom
http://user:password@gmail.google.com/gmail/feed/atom/
file:///Users/sporkmonger/Projects/Ruby/Libraries/feedtools/test.atom
urn:uuid:b202fb8c-f32a-11da-bdcc-00112486f05c
tag:intertwingly.net,2004:2300
http://詹姆斯.com/
http://science_boy.blogspot.com/atom.xml

Posted by Bob Aman at
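
For comparison only (this is Python's standard `urllib.parse`, not the C library asked about), here is a sketch of how those same inputs split and join. The base URI `http://example.org/blog/entry/` is invented for the relative-reference case.

```python
from urllib.parse import urljoin, urlsplit

SAMPLES = [
    "http://user:password@gmail.google.com/gmail/feed/atom/",
    "file:///Users/sporkmonger/Projects/Ruby/Libraries/feedtools/test.atom",
    "urn:uuid:b202fb8c-f32a-11da-bdcc-00112486f05c",
    "tag:intertwingly.net,2004:2300",
    "http://詹姆斯.com/",
    "http://science_boy.blogspot.com/atom.xml",
]

# Relative reference resolved against a made-up base URI.
BASE = "http://example.org/blog/entry/"
resolved = urljoin(BASE, "../feeds/atom")
print(resolved)

for sample in SAMPLES:
    parts = urlsplit(sample)
    # hostname is None when there is no authority (urn:, tag:, file:///).
    print(parts.scheme, parts.hostname, parts.path)
```

Note that `urlsplit` only splits; it does not validate host names or convert the IDN to punycode.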

Oh, and James, cute use of internal entities.

Oh wow, that is really neat. Anyone have any idea of how well this works in aggregators which don’t use a proper XML parser? After some thinking it actually seems much less risky than diddling namespaces, but I don’t want my feed to break due to use of “exotic” XML features again, even though I’d love to exploit anything that will let me squeeze my feeds and make my main one easier to edit.

Posted by Aristotle Pagaltzis at

In Firefox 1.5.0.4 on my Mac, I set network.IDN_show_punycode first to true and then back to false.  Neither value seemed to have an effect on the display of the domain--it’s always punycode.  The site works with both forms of the domain, however.  I suppose the past security uproar is probably preventing me from seeing anything but the punycode, regardless of my settings.

Posted by Scott Johnson at

Seems to work in Thunderbird 1.5, Firefox 1.5, and Firefox trunk.

Posted by Robert Sayre at

Good grief, Java’s IRI/URI support sucks.  It doesn’t grok “http://www.詹姆斯.com/” at all. However, “http://www.xn--8ws00zhy3a.com/” works just fine.

Posted by James Snell at

Anyone have any idea of how well this works in aggregators which don’t use a proper XML parser?

I haven’t tested my feed itself, but back when I was doing Atom content tests I threw in a couple of internal entities for a laugh. There were slightly more working aggregators than there were failures, but not by much, and I think all the online aggregators failed (at least of those I tested). In most cases the failure will probably be relatively benign, but IE7 flat out refuses to subscribe to anything containing a DTD.
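
A sketch of the difference, using an invented feed (the entity name `blog` and the example.org addresses are assumptions, not taken from the feed under discussion): a real XML parser expands an entity declared in the internal DTD subset, while a tool that scrapes the text with a regular expression sees only the unexpanded reference.

```python
import re
import xml.etree.ElementTree as ET

# Invented Atom feed that uses an internal entity as an abbreviation.
FEED = """<!DOCTYPE feed [
  <!ENTITY blog "http://example.org/blog">
]>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example</title>
  <id>&blog;/</id>
  <link rel="self" href="&blog;/feed.atom"/>
</feed>
"""

ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# A conforming parser expands &blog; in both text and attribute values.
root = ET.fromstring(FEED)
parsed_id = root.find("atom:id", ATOM).text
parsed_href = root.find("atom:link", ATOM).get("href")

# A regex "parser" has no idea what &blog; means and returns it verbatim.
scraped_id = re.search(r"<id>(.*?)</id>", FEED).group(1)

print(parsed_id)
print(parsed_href)
print(scraped_id)
```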

In Firefox 1.5.0.4 on my Mac, I set network.IDN_show_punycode first to true and then back to false.  Neither value seemed to have an effect on the display of the domain--it’s always punycode.

The trick is to set network.IDN.whitelist.com to true. Firefox only whitelists TLDs that have a policy in place for dealing with homographs. It seems .com is not on their list yet. Of course I wouldn’t recommend that anyone mess with the default settings unless they fully understand the implications and security risks.

Posted by James Holderness at

Planet Musings

Introducing Planet Musings.... [more]

Trackback from Musings

at

This is certainly an interesting use of an IDN. BTW I’m just scouring the web for real-life examples of IDNs to add to my new IDN search engine and found your post.

Thanks!

wil.

p.s. The IDNSearch permalinks are proper IRIs, let’s see how your blog handles it :-P

Posted by William Tan at
