URI Equivalence
In researching how Atom and the FeedValidator should handle URI equivalence, I took a look at how language environments with built-in URI classes implement their equality methods.
testuri.java produces:
http://example.com/ http://example.com false
HTTP://example.com/ http://example.com/ true
http://example.com/ http://example.com:/ true
http://example.com/ http://example.com:80/ false
http://example.com/ http://Example.com/ true
http://example.com/~smith/ http://example.com/%7Esmith/ false
http://example.com/~smith/ http://example.com/%7esmith/ false
http://example.com/%7Esmith/ http://example.com/%7esmith/ true
http://example.com/%C3%87 http://example.com/C%CC%A7 false
testuri.cs produces:
http://example.com/ http://example.com True
HTTP://example.com/ http://example.com/ True
http://example.com/ http://example.com:/ True
http://example.com/ http://example.com:80/ True
http://example.com/ http://Example.com/ True
http://example.com/~smith/ http://example.com/%7Esmith/ True
http://example.com/~smith/ http://example.com/%7esmith/ True
http://example.com/%7Esmith/ http://example.com/%7esmith/ True
http://example.com/%C3%87 http://example.com/C%CC%A7 True
Update: testuri.pl produces:
http://example.com/ http://example.com 1
HTTP://example.com/ http://example.com/ 1
http://example.com/ http://example.com:/ 1
http://example.com/ http://example.com:80/ 1
http://example.com/ http://Example.com/ 1
http://example.com/~smith/ http://example.com/%7Esmith/ 1
http://example.com/~smith/ http://example.com/%7esmith/ 1
http://example.com/%7Esmith/ http://example.com/%7esmith/ 1
http://example.com/%C3%87 http://example.com/C%CC%A7 0
Gack. Just hypothetically, if someone wanted to write a carefully done URI comparator, I suppose the cleanest thing would be to subclass URI... maybe not. Since it doesn't really have any exposed fields that seem useful, you might just as well write a URIEquivalenceChecker class with a single static method taking two URIs and some way of expressing how hard you want to try...
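One hypothetical shape for such a checker is sketched below in Java. Only the class name comes from the comment above; the Effort enum, its two levels, and the fold helper are assumptions made for illustration, and the sketch deliberately ignores default ports and percent-encoding.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical sketch: the enum, its levels, and the helper are illustrative only.
final class URIEquivalenceChecker {

    /** How hard to try before declaring two URIs different. */
    enum Effort {
        EXACT,       // character-for-character comparison
        CASE_FOLDED  // additionally ignore case in the scheme and host
    }

    private URIEquivalenceChecker() {}

    public static boolean equivalent(URI a, URI b, Effort effort) {
        if (effort == Effort.EXACT) {
            return a.toString().equals(b.toString());
        }
        return fold(a).equals(fold(b));
    }

    // Rebuilds the URI text with a lower-cased scheme and host; the user info,
    // path, query, and fragment are copied unchanged because they may be
    // case-sensitive.
    private static String fold(URI u) {
        StringBuilder sb = new StringBuilder();
        if (u.getScheme() != null) {
            sb.append(u.getScheme().toLowerCase(Locale.ROOT)).append(':');
        }
        if (u.isOpaque()) {
            // e.g. mailto: or urn: URIs, which have no authority or path
            sb.append(u.getRawSchemeSpecificPart());
        } else {
            if (u.getHost() != null) {
                sb.append("//");
                if (u.getRawUserInfo() != null) {
                    sb.append(u.getRawUserInfo()).append('@');
                }
                sb.append(u.getHost().toLowerCase(Locale.ROOT));
                if (u.getPort() != -1) {
                    sb.append(':').append(u.getPort());
                }
            } else if (u.getRawAuthority() != null) {
                sb.append("//").append(u.getRawAuthority());
            }
            if (u.getRawPath() != null) {
                sb.append(u.getRawPath());
            }
            if (u.getRawQuery() != null) {
                sb.append('?').append(u.getRawQuery());
            }
        }
        if (u.getRawFragment() != null) {
            sb.append('#').append(u.getRawFragment());
        }
        return sb.toString();
    }
}
```

Under these assumptions, comparing HTTP://example.com/ with http://example.com/ is false at Effort.EXACT and true at Effort.CASE_FOLDED; further levels could layer scheme-specific rules on top.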
Posted by Tim Bray at
There are a finite number of URI schemes. It should be possible to write a URI compare module/class that takes the quirks of each scheme into account.
Now that I've pretty much run out of useful features to implement for the Universal Feed Parser, maybe I'll work on this next.
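A per-scheme rule table might start with something as small as default ports. The following Java sketch is only an illustration of that idea: SchemeRules, effectivePort, and samePort are hypothetical names, and the table lists just three well-known defaults (http 80, https 443, ftp 21).

```java
import java.net.URI;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of one scheme-specific quirk: default ports.
final class SchemeRules {

    // Assumption: a tiny hand-maintained table; a real module would cover more schemes.
    private static final Map<String, Integer> DEFAULT_PORTS =
            Map.of("http", 80, "https", 443, "ftp", 21);

    private SchemeRules() {}

    /** Returns the explicit port, else the scheme's default, else -1 if unknown. */
    public static int effectivePort(URI u) {
        if (u.getPort() != -1) {
            return u.getPort();
        }
        if (u.getScheme() == null) {
            return -1;
        }
        String scheme = u.getScheme().toLowerCase(Locale.ROOT);
        return DEFAULT_PORTS.getOrDefault(scheme, -1);
    }

    /** True when both URIs resolve to the same port under the table above. */
    public static boolean samePort(URI a, URI b) {
        return effectivePort(a) == effectivePort(b);
    }
}
```

With this table, http://example.com/ and http://example.com:80/ resolve to the same port, while a scheme missing from the table simply falls back to -1.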
Posted by Mark at
The "irc:" scheme does not appear in the assigned list. That amazed me; I've always used it in Mozilla, and it can be found in the wild.
Posted by Santiago Gala at
Perhaps I should have qualified: "registered" URI schemes. The irc:// scheme has had several drafts over the years, but never made it to final RFC status.
Posted by Mark at
A few more can be found here, though that list still does not include feed:, tag:, or urn:uuid, all of which can be found in the wild.
Makes me wonder if the registration process is broken to the point where the concept of registered schemes is becoming less and less relevant.
Posted by Sam Ruby at
Bookmarks
Some interesting recent reads: Bertrand on rhino shell; Stefano on Semantic web specs; Sam with some tests on URI equivalency (and more by clicking through the comments...); observations from Paul Graham via Brian... [more]
Trackback from Marc, himself, his blogs, and you reading them. at
Sam Ruby: In researching how Atom and the FeedValidator should handle URI equivalence, I took a look at how language environments with built in URI classes implement equality methods. Randy: Question? How would the Python URI type or class do? Is it...
Excerpt from RSS at
More URI Equivalence
Mark: Java is totally borked. Tim Bray: Gack. Just hypothetically, if someone wanted to write carefully-done URI comparator (in Java). Randy: Bookmarked by someone...
Excerpt from RSS at
Sticky, it’s not just data.
I think I underestimated how sticky Moveable Type is. Vendors love things that make their product sticky. If developers really appreciated this, software products would be even more sticky. Instead developers hate sticky; they call it things like...
Excerpt from Ascription is an Anathema to any Enthusiasm at
Preserving Identity
Mark Pilgrim's Identifying Atom article indirectly makes three assertions about what would be ideal in a syndication protocol with respect to ids, which I will paraphrase thus: IDs are mandatory the semantics on how/when IDs are to be generated and wh... [more]
Trackback from Sam Ruby at
Preserving Identity
Preserving Identity. Mark Pilgrim's Identifying Atom article indirectly makes three assertions about what would be ideal in a syndication protocol with respect to ids, which I will paraphrase thus: IDs are mandatory the semantics on how/when IDs are...
Excerpt from Tralla.org : Search : Debian at
GentleCMS Development Log: Part 4
I’ve been up to no good again. I keep changing my directory structure around. Nothing feels quite right, but each time I change it, it seems a bit better than the last time. In any case, my svn repository for this project is now something of a...
Excerpt from Sporkmonger at
In Java, equals() is very closely linked to hashCode().
If you encode before comparing in equals(), then you must also encode before calculating your hashCode() for storage and retrieval in HashMap, Hashtable, HashSet, etc. That leaves you with a pretty slow hashCode() function, especially if you have to resize the hash map. So URI may “suck” for performance reasons. If you don’t encode in hashCode() too, then [link] goes into your HashSet and [link] won’t get it out.
Note that this definitely means you never want URL to be a key in your HashMap! :) Wanna take bets on what hashCode() does to create a hash value? (I haven’t checked yet.)
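One way to keep equals() and hashCode() consistent without paying the normalization cost on every call is to normalize once, when the key is built. The Java sketch below assumes a hypothetical wrapper named NormalizedUriKey; its normalization (upper-casing the hex digits of percent-escapes) is a deliberately small stand-in for whatever rules you actually want.

```java
import java.net.URI;

// Hypothetical wrapper: normalizes once at construction so that equals() and
// hashCode() are computed from the same cached canonical string.
final class NormalizedUriKey {
    private final String canonical;
    private final int hash;

    NormalizedUriKey(URI uri) {
        this.canonical = upperCaseEscapes(uri.toString());
        this.hash = canonical.hashCode();
    }

    // Stand-in normalization (assumption): "%7e" becomes "%7E".
    // java.net.URI has already rejected malformed escapes, so the two
    // characters after '%' are known to be hex digits.
    private static String upperCaseEscapes(String s) {
        StringBuilder sb = new StringBuilder(s);
        for (int i = 0; i + 2 < sb.length(); i++) {
            if (sb.charAt(i) == '%') {
                sb.setCharAt(i + 1, Character.toUpperCase(sb.charAt(i + 1)));
                sb.setCharAt(i + 2, Character.toUpperCase(sb.charAt(i + 2)));
                i += 2;
            }
        }
        return sb.toString();
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof NormalizedUriKey
                && canonical.equals(((NormalizedUriKey) o).canonical);
    }

    @Override
    public int hashCode() {
        return hash; // cached, so rehashing a large map stays cheap
    }

    @Override
    public String toString() {
        return canonical;
    }
}
```

A key built from http://example.com/%7Esmith/ can then be found in a HashSet using a key built from http://example.com/%7esmith/, at the price of one extra string per key.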
Posted by anonymous at
GentleCMS Development Log: Part 4
I’ve been up to no good again. I keep changing my directory structure around. Nothing feels quite right, but each time I change it, it seems a bit better than the last time. In any case, my svn repository for this project is now something of a...
Excerpt from gentlecms on SWiK at
Bogtha on Mr. Gosling - why did you make URL equals suck?!?
Wow. That’s monumentally bad for any library, let alone the standard library. Whoever thought that would be a good idea? > Argh! This class sucks and I refuse to ever use it again. I’ll always use URI from now on since it doesn’t suck. Sorry,...
Excerpt from programming: what's new online at
Java is totally borked. The first seven examples are straight from section 3.2.3 of RFC 2616; they should all return true.
Test 8 was recently discussed on atom-syntax, and should also return true, although this is not explicitly clear from reading RFC 2396bis.
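The percent-encoding cases (tests 6 through 8) could be handled by normalizing escapes before comparing. Here is a minimal Java sketch, assuming that the characters safe to decode are letters, digits, '-', '.', '_' and '~'; PercentEncoding and normalizeEscapes are hypothetical names. It does not attempt any Unicode normalization, so the pair in test 9 still compares as different.

```java
import java.util.Locale;

// Hypothetical helper: decodes escapes of "unreserved" characters and
// upper-cases the hex digits of every escape it leaves in place.
final class PercentEncoding {

    private PercentEncoding() {}

    public static String normalizeEscapes(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()
                    && isHex(s.charAt(i + 1)) && isHex(s.charAt(i + 2))) {
                String hex = s.substring(i + 1, i + 3);
                int value = Integer.parseInt(hex, 16);
                if (isUnreserved(value)) {
                    out.append((char) value);           // "%7E" becomes "~"
                } else {
                    out.append('%').append(hex.toUpperCase(Locale.ROOT));
                }
                i += 3;
            } else {
                out.append(c);                           // copy everything else as-is
                i++;
            }
        }
        return out.toString();
    }

    private static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    // Assumption: the set of characters that never need escaping.
    private static boolean isUnreserved(int v) {
        return (v >= 'A' && v <= 'Z') || (v >= 'a' && v <= 'z') || (v >= '0' && v <= '9')
                || v == '-' || v == '.' || v == '_' || v == '~';
    }
}
```

After this step, http://example.com/~smith/, http://example.com/%7Esmith/ and http://example.com/%7esmith/ all become the same string, so a plain string comparison treats them as equal.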
Posted by Mark at