Ben Goodger: The problem with detecting feeds is that very many feeds are served with incorrect or overly generic Content-Types. Some are served as text/html which is clearly wrong, but others are served as application/xml or text/xml which is not incorrect, just not specific enough. We can’t attempt to parse every candidate Content-Type as a feed just to see if it is, since that would significantly impact our performance. We also can’t restrict ourselves to Feed types, since that would leave us not detecting a lot of feeds, and still be incorrect.
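To make the trade-off above concrete, here is a minimal sketch of sorting a Content-Type header into three buckets: types specific to feeds, generic XML types that would need a closer look, and everything else. The function names and bucket labels are made up for illustration, and the set of "feed" types is an assumption, not what any browser actually ships.

```python
# Hypothetical buckets; the exact membership of each set is an assumption.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}
GENERIC_XML_TYPES = {"application/xml", "text/xml"}


def media_type(header_value: str) -> str:
    """Return the bare media type, dropping parameters such as charset."""
    return header_value.split(";", 1)[0].strip().lower()


def classify_content_type(header_value: str) -> str:
    """Classify a Content-Type header value as 'feed', 'generic-xml', or 'other'."""
    mt = media_type(header_value)
    if mt in FEED_TYPES:
        return "feed"          # specific enough to trust without parsing
    if mt in GENERIC_XML_TYPES:
        return "generic-xml"   # not wrong, but not specific enough either
    return "other"             # includes text/html, which is wrong for a feed


if __name__ == "__main__":
    for value in ("application/atom+xml", "text/xml; charset=utf-8", "text/html"):
        print(value, "->", classify_content_type(value))
```

Only the "generic-xml" bucket would be a candidate for a cheap look at the content; parsing everything in "other" is the performance hit described above.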
Alright, that’s fine, but if we’re going to do this, I’d really like a convenient way of getting at the source of a feed without having Firefox always downloading it to my desktop. Anyone care to write an extension for Firefox that adds a “View Source of Link” option to the context menu for hyperlinks?
Bob: Just use “copy link location”, paste into a new tab, but prefix with “view-source:”. Works like a treat for me. (An extension would still be nice, though.)
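For anyone who would rather script the trick than type it, a minimal sketch of building the address follows. The helper name is hypothetical; it only builds the string to paste into the location bar and does not open anything.

```python
VIEW_SOURCE_PREFIX = "view-source:"


def view_source_url(link: str) -> str:
    """Prefix a copied link location with 'view-source:' (hypothetical helper)."""
    link = link.strip()
    # Avoid doubling the prefix if the link already carries it.
    if link.startswith(VIEW_SOURCE_PREFIX):
        return link
    return VIEW_SOURCE_PREFIX + link


if __name__ == "__main__":
    print(view_source_url("http://example.org/feed.xml"))
```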
That being said, there isn’t consensus on whether the roadblocks are insurmountable. I have recently talked about this exact issue with somebody within the IETF whom I trust, and he isn’t convinced that it is impossible.
As this moves forward, there will undoubtedly be additional circumstances that merit a warning, and some of the existing warnings may need to be adjusted or removed.
I would just like to note, you know, for posterity, that WordPress 2.0.2 serves all of its RSS 0.92 and 2.0 feeds (including comment feeds) as “Content-type: text/xml; charset=...”.
Since I’ve failed thus far to get anyone else to do some 'splainin, maybe you’ll know: why, in that detection heuristic you and Ben copied from IE7, is RSS 1.0 sniffed as “<rdf:RDF” and the RDF namespace URI?
Other than typos, how big is the set of documents with an rdf:RDF element which does not declare the RDF namespace?
And in contrast, how big is the set of documents with an element named feed, in either the default namespace or no namespace, which are not Atom?
That sure seems to me like three tests for RSS 1.0 where two would do, following one test for Atom where two would do.
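A minimal sketch of the two-tests-each version, to show what is being argued for. This is not the IE7 or Firefox code. The window size and the function name are hypothetical, and treating the RSS 1.0 namespace as the third test in the existing heuristic is an assumption, since only the “<rdf:RDF” and RDF namespace tests are named above.

```python
from typing import Optional

RSS10_NS = b"http://purl.org/rss/1.0/"      # RSS 1.0 namespace URI
ATOM_NS = b"http://www.w3.org/2005/Atom"    # Atom 1.0 namespace URI
SNIFF_WINDOW = 512                          # hypothetical: bytes to inspect


def sniff_rss1_or_atom(data: bytes) -> Optional[str]:
    """Guess 'rss1.0' or 'atom' from the start of a document, else None."""
    head = data[:SNIFF_WINDOW]
    # RSS 1.0: root element marker plus the RSS 1.0 namespace. The RDF
    # namespace check is skipped, on the argument that an rdf:RDF element
    # without it is rare outside typos.
    if b"<rdf:RDF" in head and RSS10_NS in head:
        return "rss1.0"
    # Atom: root element marker plus the Atom namespace, so a stray
    # element named feed in some other vocabulary is not misdetected.
    if b"<feed" in head and ATOM_NS in head:
        return "atom"
    return None


if __name__ == "__main__":
    print(sniff_rss1_or_atom(b'<feed xmlns="http://www.w3.org/2005/Atom">'))
    print(sniff_rss1_or_atom(b"<feed>"))
```

Plain substring checks like these are cheap, which matters given the performance concern at the top of the thread, but they can be fooled by comments or CDATA, so treat this as an illustration of the counting argument only.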
Phil, you know as well as I do that shortly after the euphoric “for web content interoperability, copying is good” stage passes, the heuristics will evolve and diverge, and inevitably the finger-pointing stage will quickly follow.
As you and I are following the one spec that is clear, we are safer than most from this.
As to the people who serve RSS feeds with the potentially unregisterable application/rss+xml MIME type, or serve Atom feeds with the not-specific-enough-to-be-useful application/xml MIME type: I don’t yet have the data to show that their ignoring the specs in this manner causes demonstrable harm, but when I do, I will update the Feed Validator accordingly.
I remember what stopped me before (my memory is rusty, but in my defense it was three years ago).
The RSS 2 spec is made available under the Creative Commons Attribution/Share Alike 1.0 license. Its terms include a stipulation that any derivative works must be published under an identical license; in particular “you may not offer or impose any terms on the Derivative Works that alter or restrict the terms of this License,” and “you must keep intact all copyright notices for the Work.”
Given that, my layman’s reading (as always, IANAL) was that RSS2 can’t be republished as an Internet-Draft or an RFC, at least without special dispensation from the copyright holders. I sent a few exploratory e-mails to both the CC folks and the Berkman folks, but that didn’t get anywhere.
If anyone can dig up better info than this (or better yet, get permission from Berkman), that would be great. I guess the other path would be to get permission from the RSS 1.0 authors to submit it as an I-D, or get them to submit it as a W3C Note.
Otherwise, application/rss+xml very well may be unregisterable.
I don’t understand why the spec text needs to be an RFC (instead of having minimal IETF language that references the format spec). Granted, application/msword may have slipped in at another time, but there are still new non-vnd media types for non-IETF formats.
They made concessions in the past for legacy formats, but the process has tightened up considerably since. I tried precisely that — referencing a widely-used, but not “stable” document — in 2003, and it didn’t pass muster. I do wonder if trying again now would have a different result.
BTW: strictly, it doesn’t need to be an RFC; it just needs to be “stable.” A W3C Note would do the trick, for example. Even getting an RFC isn’t that hard, really — if you have copyright.
I believe RSS x.x would be considered stable at this time, regardless of new efforts and some oddball offshoots.
I don’t think there’d be any benefit in “restating” those formats for the purpose of a mime type registration.
I think an RFC clearly stating the interoperable behavior of receiving a document with a mime type of ‘application/rss+xml’ would be possible, simply referencing the relevant format specifications and expected future development.
That all seems reasonable to us mortals, but it doesn’t meet the criteria of the IESG, at least in the past. This discussion (what constitutes a stable reference) has been happening in the IETF for a long time; there is still a general prohibition against URIs as normative references in RFCs, for example, because they’re unstable.
These are the folks who still publish in plain text, after all...
I’ve considered the media type registration system broken ever since mp4 files started surfacing and the video/mp4 type (and image/jp2 for that matter) stayed unregistered for no good reason (from the point of view of an outsider).
I think Apple got its type and creator code registration right. You get one when you ask for one. There’s no good reason for anyone to make one up for “private use”.
I think IANA should just have a first-come, first-served Web form for instant media type registration. Not letting people who want media types have them causes much more trouble than some useless registration rotting in a corner somewhere. If Apple can do it with a more constrained lexical space, so could IANA.
Andrew, I think Sam or Mark showed me that trick awhile back, not sure. In any case, I’m lazy and it’s way too much work to type that out. I want to just right-click. :-)