RSS progress and a big missed opportunity
Lots of really, really, really good progress on RSS. Now, I'd like to make a plea. Slow down.
RSS 0.94 added a number of elements without consideration as to what possibilities were out there once namespaces were added. Now it looks like namespaces will be added.
What RSS 2.0 needs now is a focus on simplicity and some serious deprecation. Strip it to the core. Then have two modes (just like HTML does)... a transitional mode which allows anybody to add any element they wish with or without namespaces, including the classic 0.91 ones like skipHours and the proposed 0.94 ones. And a strict mode in which the only additions permitted are ones that reside in namespaces.
Before somebody says that we don't have time for that... how long does it take to remove elements from a document?
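A rough sketch of how the two modes could be told apart in code. The core element list here is a hypothetical placeholder, since the post does not fix one, and the function names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: a hypothetical "core" vocabulary; the post does not pin down the exact list.
CORE = {"rss", "channel", "item", "title", "link", "description"}


def extra_elements(feed_xml):
    """Return, in document order, the non-core elements that carry no namespace."""
    root = ET.fromstring(feed_xml)
    extras = []
    for el in root.iter():
        # ElementTree spells namespaced tags as "{uri}local".
        if el.tag.startswith("{"):
            continue  # namespaced additions are allowed in both modes
        if el.tag not in CORE:
            extras.append(el.tag)
    return extras


def is_valid(feed_xml, mode="transitional"):
    """Transitional accepts any extra element; strict rejects un-namespaced extras."""
    if mode == "transitional":
        return True
    return not extra_elements(feed_xml)


FEED = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example</title>
    <link>http://example.com/</link>
    <description>Demo feed</description>
    <skipHours><hour>1</hour></skipHours>
    <dc:language>en</dc:language>
  </channel>
</rss>"""

print(extra_elements(FEED))      # the un-namespaced extras: skipHours and its child
print(is_valid(FEED, "strict"))  # False: skipHours is outside the assumed core
```

Under these assumptions, moving a feed from transitional to strict amounts to deleting or namespacing whatever `extra_elements` reports.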
I'll second Sam's motion!
Since we are, as Jon Udell put it, "...still at the beginning of the RSS adoption curve", it doesn't seem logical to just say "what's done is done" and continue to build on a shaky foundation. Not that backwards compatibility should not be a concern; some transitional and deprecation work is warranted to start straightening the path forward.
Rahul, I don't think full backward compatibility is possible with all of the RSS formats that have been published. For instance, .9 and 1.0 use <rdf:RDF> as their root while the other .9x formats use <rss version="">.
Posted by Timothy Appnel at
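The root-element split described in the comment above is what a consumer would have to branch on first. A minimal sketch, assuming the standard RDF namespace URI and using a made-up function name:

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def sniff(feed_xml):
    """Classify a feed by its root element (hypothetical helper)."""
    root = ET.fromstring(feed_xml)
    if root.tag == "{%s}RDF" % RDF_NS:
        return "rdf"  # RSS 0.9 or RSS 1.0
    if root.tag == "rss":
        # The 0.9x family announces itself through the version attribute.
        return "rss " + root.get("version", "unknown")
    return "unknown"
```

Telling 0.9 from 1.0 would need a further look at the default namespace, which this sketch leaves out.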
Tim,
The lack of backward compatibility was the very problem with RSS 1.0. RSS 1.0 isn't a bad spec if you leave out the RDF baggage and make all the RSS 0.9x elements it omitted completely optional.
Which is why I think a strict subset is a great idea, keeping the other weird 0.94 elements but making them ALL optional. This way, use namespaces if you want (like I do), and don't if you don't.
Posted by Rahul Dave at
I believe all the "weird" elements that were thrown into the core namespace in RSS 0.94 are already optional. I still would prefer that they be officially deprecated, along with everything else in the core namespace that is now made redundant by existing modules. Clean core: title-link-description, and maybe language. But even language is part of Dublin Core, so maybe not even that.
I expressed this philosophy to Dave Winer last night in email, and he replied that he would prefer to keep the existing (albeit optional) elements in the core namespace. I agree with Sam and Sjoerd that this is unnecessary.
Posted by Mark Pilgrim at
Rahul...
1.0 did suffer from backwards compatibility issues, but for that matter so do the .9x formats, depending on which versions you compare. It's all a matter of perspective that will be difficult to fully resolve. For instance, .92 breaks backwards compatibility with .91 by declaring some tags that were required as optional and removing all limits.
A complete comparison from http://blogspace.com/rss/compatibility :
* All 0.91 files are valid 0.92 files.
* Some 0.92 files are valid 0.91 files (plus elements).
* All RSS 1.0 files are RSS 0.9 files (plus attributes and elements).
* No RSS 0.9 and RSS 0.91 files are compatible.
* No RSS 0.91 and RSS 1.0 files are compatible.
Mark: I agree. The "weird" elements should be deprecated and implemented with a module.
Posted by Timothy Appnel at
Just to be clear, I don't think the additional elements in RSS 0.94 are "weird"; I was just referring to them in response to Rahul's comment. They are, however, additional elements that were not in RSS 0.92 (see http://backend.userland.com/rssChangeNotes for details).
Now that RSS 0.94 will have namespaced modules, I agree with Tim that these additional elements should be taken out of the default namespace and put in their own module. It can be the shining example of the newfound power available in RSS 0.94! Look how easy it is to extend for your own purposes! Anybody can do it! Etc. Great marketing opportunities here, being missed.
Posted by Mark Pilgrim at
If you're going to discuss pushing things out into modules then you're going to want to discuss some best practices on their use.
Using Dublin Core elements brings along the amazingly smart work those folks have done. No need to reinvent the wheel by using something already hashed out. Just take the time to read the specs.
There should be more discussion about what's not being used in all these feeds. There's a staggering lack of implementation for most of the elements! Things like contact info are largely missing, to say nothing of the scheduling elements. So paring the core spec back to a few won't really change a damned thing; they're already using just the core.
But seriously, if using external namespaces is entertained then it's going to be necessary to offer clear documentation on what the intended uses are for an element. That is one deficiency in /some/ of the XML documentation that we're all painfully familiar with.
If anyone's not familiar with it, the entire archives of the rss-dev conversations are online in the yahoogroup. Many of these same arguments were made well over two years ago. It's fascinating to see how things were resolved back then considering how little everyone knew of XML. Now that we all know better, perhaps it's time to pick those discussions up again.
Posted by Bill Kearney at
Mark, Tim,
My concern is really backward compatibility with 0.92, as that seems to be the most used spec (informally; I haven't really done an exhaustive look).
Mark, I totally agree that a lot of the 0.92 (and 0.94) cruft is way better off in well designed namespaces. And deprecating the use of these in 2.x might be a good idea. But I don't want to see them formally go away; rather, well designed namespaces ought to let them wither.
Some perspective on where I come from might be useful here. I work at a university, and the people I work for use code as a tool in their work (physics), unlike most of us, who code to live. Most folks here who balked at XML a year back now like it, but believe me, bring in namespaces and people will want to turn off. At no point should the usefulness of RSS be dictated by namespaces, Dublin Core, etc.
Most people don't, and rightfully won't, want to care. In context, XML is translatable (which is why I also think that RDF syntax is useless, as opposed to the RDF model; one should always be able to extract RDF from XML by specifying some transform).
There are nice and widely used tags in RSS 0.94, such as guid and enclosure. I'd say, let them be. But I am all for creating an object namespace with tags like object:guid and object:enclosure, with attributes in the namespace, clsid's, whatever else is useful.
A little duplication won't hurt. So let's deprecate, but agree never to remove.
I don't mind a little pain in lost backward compatibility when it is necessary (html->xhtml 1.0), but I positively abhor changes which make it harder for casual users (xhtml 1.0->xhtml 2.0, loss of img tag, for example).
Let's never forget that the web took off only because of TBL's 3 major simplifications: (a) html was dirt simple and everybody could write it; (b) links had no consistency guarantees; (c) http was a textual key-value based protocol. I don't think the connection between these points and the fact that TBL worked at CERN is a coincidence, and you can see this streak in TBL even today with his introduction of n3 as an alternate rdf syntax.
Posted by Rahul Dave at
Well, here's an attempt to write down what people have been saying about making a core or strict subset. I also add another, 'useful' core subset.
http://tig.nareau.com/stories/2002/09/06/crss199.html
Copied below:
Core RSS: CRSS 1.99
Not very useful without namespaces
rss: @version
  channel
    title
    link
    language
    item*
      link
      title
      description
Non Programmer RSS: NPRSS 1.99
Useful without namespaces, but still quite a core. I claim 99% of all RSS applications can be satisfied with this subset. The only thing missing is instructions for reloading to aggregator, but no-one seems to use these anyway.
rss: @version
  channel
    title
    link
    language
    pubDate
    copyright
    managingEditor
    item*
      link
      title
      description
      author
      pubDate
Posted by Rahul Dave at
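To make the CRSS outline above concrete, here is a sketch that emits a feed containing only those elements. It assumes item nests inside channel as in RSS 0.91, and that the version attribute carries the "1.99" from the subset's name; the function name and the sample values are made up.

```python
import xml.etree.ElementTree as ET


def build_core_feed(title, link, language, items):
    """Serialize a feed limited to the CRSS element list (hypothetical helper)."""
    # Assumption: the version attribute value is taken from the "CRSS 1.99" label.
    rss = ET.Element("rss", version="1.99")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "language").text = language
    for entry in items:
        # Assumption: items are children of channel, as in RSS 0.91.
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "link").text = entry["link"]
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "description").text = entry["description"]
    return ET.tostring(rss, encoding="unicode")


print(build_core_feed(
    "Example channel", "http://example.com/", "en",
    [{"link": "http://example.com/1", "title": "First", "description": "Hello"}],
))
```

The NPRSS variant would add pubDate, copyright and managingEditor to the channel, and author and pubDate to each item, in the same way.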
Your link to "core":
http://127.0.0.1:5335/stories/2002/09/02/reallySimpleSyndication#plan
needs fixing. Thanks.
Posted by anon at
Dave has responded on this issue: http://scriptingnews.userland.com/stories/storyReader$1744
Highlights:
- Deprecating elements would be "gratuitous", despite the fact that they "so obviously suck".
- By analogy to shell commands, he implies that those who wish to deprecate elements simply don't understand them.
- No acknowledgement that RSS 0.92 introduced major changes that required RSS 0.91 consumers to severely alter their application logic (by making previously required elements optional, and removing length restrictions, and allowing encoded HTML in descriptions). By contrast, the worst that would happen if we deprecated elements in RSS 2.0 is that consumers wouldn't take advantage of all the richness of the feed (because they would, for example, be looking for a pubDate instead of a dc:date, or only looking for description but not noticing the fuller content:encoded).
It appears that RSS 2.0 will be much like HTML 4.0 Transitional. Lots of cruft built up that no one is brave enough to deprecate, raising the barrier of entry for developing news readers, and causing confusion for newbies trying to learn by example. "This feed uses pubDate, but that one uses dc:date. What's the difference? Is one better? Why 2 different date formats? Can I use either element with either date format? Do I need to supply both? If I'm trying to consume a feed that includes both, which one takes precedence?" And so forth.
Posted by Mark Pilgrim at
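The precedence questions raised in the comment above are ones every consumer ends up answering in code. A sketch of one possible answer, not a recommendation from any spec: prefer dc:date over pubDate and content:encoded over description. The precedence order and the function names are assumptions; the namespace URIs are the usual Dublin Core and content module ones.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import parsedate_to_datetime

NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "content": "http://purl.org/rss/1.0/modules/content/",
}


def item_date(item):
    """Return the item's date, preferring dc:date (assumed precedence)."""
    dc_date = item.findtext("dc:date", namespaces=NS)
    if dc_date:
        # dc:date uses an ISO 8601 profile. Python before 3.11 rejects a
        # trailing "Z" here, so real feeds may need extra handling.
        return datetime.fromisoformat(dc_date)
    pub_date = item.findtext("pubDate")
    if pub_date:
        # pubDate uses the RFC 822 mail date format.
        return parsedate_to_datetime(pub_date)
    return None


def item_body(item):
    """Return the fuller content:encoded when present, else description."""
    return item.findtext("content:encoded", namespaces=NS) or item.findtext("description")
```

Every reader author making this choice independently is exactly the confusion the comment predicts.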
I guess we're going to need a widely publicized "RSS 2.0 Best Practices" document to accompany the spec. Maybe even a weblint-type validator that not only validates but recommends usage patterns. (The iCab web browser for Macintosh does this with HTML. It smiles on pages with valid HTML, frowns on pages with syntactic errors, and gives a straight face and warns about dodgy-but-technically-valid HTML like FONT tags and the like.)
"Dive Into RSS", anyone?
Posted by Mark Pilgrim at
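The three-faced validator idea could look roughly like this. The list of discouraged elements is purely hypothetical, and a well-formedness check stands in for real validation.

```python
import xml.etree.ElementTree as ET

# Assumption: a hypothetical list of elements a best-practices guide might discourage.
DISCOURAGED = {"skipHours", "skipDays", "textInput"}


def lint(feed_xml):
    """Return (face, notes): frown on broken XML, straight on dodgy-but-legal, smile otherwise."""
    try:
        root = ET.fromstring(feed_xml)
    except ET.ParseError as err:
        return "frown", [str(err)]
    found = sorted({el.tag for el in root.iter() if el.tag in DISCOURAGED})
    if found:
        return "straight", found
    return "smile", []
```

A real tool would also check required elements and date formats; this only shows the three-way verdict.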
Rahul...
I think we're closer in our thinking than it seems. Perhaps we are looking at different datapoints. For instance, Jeff Barr recently ran an RSS file size analysis on the Syndic8 database of over 11k feeds (http://groups.yahoo.com/group/rss-dev/message/3315). I did a bit of reverse engineering on those numbers and came up with these file format stats:
.91 5640
.92 3757
1.0/.90 1533
.94 179
.93 56
0.92d2 1
0.91fn 2
0.94b1 1
.91 clearly has the larger market share. Without checking, I suspect the majority of .92 feeds are from Radio users. I agree with you on namespaces in a sense -- they shouldn't be required, or even in the core as in RSS 1.0. While the Dublin Core has been well established and much better thought out and researched than, say, RSS .94, I'd ideally like to see them implemented in some form, but differently than in RSS 1.0. I have no specific opinion how -- yet.
<tim/>
Posted by Timothy Appnel at
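The format counts quoted in the comment above can be turned into shares with a few lines of arithmetic. The figures are copied from that comment as given; only the variable names are new.

```python
# Counts per format label, as listed in the comment above.
counts = {
    ".91": 5640,
    ".92": 3757,
    "1.0/.90": 1533,
    ".94": 179,
    ".93": 56,
    "0.92d2": 1,
    "0.91fn": 2,
    "0.94b1": 1,
}

total = sum(counts.values())  # 11169, in line with "over 11k feeds"

# Percentage share per format, rounded to one decimal place.
shares = {fmt: round(100 * n / total, 1) for fmt, n in counts.items()}

for fmt, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{fmt:>8}  {pct:5.1f}%")
```

On these numbers .91 accounts for about half of the feeds and .92 for about a third.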
Mark...
I wouldn't take too much of what Dave has written to heart. He is one person with an obvious bias -- his company. If RSS 2.0 is to be taken seriously it will need to be the community's specification, designed by the community through discussions and debates in a community forum like a mailing list. If it's not, we may as well not even continue and go back to flaming each other.
Posted by Timothy Appnel at
I have to agree that RSS is moving too fast... it still needs more official adoption by news sites. There is no pressing need for these new specs (.94 and 2.0) because 1.0 really offers everything one needs in a simple syntax. If you can't understand RDF/RSS 1.0 then you shouldn't be coding in the first place.
And as long as I'm here, I'd like to mention that I just added an RSS feed to our news site, as well as something a little different: an RSS feed for our job site, which can be read at http://Turning-Stone.com/jobs/rss/. I've also proposed a module for RSS 1.0 for special job related meta tags, which you can read at http://216.244.96.20/travis/rss-module-job.html, and I'm waiting for it to be adopted by the RSS 1.0 group.
Posted by Travis at
In watching Sam Ruby's recent obsession with RSS, it's interesting (to me at least) to see people complaining about XML namespaces and RDF. It feels to me that there are roughly three camps:
* Anything goes - it's all XML, recognize what you can, ignore what you can't
* Anything (qualified) goes - basically, the Anything Goes approach with namespace qualification on "extensibility" data to address name collisions
* Anything goes (as long as it's RDF) - a very weird sect of Anything (qualified) goes that is willing to accept arbitrary (qualified) data so long as it follows a rigid subset of the RDF transfer syntax.
Seen on Don Box's Spoutlet at
Why strict and transitional? Let RSS be fully backward compatible, with skipHours and all, but define a subset, no larger than RSS 1.0 without the RDF baggage, call it RSS Strict or RSSS, and build on that. This way one gets the baby and the bathwater, with zero backward compatibility issues.
Posted by Rahul Dave at