It’s just data

RSS Conformance Issues

Don Box: RSS Heck (not quite Hell). FWIW, my take: P.S. Consider adding a link tag to your blog...

I agree with you. The inconsistent usage of <link> tags is particularly irritating, and it bothers me about Leigh Dodds' Eclectic, which always points back to the item being talked about instead of to his commentary.

XHTML vs. encoded HTML doesn't seem like such a big deal to me although I can see why Don cares given that he is using XSLT for his RSS reader.

PS: What is the fifth point that Don and you are talking about? His blog trails off with "at least one extremely popular site started spitting out XML that contained..." and viewing source doesn't show any special character. Or is that meant to be null?

Posted by Dare Obasanjo at

I can't see what the last point is either, but I strongly encourage the use of smart quotes.

The Web is typography too, and all the nice typographical details (like proper quotes and em-dashes) that it entails.

Posted by Aaron Swartz at

I have a guess as to the fifth item given that I know that Don is prone to taking friendly swipes at a mutual acquaintance whose RSS feed was invalid XML just over 24 hours ago, due to a stray quote in a blog entry about keeping up with standards being a dead end.

The irony all around (including Don's truncated blog entry) is palpable. ;-)

Posted by Sam Ruby at

I heard that the next draft of XHTML 2.0 required conforming UAs to only display "well-behaved pages" -- if the page badmouthed the W3C then the UA must refuse to display it and put up an error message instead.

Posted by Aaron Swartz at

What I've seen recommended for content:encoded is to always enclose it in a CDATA section (Mark Pilgrim is the example I remember best). I don't know if this is important or not.

Posted by David at
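
On the CDATA question raised above: a minimal Python sketch, assuming the content module namespace http://purl.org/rss/1.0/modules/content/ and using hypothetical helper names (item_with_cdata, item_with_escaping, read_encoded). It suggests that, to a conforming XML parser, a CDATA section and entity escaping carry the same text, so the choice is mostly one of readability for people viewing source.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Namespace of the RSS content module (assumed here for illustration).
CONTENT_NS = "http://purl.org/rss/1.0/modules/content/"


def item_with_cdata(html: str) -> str:
    """Wrap HTML in a CDATA section; a literal ']]>' must be split in two."""
    safe = html.replace("]]>", "]]]]><![CDATA[>")
    return (
        f'<item xmlns:content="{CONTENT_NS}">'
        f"<content:encoded><![CDATA[{safe}]]></content:encoded></item>"
    )


def item_with_escaping(html: str) -> str:
    """Escape &, < and > instead of using CDATA."""
    return (
        f'<item xmlns:content="{CONTENT_NS}">'
        f"<content:encoded>{escape(html)}</content:encoded></item>"
    )


def read_encoded(xml_text: str) -> str:
    """Parse the item and return the text of content:encoded."""
    item = ET.fromstring(xml_text)
    return item.find(f"{{{CONTENT_NS}}}encoded").text


html = "<p>Smart &amp; simple</p>"
# Both serializations hand the same string to the consumer.
same = read_encoded(item_with_cdata(html)) == read_encoded(item_with_escaping(html))
```

Either form works as long as the feed stays well-formed; the one trap with CDATA is content that itself contains "]]>".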

Well, I hate to burst you guys' bubble but most people and most webloggers are not RSS wonks-) Indeed Dare took me to task for assuming the same in the .NET world about runtimes. I haven't the foggiest idea about any of these issues or how to fix them (since I got named) and I shut titles off because they don't work in Radio - see my post. The point? It should be done by the "platform" or "application." I expect, as a Radio customer, that they *do* understand and implement RSS and all this cruft *properly*, just like when I buy a car I don't expect to have to implement the oil pump. I pay to have something do this for me and abstract the cruft away from me. If it's not done properly or in accordance with certain specs, that's Radio's fault and not mine.

In other words, I am a user of weblogs, and a product, not a developer of them. I don't live XML and RDF.

Posted by Sam Gentile at

Sam: Simon Fell seems to manage.

Posted by Sam Ruby at

"I can't see what the last point is either, but I strongly encourage the use of smart quotes."

The issue isn't properly encoded smart quotes (implemented with HTML entities) - it's smart quotes encoded straight into the document as weird Microsoft character codes that don't work on other platforms.

Posted by Simon Willison at
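
To illustrate the distinction drawn above, here is a small Python sketch of one way to emit smart quotes portably: escape the markup characters, then replace every non-ASCII character with a numeric character reference. The function name to_ascii_xml_text is hypothetical. Numeric references are used rather than named HTML entities because plain XML predefines only a handful of named entities.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def to_ascii_xml_text(text: str) -> bytes:
    """Escape &, <, > and turn non-ASCII characters into &#NNNN; references."""
    # Escape first so the ampersands of the references are not re-escaped.
    return escape(text).encode("ascii", "xmlcharrefreplace")


original = "Ted\u2019s \u201cfeed\u201d"
encoded = to_ascii_xml_text(original)

# Any XML parser turns the references back into the intended characters,
# regardless of which encoding the feed declares.
round_trip = ET.fromstring(b"<title>" + encoded + b"</title>").text
```

The output is pure ASCII, so it survives whatever encoding the feed claims to use.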

Sam, Simon manages because he works in the XML field and is very versed in all of the standards and technologies-) Do you really think that everyone using a web log should dive down into the underlying plumbing instead of just writing and doing what they do best? Writing would come to a halt. I sense a double standard-) I get criticized for writing a "rant" on programmers not understanding there is a runtime underneath and how they are not supposed to know that or all the details, and now you're sort of saying the opposite here-)

I don't agree. Most of us don't have the knowledge or the time. We barely find time to even blog-) I would rather my time be spent in writing compelling content and getting my readers information they like or need, not in re-implementing and massaging plumbing.

Posted by Sam Gentile at

Yes, Sam, the irony is palpable, although my bug was an English language bug, not an XML bug (I'm not sure which is worse).

Here's the last point:

"Oh, and RSS is an XML format. I was shocked when at least one <em>extremely</em> popular site started spitting out XML that contained non-well-formed XML. My poor little XML world fell apart."

I'd already reported it to the site administrator/author, but I'm guessing he's pretty busy right now hanging out in Harvard square :-) :-)

DB

Posted by Don Box at
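
The well-formedness complaint above can be checked mechanically. A minimal sketch in Python, assuming the standard library's ElementTree parser as the judge; the function name is_well_formed and the sample feeds are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML, False on any parse error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


good_feed = "<rss><channel><title>A &amp; B</title></channel></rss>"
# A bare ampersand is enough to make the whole document non-well-formed.
bad_feed = "<rss><channel><title>A & B</title></channel></rss>"
```

This only tests well-formedness, not whether the feed is valid RSS.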

As for the <link> tag, my current blogging home isn't really set up to support blogging. Until that changes, alas, I'm sans link tag and spontaneous blogging (I usually have a 24 hour delay before my blog gets propped).

Stay tuned.

DB

Posted by Don Box at

If you're unhappy with your publishing tools, perhaps you should invest in different ones that fulfill your needs better.

Posted by Mark at

Well, there you go. -)

Posted by Sam Gentile at

Linking up the link tag

Udell's The name game about URL politics and principles is important. My private coding adventures with bookmarklets and auto-discovery continues.... [more]

Trackback from Gotzeblogged

at

As for smart quotes, isn't it sufficient to simply indicate that ISO-8859-1 (assuming the author uses US Windows) is in use, via the XML declaration?

Ted Neward's feed has a smart apostrophe that seems fine once I add the XML decl to it.

DB

Posted by Don Box at

But Don, that would be lying. If you are using a moronic encoding, the correct thing to do is fess up to it. ;-)

FYI: I use iso-8859-1 in my RSS 2.0 feed. Of course, I turn off smart quotes when in Word...

Posted by Sam Ruby at

Sam G: rdf.root is your friend.

http://www.ideaspace.net/users/wkearney/misc/radio/radio8/rdf/

Posted by Simon Fell at

Ouch! Damn if Ted's feed doesn't contain an 0x92 which according to http://www.unicode.org/charts/PDF/U0080.pdf is PRIVATE USE TWO, not the smart apostrophe he intended.

I stand corrected.

DB

Posted by Don Box at
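
The 0x92 finding above can be reproduced in a few lines of Python. This is a sketch with a made-up byte string, not Ted's actual feed: it decodes the same bytes as ISO-8859-1 and as Windows-1252 (Python's cp1252 codec) and reports what character each yields.

```python
import unicodedata

# Hypothetical feed fragment containing the byte 0x92.
raw = b"Ted\x92s feed"

as_latin1 = raw.decode("iso-8859-1")  # what an ISO-8859-1 declaration claims
as_cp1252 = raw.decode("cp1252")      # what the authoring tool likely meant


def describe(ch: str) -> str:
    """Code point, Unicode general category, and name (if the character has one)."""
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{ord(ch):04X} category={unicodedata.category(ch)} name={name}"


latin1_char = as_latin1[3]  # U+0092, a C1 control character
cp1252_char = as_cp1252[3]  # U+2019, the intended apostrophe
```

Under ISO-8859-1 the byte maps to a control character (category Cc); under Windows-1252 it maps to RIGHT SINGLE QUOTATION MARK, which is why declaring ISO-8859-1 does not rescue the smart apostrophe.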

Thanks Simon F! -)

Posted by Sam Gentile at

Don: :-)

P.S. I must say that I am particularly fond of the XML parser used by IE (MSXML?). Generally, the first thing I do when presented with invalid XML is to try it in IE to see if it parses and renders correctly.

Posted by Sam Ruby at

Pingback from Sam Gentile's Weblog

at

The Blog police, they come to me in my head... I must have Titles or else-). According to these two, I must have Titles or else-) but let's see how much Radio screws up the presentation no matter how much I massage the templates (lost 3 hours last...

Excerpt from Sean 'Early' Campbell & Scott 'Adopter' Swigart's Radio Weblog at
