It’s just data

Relative RSS links

Tim Bray: Except for I'm resisting one change that the validator wants...

The problem is that most "content management systems" simply schlep the bytes, which are often ill-formed HTML, into an RSS feed that may be placed in a separate directory or even on a separate machine. Then a typical "aggregator" simply throws these bytes into an HTML page without really taking the time to understand them.

Simply put, while HTML is easy to author, parsing it seems beyond the capability of most DIY developers and even many commercial software developers.

The RSS specs (all of them) are actually silent on this. They don't specify what a URL should be relative to. Many assume that it should be the site; architecturally, it seems like it should be the feed itself. Since the specs are silent, what matters most is what the tools actually support. Advocacy may help fix this.
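To make the ambiguity concrete, here is a minimal Python sketch of the two readings. The feed URL, site URL, and relative link are made-up values for illustration; nothing here comes from an RSS spec, which is exactly the point.

```python
from urllib.parse import urljoin

# Hypothetical values, chosen only to show that the two readings diverge.
FEED_URL = "http://example.org/feeds/index.rss"
SITE_URL = "http://example.org/"


def resolve_against_feed(link, feed_url=FEED_URL):
    """Treat the feed document itself as the base URL."""
    return urljoin(feed_url, link)


def resolve_against_site(link, site_url=SITE_URL):
    """Treat the site root as the base URL."""
    return urljoin(site_url, link)


relative_link = "images/photo.png"
print(resolve_against_feed(relative_link))  # http://example.org/feeds/images/photo.png
print(resolve_against_site(relative_link))  # http://example.org/images/photo.png
```

The same relative link lands on two different resources, so an aggregator that guesses differently from the publisher ends up with a broken link.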

Tangentially related, I wonder what it would take to get Tim Bray to add an RSS autodiscovery link tag to his HTML page? One of the tools that supports it is the RSS validator itself.


UserLand has never -- in all their years of creating web servers, web-based clients, and other web-based software -- written anything vaguely resembling an HTML parser.

Interesting historical note: the lack of an HTML parser was the basis of their initial reluctance to support RSS autodiscovery. To this day, their tools still don't fully support it (unless this documentation is out of date).

http://radio.userland.com/aggregatorAutoDiscovery

"""
In order for it to work it must be formatted this way. If the attributes aren't all present, it fails. If they aren't in the correct order, it fails. If the whitespace isn't exactly as above, it fails. If the attributes aren't quoted, it fails.
"""
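For contrast, here is a rough sketch of autodiscovery done with a real parser instead of exact string matching. It uses Python's standard html.parser module. The class and function names are invented for this example, and it is not UserLand's code or anything taken from the documentation quoted above. It assumes the common convention of a link tag with rel="alternate" and type="application/rss+xml".

```python
from html.parser import HTMLParser


class FeedLinkFinder(HTMLParser):
    """Collect href values from <link rel="alternate" type="application/rss+xml">."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names for us.
        if tag != "link":
            return
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        attributes = {name: (value or "") for name, value in attrs}
        rels = attributes.get("rel", "").lower().split()
        mime = attributes.get("type", "").strip().lower()
        href = attributes.get("href", "")
        if "alternate" in rels and mime == "application/rss+xml" and href:
            self.feeds.append(href)


def find_feeds(html):
    """Return every autodiscovery feed URL found in an HTML string."""
    finder = FeedLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.feeds


# Attribute order, extra whitespace, and missing quotes all still work.
page = """<html><head>
<LINK   href="http://example.org/index.rss"
        title="RSS" type="application/rss+xml" rel="alternate">
<link rel=alternate type=application/rss+xml href=/other.rss>
<link rel="stylesheet" type="text/css" href="style.css">
</head><body></body></html>"""
print(find_feeds(page))
```

Because the parser hands over attributes as name and value pairs, none of the four failure modes in the quoted documentation applies: order, whitespace, and quoting stop mattering.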

Posted by Mark at

More on RSS and relative links

Tim Bray makes the case for why relative URLs should be used in RSS feeds. Sam Ruby responds by pointing... [more]

Trackback from Chris Heller's Weblog

at

Slow down a bit, can't you?

"Can't sleep, the blogs will eat me" starts to look pretty prophetic....

Excerpt from phil ringnalda dot com at

Link Tag (RSS Autodiscovery)

Via Sam Ruby See Mark's post Important change to the LINK tag "All you early adopters, pay attention. The LINK tag for pointing to a page’s RSS feed is changing (just once, and then solidifying forever). Both the type and title attributes are ...

Pingback from Bitflux Blog :: Link Tag (RSS Autodiscovery)

at

Source: [Sam Ruby] Relative RSS links. Tim Bray: Except for I'm resisting one change that the validator wants... The problem is that most "content management systems" simply schlep the bytes that are often ill formed HTML into an RSS feed that may ...

Pingback from Dewayne Mikkelson and his Radio WebDog, Shadow

at

"Simply put, while HTML is easy to author, parsing it seems beyond the capability of most DIY developers and even many commercial software developers."

Folks who need to parse HTML might want to take a look at John Cowan's TagSoup parser, which takes all kinds of wacky HTML and reports it as clean XMLized trees:

http://mercury.ccil.org/~cowan/XML/tagsoup/

That might cut down on the DIY pain, anyway.

Posted by Simon St.Laurent at

Simon, yes, there are solutions out there. I use Tidy. I'm also quite happy with the sgml parser in Python. Perl's got a good one too.

Got any suggestions for UserLand?

Posted by Sam Ruby at

Again with the relative URLs

Phil is right.  The specs are silent on this.  In fact, many people believe that relative links should be resolved relative to the feed itself, not the <channel><link> element's value.  Spec issues aside,... [more]

Trackback from Sam Ruby

at
