Tim Bray: Except for I'm resisting one change that the
validator wants...
The problem is that most "content management systems" simply
schlep bytes, often ill-formed HTML, into an RSS feed
that may be placed in a separate directory or even on a separate
machine. Then a typical "aggregator" simply throws these bytes into
an HTML page without really taking the time to understand them.
Simply put, while HTML is easy to author, parsing it seems
beyond the capability of most DIY developers and even many
commercial software developers.
The RSS specs (all of them) are actually silent on this. They
don't specify what a relative URL should be resolved against (many
assume that it should be the site; architecturally, it seems like it
should be the feed itself). Since the specs are silent, what matters
most is what the tools actually support. Advocacy may help fix this.
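To make the ambiguity concrete, here is a minimal Python sketch using the standard library's `urllib.parse.urljoin`. The feed URL, the channel link, and the relative reference are all made-up example values, and treating the `<channel><link>` value as "the site" is an assumption about what tools in the first camp do.

```python
from urllib.parse import urljoin

# Hypothetical values, for illustration only.
FEED_URL = "http://example.org/feeds/index.rss"  # where the feed document lives
CHANNEL_LINK = "http://example.org/blog/"        # assumed <channel><link> value
RELATIVE_HREF = "images/photo.png"               # relative URL inside an item


def candidate_resolutions(relative, feed_url, channel_link):
    """Return both plausible absolute URLs for one relative reference."""
    return {
        "relative_to_feed": urljoin(feed_url, relative),
        "relative_to_site": urljoin(channel_link, relative),
    }


results = candidate_resolutions(RELATIVE_HREF, FEED_URL, CHANNEL_LINK)
for name, url in results.items():
    print(name, url)
```

The same bytes end up pointing at two different resources depending on which base the tool picks, which is why what the tools actually do matters more than anyone's reading of the specs.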
Tangentially related, I wonder what it would take to get Tim
Bray to add an
RSS autodiscovery link tag to his HTML page? One of the tools
that supports it is the RSS validator itself.
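For anyone curious what consuming such a tag involves, below is a rough Python sketch built on the standard library's `html.parser`. It assumes the usual autodiscovery convention of a `link` element with `rel="alternate"` and `type="application/rss+xml"`; the class name, the function name, and the sample page are hypothetical, and this is not how any particular validator or aggregator does it.

```python
from html.parser import HTMLParser


class FeedLinkFinder(HTMLParser):
    """Collect href values of <link> tags that advertise an RSS feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        # attrs is a list of (name, value) pairs. The parser has already
        # lowercased the names and stripped any quoting, so attribute
        # order, case, and whitespace in the source do not matter here.
        a = {name: (value or "") for name, value in attrs}
        rels = a.get("rel", "").lower().split()
        mime = a.get("type", "").strip().lower()
        if "alternate" in rels and mime == "application/rss+xml" and a.get("href"):
            self.feeds.append(a["href"])


def find_feeds(html):
    """Return the feed hrefs advertised by an HTML page."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


# Hypothetical page: odd attribute order, mixed case, mixed quoting.
SAMPLE = """<html><head>
<LINK   href=index.rss  TYPE="application/rss+xml"
        rel='alternate' >
</head><body></body></html>"""

print(find_feeds(SAMPLE))
```

Letting a real parser tokenize the tag, rather than matching one exact string, is what makes the lookup indifferent to attribute order, quoting, and whitespace.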
In all their years of creating web servers, web-based clients, and other web-based software, UserLand has never written anything vaguely resembling an HTML parser.
Interesting historical note: the lack of an HTML parser was the basis of their initial reluctance to support RSS autodiscovery. To this day, their tools still don't fully support RSS autodiscovery (unless this documentation is out of date).
""" In order for it to work it must be formatted this way. If the attributes aren't all present, it fails. If they aren't in the correct order, it fails. If the whitespace isn't exactly as above, it fails. If the attributes aren't quoted, it fails. """
Via Sam Ruby. See Mark's post "Important change to the LINK tag": "All you early adopters, pay attention. The LINK tag for pointing to a page’s RSS feed is changing (just once, and then solidifying forever). Both the type and title attributes are ...
Trackback from Bitflux Blog
Source: [Sam Ruby] Relative RSS links. Tim Bray: Except for I'm resisting one change that the validator wants... The problem is that most "content management systems" simply schlep the bytes that are often ill formed HTML into an RSS feed that may ...
"Simply put, while HTML is easy to author, parsing it seems beyond the capability of most DIY developers and even many commercial software developers."
Folks who need to parse HTML might want to take a look at John Cowan's TagSoup parser, which takes all kinds of wacky HTML and reports it as clean XMLized trees:
Phil is right. The specs are silent on this. In fact, many people believe that relative links should be resolved relative to the feed itself, not the <channel><link> element's value. Spec issues aside,...