Feedback loops

2004-04-29T13:40:50Z

Is this feed valid? Both SharpReader and Bloglines handle it flawlessly. In fact, there are active Bloglines subscribers.

The feedvalidator chokes on it.

I point this out because a permathread has re-erupted on atom-syntax. What bothers me about the permathread is that people seem to take certain things as absolutes when reality is so much more deliciously complicated.

Now, let's take a look at what SharpReader says when it encounters a feed it can't handle:

Error parsing RSS XML: Undefined root element: html

Please try to validate this feed. If this feed validates as correct RSS, you can send an error report.

Very simple, honest, and to the point. It doesn't proclaim that the "feed" is invalid. It simply states that it encountered an error during parse, specifies what the error is, suggests a way to independently validate the feed, and suggests a way to provide feedback to the tool author if you think that there is likely a bug in SharpReader.

Based on experiments, I am very much convinced that every possible permutation of validity, successfully passing the validator, and being able to be meaningfully consumed by your favorite aggregator exists out there in the wild.

In an absolute sense, the feedvalidator is not perfect. Does that mean that it is not useful? The best we can observe is that there is a high correlation between correctness and usefulness. This also is true for feeds. A feed may be technically valid but not useful. A feed may be technically invalid but useful.

In the midst of all this noise, a sensible suggestion re-emerged: a totally opt-in feature that enables feedback to be provided. That said, I have the following concerns:

Placing the information on how to handle invalid feeds inside the feed itself seems counterproductive. This seems like a perfect use case for an HTTP header, with an element as a fallback. If the fallback is used, the element should be placed near the top of the feed (for the benefit of stream, pull, or SAX parsers) and should rigidly match a regular expression. The pingback spec can be used for inspiration.
From a security perspective, I have grave concerns about the ability of a single person to orchestrate a DDoS attack on a third party. Such an effort could easily be cloaked.
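
To make the first concern concrete, here is a rough sketch of header-first discovery with a rigid regular-expression fallback, loosely modeled on how pingback autodiscovery works. The header name X-Feed-Feedback, the feedback element, and the 4096-byte scan limit are all made up for illustration; nothing like them has been specified anywhere.

```python
import re

# Hypothetical names: no spec defines this header or this element.
FEEDBACK_HEADER = "X-Feed-Feedback"
# Rigid pattern in the spirit of pingback autodiscovery: one exact spelling,
# so a client can find it without running a full XML parser.
FEEDBACK_ELEMENT = re.compile(rb'<feedback href="([^"]+)" ?/?>')
# Assumed limit: only the top of the feed is scanned.
SCAN_LIMIT = 4096


def find_feedback_url(headers, body):
    """Return the feedback URL advertised for a feed, or None.

    headers is a mapping of HTTP response header names to values;
    body is the raw feed as bytes, which need not be well-formed.
    """
    # Prefer the HTTP header. Header names are case-insensitive.
    for name, value in headers.items():
        if name.lower() == FEEDBACK_HEADER.lower():
            return value.strip()

    # Fall back to the element, looking only near the top of the feed.
    match = FEEDBACK_ELEMENT.search(body[:SCAN_LIMIT])
    if match:
        return match.group(1).decode("utf-8", errors="replace")
    return None
```

Because the fallback is a plain byte-level match, it keeps working on exactly the feeds that matter here: the ones an XML parser has already given up on.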

However, overall, the idea of an optional feature whereby feedback is sent as an HTTP GET coupled with a User-Agent header seems like it can't do much harm, and might actually prove useful.
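
As a sketch only, the report itself could be as small as the following. The query parameter names (feed, error) and the function name are assumptions of mine, not part of any proposal; the request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_feedback_request(feedback_url, feed_url, error, user_agent):
    """Build (but do not send) a GET request reporting a feed problem.

    The 'feed' and 'error' parameter names are hypothetical.
    """
    query = urlencode({"feed": feed_url, "error": error})
    # Append to an existing query string if the feedback URL already has one.
    separator = "&" if "?" in feedback_url else "?"
    return Request(
        feedback_url + separator + query,
        headers={"User-Agent": user_agent},  # identifies the reporting tool
        method="GET",
    )


req = build_feedback_request(
    "http://example.org/fb",
    "http://example.org/feed.xml",
    "Undefined root element: html",
    "ExampleReader/1.0",
)
```

Passing req to urllib.request.urlopen would actually send it. Given the DDoS concern above, a real client would presumably want to rate-limit such reports per feed.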