Common Feed Errors
An analysis of a week’s worth of click-throughs on Feed Validator [help] links
- MissingGuid (1726)
- RSS has an identity crisis.
- SAXError (653)
- People are bozos. They make all kinds of errors.
- UndefinedElement (629)
- Yes, it really is pubDate with a capital D. And no, itunes:category can’t be placed at the item level.
- UnexpectedContentType (536)
- Why are WordPress feeds served as text/html?
- EncodingMismatch (519)
- XML on the Web Has Failed. Buy the t-shirt.
- InvalidRFC2822Date (461)
- The single most error-prone format on the planet. Evar.
- HttpError (303)
- If I can’t get to it, I can’t validate it. Capisce?
- ObsoleteNamespace (235)
- What? Atom 1.0 has been out for a full three months now, and your free hosting provider hasn’t yet upgraded? Try FeedBurner.
- ImageLinkDoesntMatch (206)
- I’m still not sure why this is a problem.
- UnicodeError (171)
- If all you are doing is strcpy’ing your html page into your feed, do us all a favor and add the following line at the top.
<?xml version="1.0" encoding="iso-8859-1"?>
Thanks.
- DuplicateDescriptionSemantics (136)
- Funky!
- InvalidFullLink (135)
- If only xml had provided a standard way to declare the base for a given URI…
- InvalidContact (120)
- People don’t seem to want to reveal their email addresses. Perhaps they should be told about Dublin Core?
- NotHtml (120)
- Silent Data Loss.
- UnexpectedAttribute (105)
- Yes, it really is spelled isPermaLink with a capital P and a capital L.
- ContainsHTML (96)
- Some people really want to put markup in their titles.
- BadCharacters (85)
- Generally this means that there are some evil quote characters smarting off again.
- SecurityRisk (84)
- Beware of the platypus.
- MissingDescription (78)
- Some people don’t seem to want to provide both a title and a description for their feed.
- MissingAttribute (78)
- If you are going to include an enclosure element, you might want to put the url in there too. I’m just saying…
- ContainsRelRef (76)
- People seem to want to put relative URI references in their descriptions too.
- DuplicateValue (69)
- What part of globally unique do you not understand?
- MissingItunesElement (65)
- If you are going to submit your podcasts to iTunes, you really should include a category, a language, and indicate whether or not the podcast is “explicit”. Think of the children.
- UndefinedNamedEntity (60)
- I don’t care if XHTML defines them in its DTD; DTDs have not been a part of RSS since the summer of 2000.
- NotInANamespace (58)
- RSS 2.0 does not permit extensions to define child elements unless those child elements are also in a namespace.
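Several of the errors above (MissingGuid, UndefinedElement, InvalidRFC2822Date, UnexpectedAttribute) come from hand-built item markup. A minimal sketch of generating an item with the Python standard library instead, so that the element names, the isPermaLink attribute, and the date format are produced by code rather than typed by hand. The build_item helper and the example.org URL are assumptions made for illustration; they are not part of the Feed Validator.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_item(title, link, published):
    """Build an RSS 2.0 <item> with a guid and an RFC 2822 pubDate.

    Hypothetical helper for illustration only.
    """
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    # guid: reuse the permalink; isPermaLink has a capital P and a capital L.
    guid = ET.SubElement(item, "guid", {"isPermaLink": "true"})
    guid.text = link
    # pubDate: capital D; the standard library formats the date.
    ET.SubElement(item, "pubDate").text = format_datetime(published)
    return item


item = build_item(
    "Common Feed Errors",
    "http://example.org/2006/03/13/common-feed-errors",  # hypothetical URL
    datetime(2006, 3, 13, 12, 0, 0, tzinfo=timezone.utc),
)
print(ET.tostring(item, encoding="unicode"))
```

Passing a timezone-aware datetime matters here: format_datetime emits a numeric offset for aware values, while naive values get a "-0000" zone.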
Why a permalink?
I don’t know. I just copied that text straight from the spec.
Posted by Sam Ruby at
Today's links [March 13, 2006]
Windows RSS Platform Niall Kennedy also blogged about Windows RSS platform this past weekend Common Feed Errors Sam Ruby posts “An analysis of a week’s work of click-throughs on Feed Validator”...Excerpt from Blogging Roller at
Feed Breakage
Error analysis is important. When you build operating systems, you examine crashlogs. When you run search engines, you look at the searches that produced zero results. When you run a Feed Validator, you look at what kinds of mistakes people make....Excerpt from ongoing at
For ObsoleteNamespace, FeedBurner’s only a partial solution: it will work for some user-agents but not all of them.
And for Clone Wars, the best part is that the Yahoo feed is still producing 100 duplicate GUIDs. I don’t care who you are, that’s funny there.
Posted by Gordon Weakliem at
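A rough sketch of how duplicate GUIDs like these could be spotted in a feed, assuming an RSS 2.0 document with un-namespaced guid elements. The duplicate_guids function and the sample feed are hypothetical, and this is not how the Feed Validator itself implements the DuplicateValue check.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_guids(rss_text):
    """Return a dict mapping each repeated guid value to its count."""
    root = ET.fromstring(rss_text)
    guids = [(g.text or "").strip() for g in root.iter("guid")]
    return {value: n for value, n in Counter(guids).items() if n > 1}


# Hypothetical, stripped-down feed: two items share one guid.
sample = """<rss version="2.0"><channel>
<item><guid>http://example.org/1</guid></item>
<item><guid>http://example.org/1</guid></item>
<item><guid>http://example.org/2</guid></item>
</channel></rss>"""

print(duplicate_guids(sample))
```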
Before getting too uppity, you might want to validate the validator a bit more, Sam :) While I’m sure this feed has a ton of errors, it most certainly exists, despite what Feedvalidator claims. Discovered this today and was quite a bit irritated I couldn’t actually check how broken the feed was :)
Posted by Luis Villa at
Luis,
Something very weird is going on here:
>>> import urllib2
>>> urllib2.urlopen('http://cyber.law.harvard.edu/audio/home?func=viewRSS&wid=12')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.4/urllib2.py", line 130, in urlopen
    return _opener.open(url, data)
  File "/usr/lib/python2.4/urllib2.py", line 364, in open
    response = meth(req, response)
  File "/usr/lib/python2.4/urllib2.py", line 471, in http_response
    response = self.parent.error(
  File "/usr/lib/python2.4/urllib2.py", line 402, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.4/urllib2.py", line 337, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.4/urllib2.py", line 480, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found
and
$ curl --head "http://cyber.law.harvard.edu/audio/home?func=viewRSS&wid=12"
HTTP/1.1 404 Not Found
Date: Tue, 14 Mar 2006 04:47:17 GMT
Server: Apache/1.3.34 (Unix) mod_fastcgi/2.4.2 PHP/4.3.11 mod_perl/1.29
Set-Cookie: wgSession=375j4W7jvCZrE; path=/; expires=Fri, 11-Mar-2016 04:47:17 GMT
Content-Type: text/xml; charset=ISO-8859-1
X-Cache: MISS from cyber.law.harvard.edu
Connection: close
Posted by Sam Ruby at
Feed Breakage
Error analysis is important. When you build operating systems, you examine crashlogs. When you run search engines, you look at the searches that produced zero results. When you run a Feed Validator, you look at what kinds of mistakes people make....Excerpt from diogenius at
That Harvard feed is returning a 404 status code AND a body containing RSS:
C:\temp>irb
irb(main):001:0> require 'net/http'
=> true
irb(main):002:0> Net::HTTP.start('cyber.law.harvard.edu') do |http|
irb(main):003:1* response=http.get('/audio/home?func=viewRSS&wid=12')
irb(main):004:1> puts "Code=#{response.code}"
irb(main):005:1> p response.body[0,100]
irb(main):006:1> end
Code=404
"<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?>
<rss version=\"2.0\" xmlns:creativeCommons=\"http://backe"
=> nil
Posted by Jonno Downes at
links for 2006-03-14
inkling Web 2.0 prediction market game. (tags: economics web2.0 startup) Official Google Research Blog: Hiring: The Lake Wobegon Strategy “You know the Google story: small start-up of highly-skilled programmers in a garage grows into a large...Excerpt from Edward O'Connor at
Sam Ruby’s amusing run through “a week’s work of click-throughs on Feed Validator [help] links.”...
Excerpt from del.icio.us/tag/python at
Fascinatingly bizarre. FWIW, Firefox (haven’t actually checked in IE) ignores the error code and proceeds to allow you to view the XML. But I’ll look into why we’re generating the bad 404 today as well.
Posted by Luis Villa at
I just deployed code that will go ahead and validate the body even on HTTPError, but only if the last non-blank line is </rss>, </feed>, or </rdf:RDF>. But the Feed Validator will still report the error, and it will still mark the feed as invalid.
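The rule described above can be sketched roughly as follows in modern Python (urllib.request rather than the urllib2 of the transcript). This is an illustration, not the deployed Feed Validator code: the names ends_like_feed and fetch_for_validation are hypothetical, and comparing the whole last line for equality is one interpretation of "the last non-blank line is".

```python
import urllib.error
import urllib.request

# Closing tags that suggest an error body is really a feed.
FEED_END_TAGS = (b"</rss>", b"</feed>", b"</rdf:RDF>")


def ends_like_feed(body):
    """Return True if the last non-blank line of body (bytes) is a feed closing tag."""
    lines = [line.strip() for line in body.splitlines() if line.strip()]
    return bool(lines) and lines[-1] in FEED_END_TAGS


def fetch_for_validation(url):
    """Return (status, body, had_http_error).

    On an HTTP error, body is the raw response bytes only when it looks
    like a feed, otherwise None. The error flag stays set either way, so
    the caller can still report the feed as invalid.
    """
    try:
        with urllib.request.urlopen(url) as response:
            return response.status, response.read(), False
    except urllib.error.HTTPError as err:
        body = err.read()
        return err.code, (body if ends_like_feed(body) else None), True
```

Working on raw bytes avoids guessing an encoding before the XML declaration has been read, which matters for a body declared as ISO-8859-1 like the one above.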
Sam Ruby: Common Feed Errors
Philippe Janvier : Sam Ruby: Common Feed Errors - “An analysis of a week’s work of click-throughs on Feed Validator [help] links” Tags : atom rss...Excerpt from HotLinks - Level 1 at
The browsers aren’t ignoring the error code at all. Ever wonder how custom 404 pages work? Right, the server sends an HTML document which the browser then renders. It’s no different when the server sends an XML body with the 404 response: it just gets rendered. The browsers are simply doing what they always have.
Posted by Aristotle Pagaltzis at
how IE responds to different HTTP status codes
There is a discussion on intertwingly about feed errors, including the case where a server was serving a valid RSS feed with a 404 (file not found) status code. The feedvalidator was reporting the feed as being non-existent, but IE and firefox would...... [more]Trackback from jamtronix at
Sam Ruby has compiled a useful (and entertaining) list of the most Common Feed Errors. If you generate your own RSS feeds, it’s worth a look. I occasionally get burnt by evil smart quotes when I copy and paste content into a posting. Interactions...
Excerpt from Bob Congdon at
How IE Handles HTTP Status Codes
Jonno Downes (aka Jamtronix) has performed an experiment designed to work out how IE handles various...... [more]Trackback from Ken Schaefer at
Luis Villa (luis): Thu, 16 Mar 2006
Most Bizarre Technological Thing I’ve Been Involved In This Week. Still have no idea how the feed is both being served and generating a 404. On the occasion of the release of GNOME 2.14, I hope everyone in GNOME steps back and takes a moment to...Excerpt from Planet GNOME at
Links for 2006-03-15 [del.icio.us]
SiliconBeat: The company that Fox Interactive acquired: Newroo Rupert Murdoch buys NewRoo, a web-based aggregator Sam Ruby: Common Feed Errors long list of common rss feed formatting issues found by validator Mobile blogging makes a move: Six Apart...Excerpt from deeje.com/musings at
Minutiae
A surprisingly large part of a software engineer’s life is dealing with the little things. Much as I like to write about grand designs and architectural issues or people, processes and communities, all too often, the devil is in the details and I...Excerpt from BlogAfrica at
Social Engineering
While this clearly falls far short of RFC 2119 terminology, for nearly three weeks now, the Feed Validator has issued a warning when it encounters an item in an RSS 2.0 feed that does not contain a GUID. Despite this warning being exposed to a large... [more]Trackback from Sam Ruby at
Relative References
I feel strongly that Atom processors need to be able to process relative references in a consistent manner. But, for now, I’ve restored the use of absolute URIs in my Atom feed, and I will keep it that way for a minimum of 90 days. Looking at C... [more]Trackback from Sam Ruby at
Serving Quality Feeds
After Atom vs RSS and Feed autodiscovery, this is the third installment of a series that could be called the importance of serving quality feeds. This time I would like to take as a starting point the report published by Sam Ruby on the most common errors present...Excerpt from edit at
Another Month
Deja Vu. This problem is important to me because truth be told, specs matter, but only so far as they are followed. For years, RSS had a validator that happily accepted feeds which were not even well formed XML. We... [more]Trackback from Sam Ruby at
Feed mess
Something I’ve been working on at work deals with feeds - I have to read, parse and derive meaning out...... [more]Trackback from Sriram Krishnan at
Feed mess
Something I’ve been working on at work deals with feeds - I have to read, parse and derive meaning out of RSS and Atom feeds in the wild. And it’s not been fun. The Universal Feed Parser is nice and everything but I’m still being forced to debug...Excerpt from Sriram Krishnan at
Bloglines Rocks!
I’ve given Bloglines a fair amount of grief over the past few months over their pathetic-at-the-time handling of Atom feeds. I’m not ego-centric enough to believe that I got them to change – at most, I may have increased awareness of... [more]Trackback from Sam Ruby at
OpenSearch results validation
Given the relaunch of OpenSearch, and given that OpenSearch results can be included in feeds, it seemed time to spend some of my recreational programming time on adding Feed Validator support for the OpenSearch namespace extensions. The spec is cleanly... [more]Trackback from Sam Ruby at
“It’s recommended that you provide the guid, and if possible make it a permalink.”
Why a permalink?
Posted by Graham at