It’s just data

Widening the Net

The excerpting function seems to be working, and by now I guess that all the people I could have encouraged to add RSS AutoDiscovery information to their websites have done so.

So, now it is time to widen the net a bit. No, I am not going to include the Ultra-liberal RSS locator because I feel that it would be morally wrong to do so. Scouring several (possibly dozens of) sites for information after a human enters text into an entry field is one thing, but doing so automatically once an hour for each referrer is another.

So here is what I have implemented so far. If I retrieve a page and it has no appropriate link tag, then I will scan for <a> tags with hrefs that point to the same site and end with a file name that is commonly used for RSS. The ones I have come up with so far are: rss.xml, index.xml, index.rdf, and ?flav=rss. Only the first match I encounter will be used, so there will be at most one attempt to fetch an RSS feed per site per hour.
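
As a rough illustration of that fallback, here is a minimal Python sketch, not the code actually running on this site. The names FeedLinkParser, find_feed, and COMMON_SUFFIXES are made up for the example, and it assumes that an "appropriate link tag" means a <link rel="alternate" type="application/rss+xml"> element. The suffix test is deliberately loose: anything that merely ends in one of the strings will match.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Conventions listed in the post; hypothetical constant name.
COMMON_SUFFIXES = ("rss.xml", "index.xml", "index.rdf", "?flav=rss")


class FeedLinkParser(HTMLParser):
    """Collect autodiscovery <link> hrefs and plain <a> hrefs in document order."""

    def __init__(self):
        super().__init__()
        self.link_feeds = []  # hrefs from matching <link> tags
        self.anchors = []     # hrefs from <a> tags

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        if tag == "link":
            rel = (attrs.get("rel") or "").lower().split()
            type_ = (attrs.get("type") or "").lower()
            # Assumption: this is what counts as an "appropriate" link tag.
            if "alternate" in rel and type_ == "application/rss+xml":
                self.link_feeds.append(href)
        elif tag == "a":
            self.anchors.append(href)


def find_feed(page_url, html):
    """Return one candidate feed URL for the page, or None."""
    parser = FeedLinkParser()
    parser.feed(html)

    # Prefer the explicit <link> tag when the page has one.
    if parser.link_feeds:
        return urljoin(page_url, parser.link_feeds[0])

    # Otherwise take the first same-site <a> href with a common suffix.
    site = urlparse(page_url).netloc.lower()
    for href in parser.anchors:
        candidate = urljoin(page_url, href)
        if urlparse(candidate).netloc.lower() != site:
            continue  # points to a different site
        if candidate.lower().endswith(COMMON_SUFFIXES):
            return candidate  # first match wins: one fetch per site
    return None
```

Calling find_feed(referrer_url, page_html) yields at most one URL, which is what keeps this to a single feed request per site per hour. Supporting another convention would only mean adding a string to COMMON_SUFFIXES.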

If you know of another common convention, leave a comment. If your site doesn't follow a common convention, consider adding a <link> tag to your site.


My site has a link tag, but the syndication URL is a bit unconventional (the RSS file appears to be called "syndicate/rss1.0"). If you wanted to pick it up you could expand the net to search for links that include the words "RSS" or "syndicate" in their title attribute.

Posted by Simon Willison at

Simon, your link tag looks fine. If I get a referral from a link that has already pinged my site via PingBack or TrackBack, I don't try to extract an excerpt.

Posted by Sam Ruby at

Hey I just realized that Scripting News and all the Radio sites will be included in this thing you're doing. Very nice. Indeed.

Posted by Dave Winer at

It appears you are trying to implement an ultra-liberal RSS locator.

http://diveintomark.org/archives/2002/08/15.html#ultraliberal_rss_locator

Posted by Mark at

But you said that. Never mind. Can't read today.

Posted by Mark at

/backend.php is used by all PHP-Nuke sites and most of the PHP-Nuke clones.

Posted by Julian Bond at

OK, I've added backend.php, though few of these feeds seem to have meaningful descriptions.

Posted by Sam Ruby at
