It’s just data

Bueller?

It looks like Share Your OPML’s top 100 list has blown a head gasket.  Update: it is back.  64-bit integer underflow error, perhaps?  Curiously, the site’s aggregator seems to be operating off of a completely different list of feeds.

This had been a good test of Planet (now it is merely a test of Cyrillic support).

Does anybody else publish a list of popular feeds in an export format?  Technorati?  Bloglines?  It need not be in OPML.  Other formats like FOAF or XOXO would be fine.


The Bloglines page you link has an RSS feed, isn’t that good enough?

Posted by BillSaysThis at

Very strange problems on SYO this morning.  Almost seemed like some form of coordinated spam attack on the site.  I’d be interested in hearing from the site operators as to what exactly happened to the site.

Posted by James Snell at

The Bloglines page you link has an RSS feed, isn’t that good enough?

It would be much easier to screen scrape the HTML than to try to extract any real information from that feed.  That feed is a single HTML page, escaped, and placed into a description.

I’m looking for something that is intended to be parsed.

Posted by Sam Ruby at
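
To illustrate the point about the feed: when an entire HTML page is escaped into a single description, an XML parser only gets you back to HTML, and a second, HTML-level pass is still needed to find the links. The sketch below assumes a made-up feed (`SAMPLE_RSS`) and hypothetical helper names; it is not the actual Bloglines feed or anyone's production code.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical stand-in for a feed whose only item wraps a whole HTML page.
SAMPLE_RSS = """<rss version="2.0"><channel><title>Top feeds</title>
<item><title>Top feeds</title>
<description>&lt;ol&gt;&lt;li&gt;&lt;a href="http://example.com/a.xml"&gt;Feed A&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://example.com/b.xml"&gt;Feed B&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;</description>
</item></channel></rss>"""


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def links_from_feed(rss_text):
    """Step 1: parse the RSS. Step 2: scrape the HTML hidden in each description."""
    root = ET.fromstring(rss_text)
    collector = LinkCollector()
    for description in root.iter("description"):
        # The XML parser has already unescaped the markup; what is left is HTML.
        collector.feed(description.text or "")
    return collector.links


print(links_from_feed(SAMPLE_RSS))
```

The second pass is ordinary screen scraping, which is why going straight to the HTML page is no harder than going through such a feed.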

Try The tech tag in blogfinder (or other tag of your choice). There is OPML on the page.

Posted by Kevin Marks at

Rmail
[link]
[link]

Posted by Randy Charles Morin at

Kevin: valid?
Randy: valid?

My OPML parser is able to salvage 19 out of the 20 entries in the Technorati feed.  It finds nothing however in the r-mail feed.

Posted by Sam Ruby at
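
One way a parser might salvage most of the entries from a file that is not well-formed is to fall back to parsing each outline tag on its own and skipping the ones that fail. The sketch below is an assumption about how such recovery could work, not a description of the parser mentioned above; `salvage_outlines` and the `BROKEN` sample are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Matches a single <outline ...> or <outline .../> tag. Assumes no literal
# "<" or ">" inside attribute values, which is a simplification.
OUTLINE_TAG = re.compile(r"<outline\b[^<>]*>")

# Hypothetical sample: the bare "&" in the second entry breaks the whole file.
BROKEN = """<opml version="1.1"><body>
<outline text="Good one" xmlUrl="http://example.com/1.xml"/>
<outline text="Tom & Jerry" xmlUrl="http://example.com/2.xml"/>
<outline text="Also good" xmlUrl="http://example.com/3.xml"/>
</body></opml>"""


def salvage_outlines(opml_text):
    """Return attribute dicts for every outline that can be parsed."""
    try:
        root = ET.fromstring(opml_text)
        return [dict(node.attrib) for node in root.iter("outline")]
    except ET.ParseError:
        pass  # not well-formed: fall through to tag-by-tag recovery

    salvaged = []
    for match in OUTLINE_TAG.finditer(opml_text):
        tag = match.group(0)
        if not tag.endswith("/>"):
            tag = tag[:-1] + "/>"  # self-close container outlines
        try:
            salvaged.append(dict(ET.fromstring(tag).attrib))
        except ET.ParseError:
            continue  # this one entry is beyond repair; keep the rest
    return salvaged


for outline in salvage_outlines(BROKEN):
    print(outline.get("text"), outline.get("xmlUrl"))
```

With this sample, two of the three entries survive and the one with the unescaped ampersand is dropped.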

It’s entirely possible that I missed something, but the validation errors reported seem to be incorrect.

[link]

Broken validator, I guess :-)

Maybe you want to forward the spec to whoever wrote it :-)

[link]

Technorati’s, on the other hand, seems to be invalid XML, never mind OPML.

[link]

Please correct me, if I’ve boo-boo-ed again.

Posted by Randy Charles Morin at

Found a bunch of targeted opml links here, plus the community lists at pubsub.

Also, Scoble’s Blogroll and this list of "Web 2.0" companies looked big enough to make good test-subjects.

Lastly, this is really slow loading, but eventually comes back with a top 500 opml

Posted by Kevin H at

It’s entirely possible that I missed something

Actually, it was my mistake.  OPML™ can be literally anything.  What I am looking for is something resembling a subscription list.

Lastly, this is really slow loading, but eventually comes back with a top 500 opml

Well formed, and uses titles instead of text, but interesting data.  I just may have to play with that.  Thanks!

Posted by Sam Ruby at

I made a single run of the top500.  Apparently, there are enough feeds in that list with no meaningful dates that simply assigning the default date to those entries (i.e., now) dominates the output.  I’ll try running it again in a bit, and hopefully some other entries will float to the top.

Posted by Sam Ruby at
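
The effect described above is easy to reproduce: if every undated entry is stamped with the current time, undated entries always sort ahead of genuinely recent ones. The sketch below assumes a simple list of dicts and a hypothetical `first_seen` map recording when an entry was first fetched; that map is one possible mitigation, not necessarily what Planet does.

```python
from datetime import datetime, timedelta, timezone


def newest_first(entries, now, first_seen=None):
    """Sort entries newest first.

    Each entry is a dict with an "id" and an optional "date" (aware datetime).
    Undated entries are stamped with `now`, unless `first_seen` (hypothetical)
    maps their id to the time the entry was first fetched.
    """
    first_seen = first_seen or {}

    def effective_date(entry):
        if entry.get("date") is not None:
            return entry["date"]
        return first_seen.get(entry["id"], now)

    return sorted(entries, key=effective_date, reverse=True)


now = datetime(2006, 6, 12, 12, 0, tzinfo=timezone.utc)
entries = [
    {"id": "dated-recent", "date": now - timedelta(hours=1)},
    {"id": "undated-old", "date": None},
    {"id": "dated-older", "date": now - timedelta(days=1)},
]

# First run: the undated entry is treated as brand new and takes the top slot.
print([e["id"] for e in newest_first(entries, now)])

# Later run: the entry was first seen three days ago, so it sinks.
seen = {"undated-old": now - timedelta(days=3)}
print([e["id"] for e in newest_first(entries, now, seen)])
```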

I’ve rerun the top500 script and now all the old entries have scrolled off of the bottom.  After a quick scan, the only obvious error I see is in a Gear Live entry — the subscription is to a URL containing rss_2.0, but this redirects to an Atom 0.3 feed which encodes the summary as escaped HTML, but lets the type default to text/plain.

And the output page is still well formed.

All in all, not bad.

Posted by Sam Ruby at
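
The Gear Live case can be sketched as follows. In Atom 0.3 a content construct without a type attribute is text/plain, so a consumer that takes the feed at its word has to show the markup literally rather than render it. The entry below is a made-up example, and `summary_as_html` is a hypothetical helper that only handles mode="escaped".

```python
import html
import xml.etree.ElementTree as ET

ATOM03 = "{http://purl.org/atom/ns#}"  # Atom 0.3 namespace

# Hypothetical entry: escaped HTML in the summary, but no type attribute.
SAMPLE_ENTRY = """<entry xmlns="http://purl.org/atom/ns#">
<title>Example</title>
<summary mode="escaped">&lt;p&gt;New &lt;b&gt;gadget&lt;/b&gt;&lt;/p&gt;</summary>
</entry>"""


def summary_as_html(entry_xml):
    """Return the summary as a string that is safe to drop into an HTML page."""
    summary = ET.fromstring(entry_xml).find(ATOM03 + "summary")
    media_type = summary.get("type", "text/plain")  # default when type is absent
    text = summary.text or ""
    if media_type == "text/html":
        return text  # real HTML; a caller should still sanitize it
    # Declared (or defaulted) plain text: escape it so tags show up literally.
    return html.escape(text)


print(summary_as_html(SAMPLE_ENTRY))
```

Readers then see the angle brackets on the page, which is the visible symptom of a feed that meant text/html but did not say so.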

You should run the Scoble blogroll.  It seems to have a couple hundred more entries than the Feedster top 500, and I wonder if it doesn’t have a good deal fewer 404’d and 410’d red-dashed-underline entries.

Posted by Kevin H at

The bug with our (technorati’s) OPML files should be fixed now. Sorry for the inconvenience.

Posted by ryan king at

Links for 2006-06-12 [del.icio.us]

Alex Barnett blog : MSReadr (or, RoboScoble) The Post Money Value: The Scoble Start Up Lessons for You Workbench: Robert Scoble, Naked Conversations and Exposed PCs Sam Ruby: Bueller? YouTube owns YourStuff | The Register Blogging Tools Survey -...

Excerpt from The RSS Blog at
