BBC Aggregator
Asemantics: A new generation of feed aggregators for Web 2.0 applications is being jointly developed by Asemantics and the British Broadcasting Corporation. As a first step in the process, Asemantics has completed the aggregator engine for the Memory Share service of the BBC.
I know cynicism is your gimmick, Mark, but that’s a bit of a stretch. The only word that doesn’t have a good reason to be there is “Web2.0”, and even with that it’s a pretty straightforward description of what they’re doing.
Posted by Brendan Taylor at
Sorry, I was put off by the “HOLY SHIT HERE COMES A PDF: Presentation” until I realized it was a Stylish script I’d written a long time ago:
@namespace url(http://www.w3.org/1999/xhtml);
a[href$=".pdf"]:before {
  content: "HOLY SHIT HERE COMES A PDF: ";
}
Lessee:
- DOCTYPE: XHTML 1.0 Strict
- Content type: text/html
- Number of times they mention RDF: 4
- Number of times they mention SQL:
- Number of badges adorning their footer: 4
- Number of SPARQL badges adorning their footer, despite the 4 mentions of SQL: 1
- Validation errors found by clicking on “Valid XHTML!” badge: 18
- Number of “web 2.0” presentations served as HOLY SHIT HERE COMES A PDF: 1
We have a winner!
My uneducated guess: they’re polling Atom feeds, lossily converting them to RDF, serializing them as RDF/XML, and storing the serializations as text blobs in a MySQL database.
At least they didn’t spell it “ATOM”.
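For the curious, here is a minimal sketch of what the "lossily converting them to RDF" step in that guess could look like. It is not Asemantics' code. The sample feed, the choice of Dublin Core predicates, and the function names `entry_to_triples` and `feed_to_triples` are all assumptions made for illustration. The loss is visible in what gets dropped: the `type` attribute on the summary, plus any authors, links, and categories.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical feed, made up for this example.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry>
    <id>http://example.org/memories/1</id>
    <title>First memory</title>
    <updated>2007-01-01T00:00:00Z</updated>
    <summary type="html">&lt;b&gt;Rich&lt;/b&gt; text</summary>
  </entry>
</feed>"""

# Assumed mapping from Atom elements to Dublin Core properties.
FIELD_MAP = (("title", "title"), ("updated", "date"), ("summary", "description"))


def entry_to_triples(entry):
    """Turn one Atom entry into (subject, predicate, object) tuples.

    Only the mapped fields survive; attributes such as type="html"
    and every unmapped element are silently discarded.
    """
    subject = entry.findtext(ATOM + "id")
    triples = []
    for atom_name, dc_name in FIELD_MAP:
        value = entry.findtext(ATOM + atom_name)
        if value is not None:
            triples.append((subject, DC + dc_name, value))
    return triples


def feed_to_triples(xml_text):
    """Parse an Atom document and collect the triples for every entry."""
    root = ET.fromstring(xml_text)
    triples = []
    for entry in root.iter(ATOM + "entry"):
        triples.extend(entry_to_triples(entry))
    return triples


if __name__ == "__main__":
    for triple in feed_to_triples(SAMPLE_FEED):
        print(triple)
```

Serializing the tuples as RDF/XML and stuffing the result into a text column is left out, since the guess does not say how that part would be done.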
Posted by Mark at
"My uneducated guess: they’re polling Atom feeds, lossily converting them to RDF, serializing them as RDF/XML, and storing the serializations as text blobs in a MySQL database."
According to page 14 of the presentation, you’re basically correct until that last step. There are two SQL databases, one that stores and indexes the fields they’ve designated as ‘searchable’, and another one that uses Redland to store the actual triples and from which the metadata can be retrieved for search results (using the URL as the joining key between the two DBs). This design suggests they started with everything in Redland but found query performance was poor.
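A rough sketch of that two-store layout, as I read it, might look like the following. This is an assumption-laden illustration, not what the presentation shows: SQLite stands in for both SQL databases (including the Redland-backed one), and the table names, column names, and the `store_item` and `search` helpers are hypothetical. The point is only the shape of the lookup: search the small indexed table first, then use the URL to pull the full metadata from the triple store.

```python
import sqlite3

# Store 1: only the fields designated as searchable, indexed for queries.
search_db = sqlite3.connect(":memory:")
search_db.execute(
    "CREATE TABLE searchable (url TEXT PRIMARY KEY, title TEXT, updated TEXT)"
)
search_db.execute("CREATE INDEX idx_title ON searchable (title)")

# Store 2: the full set of triples (a plain table standing in for Redland).
triple_db = sqlite3.connect(":memory:")
triple_db.execute(
    "CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)"
)
triple_db.execute("CREATE INDEX idx_subject ON triples (subject)")


def store_item(url, title, updated, triples):
    """Write one item to both stores; the URL is the shared key."""
    search_db.execute(
        "INSERT OR REPLACE INTO searchable VALUES (?, ?, ?)",
        (url, title, updated),
    )
    triple_db.execute("DELETE FROM triples WHERE subject = ?", (url,))
    triple_db.executemany(
        "INSERT INTO triples VALUES (?, ?, ?)",
        [(url, predicate, obj) for predicate, obj in triples],
    )


def search(term):
    """Find matching URLs in the search store, then fetch their triples."""
    rows = search_db.execute(
        "SELECT url FROM searchable WHERE title LIKE ? ORDER BY updated DESC",
        (f"%{term}%",),
    ).fetchall()
    results = []
    for (url,) in rows:
        metadata = triple_db.execute(
            "SELECT predicate, object FROM triples "
            "WHERE subject = ? ORDER BY predicate",
            (url,),
        ).fetchall()
        results.append((url, metadata))
    return results
```

The triple store is never asked to do the searching here, only keyed lookups by URL, which is consistent with the idea that query performance was the reason for the split.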
Posted by Michael R. Bernstein at
BBC Aggregator Asemantics: A new generation of feed aggregators for Web 2.0 applications is being jointly developed by Asemantics and the British Broadcasting Corporation. As a first step in the process, Asemantics has completed the aggregator...
Excerpt from r-echos at
From the puff piece:
...because the most important thing about using RDF is that you keep it to yourself.
BINGO!
Posted by Mark at