PaceSparqlLink - Atom Wiki

Abstract

Define a link relation that points to a SPARQL query endpoint.

Status

Initial pace draft.

Try it out:

A SPARQL interface for testing queries has been set up at http://roller.blogdns.net:2020/snorql.
That interface queries the entries on the Roller server at http://roller.blogdns.net:8080/roller/.
For a full description of how this works, see the SPARQLing Roller presentation.

Rationale

The Atom Charter used to have a section mentioning that there should be a semantics for Atom (what happened to this?). Such a semantics is being defined by the atom-owl group, which is working on the AtomOWL ontology, designed to fit the model of the atom-syntax very closely.

Such an official semantics not only gives us the tools to understand Atom with more precision, it also automatically makes it possible to query an Atom web site with the full power of the SPARQL query language, allowing an Atom service to offer a search facility for its store that is as powerful as it can possibly be.

This makes it very easy for a client to get exactly the data it is interested in, thereby reducing the load on the server. For example, it is very easy to write a query that asks only for entries that were changed on a particular day and written by a particular author. There is no need to download and process all the feeds to get that information.
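As a rough illustration of the kind of query described above, the following Python sketch builds a SPARQL query string for entries updated on one day by one author. It is only a sketch under stated assumptions: the function names are hypothetical, it assumes entries carry awol:author with an awol:name (this page only shows that pattern on feeds), and, like the examples on this page, it leaves the awol: and xsd: prefix declarations to the endpoint.

```python
from datetime import date, datetime, time, timedelta, timezone


def escape_literal(text: str) -> str:
    """Escape backslashes, double quotes and newlines for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def entries_updated_on(day: date, author_name: str) -> str:
    """Build a query for entries updated on `day` (UTC) by `author_name`.

    Assumption: entries expose awol:author / awol:name the same way
    feeds do in the examples on this page.
    """
    start = datetime.combine(day, time.min, tzinfo=timezone.utc)
    end = start + timedelta(days=1)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return (
        "SELECT ?id ?dt\n"
        "WHERE {\n"
        "    [] a awol:Entry;\n"
        "       awol:id ?id;\n"
        "       awol:updated ?dt;\n"
        "       awol:author [ awol:name ?name ] .\n"
        f'    FILTER ( ?dt >= "{start.strftime(fmt)}"^^xsd:dateTime &&\n'
        f'             ?dt < "{end.strftime(fmt)}"^^xsd:dateTime &&\n'
        f'             str(?name) = "{escape_literal(author_name)}" )\n'
        "}\n"
    )


query = entries_updated_on(date(2006, 10, 13), "Henry")
print(query)
```

The half-open interval (start of the day up to, but not including, the start of the next day) keeps the filter to a single calendar day in UTC.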

Proposal

Add a link relation of relation type "sparql" that points to a SPARQL endpoint. This link relation would appear in feeds.

  Client                                      Server 
    |                                           |
    |  1.) GET feed with sparql link            |     
    |------------------------------------------>|     
    |                                           |     
    |  1a.) Return feed document                |    
    |<------------------------------------------|
    |                                           |
    |  2.) Post SPARQL query                    |
    |------------------------------------------>|     
    |                                           |     
    |  2a.) Return SPARQL result                |    
    |<------------------------------------------|

1. A software agent finds a sparql link in a feed document

   <feed>
       <link rel="sparql" href="http://example.org/query/sparql.cgi?"/>
       ...
   </feed>

2. The software agent can query the endpoint using the atom-owl ontology
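The two steps above could be sketched in Python roughly as follows. This is a minimal sketch, not part of the proposal: the function names are hypothetical, the feed is the toy document from step 1, and the request is assumed to be a form-encoded POST with a `query` parameter, as the SPARQL protocol allows. The request is only prepared, never sent.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request


def find_sparql_endpoint(feed_xml: str) -> Optional[str]:
    """Return the href of the first feed-level <link rel="sparql">, if any."""
    root = ET.fromstring(feed_xml)
    for element in root:
        # Compare local names so the lookup works with or without
        # the Atom namespace on the feed.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "link" and element.get("rel") == "sparql":
            return element.get("href")
    return None


def build_query_request(endpoint: str, query: str) -> Request:
    """Prepare (but do not send) a form-encoded POST carrying the query."""
    body = urlencode({"query": query}).encode("utf-8")
    return Request(
        endpoint,
        data=body,  # supplying data makes urllib issue a POST
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+xml",
        },
    )


FEED = """<feed>
    <link rel="sparql" href="http://example.org/query/sparql.cgi?"/>
</feed>"""

endpoint = find_sparql_endpoint(FEED)
request = build_query_request(
    endpoint, "SELECT ?id WHERE { [] a awol:Entry; awol:id ?id . }"
)
print(request.get_method(), endpoint)
```

Passing the prepared request to `urllib.request.urlopen` would perform step 2 of the diagram against a real endpoint.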

Question: should the agent first query the database for the version of the atom-owl ontology? (needed in order to get test cases going before the ontology is formally accepted)
Question: is there a way to specify the version of the ontology used by the server in the link relation?

Example Usages

a) list all entry ids (not terribly useful)

SELECT ?id
WHERE { [] a awol:Entry;
          awol:id ?id .
      }

b) find all entries and their titles

SELECT ?entry ?title 
WHERE { 
         ?entry a awol:Entry;
               awol:title ?title .
 }

c) find all entries updated since 13 October 2006

SELECT *
WHERE {
     [] a awol:Entry;
        awol:id ?id;
        awol:updated ?dt .
     FILTER ( ?dt > "2006-10-13T00:00:00Z"^^xsd:dateTime )
}


d) find all distinct categories

SELECT DISTINCT ?cat 
WHERE { 
   [] a awol:Entry; 
      awol:category ?cat .
}


e) find all feeds with an author whose name contains "henry" (case-insensitive)

SELECT DISTINCT *
WHERE {
    ?feed a awol:Feed ; 
         awol:author ?auth; 
         awol:title ?feedTitle . 
   ?auth awol:name ?name .
   FILTER regex(?name, "Henry", "i")
}


f) list all entries in the collection belonging to the software category

SELECT DISTINCT *
WHERE {
    [] a awol:Entry;
       awol:id ?id;
       awol:category [ awol:term "software" ] .
}
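A client also has to read what comes back from queries like the ones above. The sketch below assumes the endpoint answers SELECT queries in the SPARQL Query Results XML Format and turns such a document into a list of dictionaries. The function name and the sample response (shaped like an answer to example a) are invented for illustration, including the guess that ids come back as literals.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

SPARQL_RESULTS_NS = "{http://www.w3.org/2005/sparql-results#}"


def parse_select_results(results_xml: str) -> List[Dict[str, str]]:
    """Map each <result> to a dict of variable name -> value text."""
    rows = []
    root = ET.fromstring(results_xml)
    for result in root.iter(SPARQL_RESULTS_NS + "result"):
        row = {}
        for binding in result.findall(SPARQL_RESULTS_NS + "binding"):
            # Each binding wraps one value element: <uri>, <literal> or <bnode>.
            value = binding[0]
            row[binding.get("name")] = value.text or ""
        rows.append(row)
    return rows


# Hypothetical response to example a); the ids are made up.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="id"/></head>
  <results>
    <result><binding name="id"><literal>tag:example.org,2006:entry-1</literal></binding></result>
    <result><binding name="id"><literal>tag:example.org,2006:entry-2</literal></binding></result>
  </results>
</sparql>"""

rows = parse_select_results(SAMPLE)
for row in rows:
    print(row["id"])
```

This sketch drops datatype and language information; a fuller client would keep the `datatype` and `xml:lang` attributes of each literal.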

Impacts

No negative impacts.

Positive:

solves the whole query language problem.
makes it very easy to get exactly what is desired from the server, thereby reducing bandwidth problems
gives clients the ability to incorporate "smart folders" a la OSX

Notes

CategoryProposals