It’s just data


Tim Bray: So why isn't there a tag there saying this is a property?

This question bothers me at so many levels.

HTML is a wonderfully quirky grammar.  Because it is extensible, we have wonders like blink tags.  Initially, the right to extend the grammar was reserved by browser vendors, but eventually with the advent of CSS, a revolution was created whereby content creators could devise their own tags.

Unfortunately, they are all named div.

Grammars in which all elements are named the same bother me.  Particularly ones in which the values are all presumed to be a string, and achieve XML well-formedness by virtue of obscuring structure via XML encoding.

While it was greatly maligned, RSS 0.90 really wasn't all that much different from RSS today.  What it got right was that things like titles were represented as <title> instead of <PV name="title">.
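To make the contrast concrete, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree`. The two sample documents and the helper names (`title_from_custom`, `title_from_fixed`) are invented for illustration and are not taken from any RSS specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical item in a custom grammar: the element name carries the meaning.
CUSTOM = "<item><title>Atomists</title><editor>Sam</editor></item>"

# The same hypothetical item in a fixed grammar: every element is named PV,
# and the meaning is pushed into an attribute value.
FIXED = '<item><PV name="title">Atomists</PV><PV name="editor">Sam</PV></item>'


def title_from_custom(xml_text):
    """Look the title up directly by element name."""
    return ET.fromstring(xml_text).findtext("title")


def title_from_fixed(xml_text):
    """Scan every PV element and compare its name attribute."""
    item = ET.fromstring(xml_text)
    for pv in item.iter("PV"):
        if pv.get("name") == "title":
            return pv.text
    return None


print(title_from_custom(CUSTOM))
print(title_from_fixed(FIXED))
```

Both functions return the same string. The difference is that the fixed grammar hides the structure from generic XML tooling, so the consumer has to rebuild it by hand.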


Back in the early days of the XML Protocol WG, Eric Prud'hommeaux developed a protocol matrix that identified several facets for each protocol, several of which apply to any XML application.

One of the facets I proposed was that of grammar types: custom, automatic, or fixed grammars.

As examples, RSS 2.0 is custom, RSS/RDF 1.0 is automatic, and RPV is fixed.

My current preference is automatic, then custom, and, rarely, fixed.  Custom has recently been gaining on automatic.

Posted by Ken MacLeod at


Posted by Dave Winer at

All the points you make are good, Sam, but I don't feel like you addressed Tim's concern.  When I read that question in context, I think what he was trying to get at is this:  If you have tags like <title> and <ex:editor>, how can you 'tell' that they are properties?

It seems to me that Tim doesn't want to infer that properties are defined by their position on the tree.  But you know what? I'm not sure why. I'm not sure why he's reluctant to assume that everything enclosed by a 'resource' of some kind is automatically a property.

At this point I have to reveal my ignorance by asking a simple question:  Can't we use a schema to describe which tags are properties and which tags are resources?  I hated to ask that question, because RDF is about creating metadata for content, and if I were to advocate the course of action implied in my question, then we'd need metadata to describe the syntax we're using to describe the metadata we're compiling for our set of resources.  The madness has to stop somewhere.  But... it may give you the ability to say <title...> and Tim the ability to clearly see what's a property of what.

Another thought:
Like you, I personally don't really like <pv  name="title">.  But suppose I'm creating metadata on the fly without really knowing what kind of ontology it's all going to fit in: a bottom-up approach. The <pv name="title"> route seems more pragmatic than using <title> itself and creating a schema to describe the resource-property relationships, because I'm not going to immediately see when working from the bottom up that <title> is necessarily the best tag to describe that kind of data.

And so by using <pv>, we go back to <div> hell.  :)  At least, until we have a body of information large enough to transform the <pv>'s into something more meaningful.
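That later transformation could be sketched roughly as follows, in Python with the standard library's `xml.etree.ElementTree`. The function name `promote_pv` and the sample input are hypothetical, and the sketch assumes every `name` value is already a legal XML element name.

```python
import xml.etree.ElementTree as ET


def promote_pv(xml_text):
    """Rename each <pv name="x"> element to <x>, dropping the name attribute.

    Assumes each name value is a valid XML element name; no validation is done.
    """
    root = ET.fromstring(xml_text)
    # Materialize the matches first so renaming cannot disturb the iteration.
    for pv in list(root.iter("pv")):
        name = pv.attrib.pop("name", None)
        if name:
            pv.tag = name
    return ET.tostring(root, encoding="unicode")


# Hypothetical bottom-up metadata, written before any ontology was chosen.
SAMPLE = '<item><pv name="title">Atomists</pv><pv name="editor">Sam</pv></item>'
print(promote_pv(SAMPLE))
```

Going in this direction is mechanical. Deciding which names deserve to become elements is the part that needs the larger body of information.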

Posted by Robert Hahn at

Robert: are we trying to represent a property or are we trying to represent an editor?  Also, note the title of this blog entry.

Posted by Sam Ruby at

I'm kind of new to this, but Robert, when you say "Can't we use a schema to describe which tags are properties and which tags are resources?" - isn't this the purpose of an ontology? e.g. the definition of the FOAF schema ( ) has things like (!abbreviated versions!) :

<rdfs:Class rdf:about=""  />


<rdf:Property rdf:about="" >
  <rdfs:domain rdf:resource=""/>
</rdf:Property>

and I'd guess that that RDF file was used to generate a human-readable table of this information:

Posted by Phil Wilson at

Sam: Thanks for your response.

It would seem to me that if the purpose of using RDF was to make statements that conform to a Resource-Property-Value model, then the markup should match the model.  By doing this, I'm aiming for the more general case, which should maximize interoperability opportunities.

To answer your question then, I'd pick representing a property over an editor.  I would do this because I think that if I were to use an editor tag without any additional context (ie: that it is a property of some resource), then I would be merging my ontology with my RDF - and I think that such a merge is bad design.  One of XML's goals is to separate presentation from markup. I would suggest that using ontological references inside of an RDF file merges the presentation (the ontology) with the markup (RDF/XML).

I realize that this position may seem at odds with my preference for using an <editor> tag over a <pv> tag, but I've already mentioned that providing some kind of mapping that clarifies what tags are properties (among other relevant mappings) would resolve that.

So: why do you ask that particular question?  And: At the risk of derailing the thread, I must confess not seeing the connection, even after Googling Atomists, between your title and your content.  And honestly, I didn't put much weight into the title because there are so many people who title their blogs or articles badly <shrug>.

Posted by Robert Hahn at

Ian Davis' 14Nov2002 response to RPV is a larger example of the issue with RPV from a readability perspective.

Posted by Ken MacLeod at


I too am not an expert in RDF/XML, so when you made your points, I was eager to see what was done w.r.t. FOAF. Yes, everything you say is absolutely correct.  When I wrote my first comment, I did not know what the implementation details were for describing ontologies.  And I didn't think it mattered for this discussion. 

What mattered to me was that we have on one hand Sam (and supporters) who apparently would love to merge ontologies into the document specifying the RPV's, and on the other we have Tim (and supporters) who have an equally valid requirement to have a document clearly indicate what the resources, properties and values are.  Sam's approach results in clear markup that will work only in certain information spaces.  Tim's approach results in less clear (but to me, still acceptable) markup that would work in many information spaces.

This is hard stuff, and the more I write on this, the more I can't help but feel that I'm rehashing a discussion that has taken place many many times before by many many other passionate people. 

I feel that both arguments have merit, but I've asked Sam how he thinks the idea of properties can be brought into an RDF file written 'his' way.  I've yet to hear it - or if I heard it, then I certainly don't understand it.

Posted by Robert Hahn at

Robert: are you a collection of atoms?  What distinguishes you from the collection of atoms that represents me?

I agree that markup should match the model.  I am questioning the model.

Is an RSS feed a collection of items or a collection of properties?

Posted by Sam Ruby at


I am understanding you better.

I would say an RSS feed is a collection of items.  I would say that because an RSS feed is (to drop into Object Oriented terminology) an instance of RDF syntax.

In my mind, then, the whole issue changes slightly:  what's the correct model? And this is what you're asking too. 

Is it: 

Write the Statements (or R-P-V's) in the general case (Tim's way), and define in a separate place one (or more) ontologies that can make use of it,

or

Use the ontology right in the markup, and provide a separate document that describes the relationships?

I would say that it would largely depend on the application, but speaking for myself, if I were to prefer the first option, I would still hope to see enough of the R-P-V model visible in the syntax to make it easy to grok what it is I'm describing.

Posted by Robert Hahn at

Robert, we are making good progress.  Now let me pick on one word, namely the.  When you say it depends on the application, which application are you talking about?

For purposes of information exchange, we are talking about a collection of items.  If you want to see my RSS 1.0 feed as a collection of properties, that's fine.  If I want to produce that via a template, that's OK too.
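As a rough illustration of reading the same document both ways, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The feed below is a simplified, namespace-free stand-in rather than real RSS 1.0, and the `about` attribute and the `as_triples` helper are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical feed: a collection of items.
FEED = """<feed>
  <item about="http://example.org/1"><title>First</title><editor>Sam</editor></item>
  <item about="http://example.org/2"><title>Second</title></item>
</feed>"""


def as_triples(xml_text):
    """View each item's children as (resource, property, value) triples.

    The property role is inferred purely from position in the tree:
    anything directly inside an item is treated as a property of it.
    """
    feed = ET.fromstring(xml_text)
    triples = []
    for item in feed.findall("item"):
        resource = item.get("about")
        for child in item:
            triples.append((resource, child.tag, (child.text or "").strip()))
    return triples


for triple in as_triples(FEED):
    print(triple)
```

The markup stays a collection of items. The collection of properties is just one view that a consumer can derive from it.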

One of the lessons of the Web is that once you specify a representation for something, people will build new surprising and often useful ways to generate it.

Posted by Sam Ruby at

Sam: Good questions, but I'm not clear what progress 'we' have made, seeing that you've given no indication to me of either expanding or repositioning your stance.  But then, I suppose for you to say that I've made progress would sound condescending. :)

When it comes to which application I'm talking about, well, I haven't given it a lot of thought.  It would seem to me that describing some collection of statements as an RPV triple has much broader coverage than describing things as RSS.  Aside: I note that despite your premonition of greater things in store for RSS, you haven't exactly been forthcoming as to ideas on what those greater things are either.  As I've said before, this is hard stuff.

What I can imagine is a class of applications that would conceivably access a huge mass of ontology-free statements.  The ontologies themselves would either be encoded within a representative application, or specified by someone using a representative application.  Boiled down, it doesn't sound much different than XML-ified SQL querying a decentralized database.  But while XQuery (which I'd imagine many people are thinking of) works on XML documents directly, what I'm imagining is an engine that can query just metadata.  I think that's a valid and useful kind of application because I'm assuming that there is less metadata than data (in terms of byte count and complexity of structure), that it would conceivably take less work to query metadata for what you need than data, and that all metadata will point to data.

As for your other remarks, I have no disagreements.

Posted by Robert Hahn at

Sam Ruby: "While it was greatly maligned, RSS 0.90 really wasn't all that much different from RSS today. What it got right was that things like titles were represented as <title> instead of <PV name="title">."...

Excerpt from Scripting News at

Robert, my basic problem is that "One man's metadata is another man's data".  There may be a clear distinction in your mind between these two, but there isn't one in my mind.

Posted by Sam Ruby at

Sam:  For every complex problem there is an answer that is clear, simple, and wrong. With that, please do not assume that the difference between data and metadata is clear to me - because I haven't spent that much time thinking about it.  But I am willing to learn.

But thank you anyway for this interesting dialogue.  I just saw David's thoughts on the matter, and am delighted to see his wry observation: "I see lots of people with strong opinions and not much software."  I am guilty as charged, it seems.  But perhaps someday I'll have something more substantial to present - even if it only serves as a stepping stone to a greater idea.

Posted by Robert Hahn at

RDF: Binary XML

Tim Bray has had a couple of essays on RDF, including an offer of a domain,, to someone who invents an RDF tool he would like to use. I wonder if he likes poetry? I wonder if I'm interested in yet another domain? Tim's pushback isn't against... [more]

Trackback from Burningbird


Data vs. Tools

A number of people have been commenting on Tim Bray's suggestion for "simplified" markup in things like RDF. The most interesting one so far came from Burningbird:

"Sure, View Source is how people learned to create web pages, or to... [more]

Trackback from Big Damn Heroes (Tech) at

Quick links from June 2002: Re: Potential new issue: PSVI co  Sam Ruby: Atomists  Burningbird: RDF: Binary XML  Russell Beattie Notebook - Saturday, May 24, 2003  My Delusional Dream: LPD for fun and MP3 playing  Port80... [more]

Trackback from l.m.orchard


Better Messaging

I agree.  However, I would like to add that once you accept that, you can go on to produce better models, ones that are more suited to transformation. For starters, it helps if the messages are cleanly and thoroughly specified.  And adopt... [more]

Trackback from Sam Ruby


The Message Is The Medium

The title comes from Rohit's Rant on Personal Servers.

... [more]

Trackback from Mod-pubsub blog at

libxml2 screams

libxml2 is apparently optimized for parsing lots of small files.  I tested the theory by running a fairly realistic query against all of the weblog entries on my site.  This completed in less than a second.  This does not mean that I... [more]

Trackback from Sam Ruby


The Ongoing Saga of JMX

Easy Is Not Always Simple provides a nifty overview of the history of JMX’s architecture. It’s interesting because it explains the design decisions that were intended to make JMX simple and flexible and superior to other management APIs. Though, as...

Excerpt from discipline and punish at
