Aaron Swartz: These changes also have the side effect of making feeds valid RDF.
This statement surprised me. So I tried it. I didn't get it right the first time, but after Aaron gave me a few suggestions on IRC last night, I finally got there.
minAtom.rdf and maxAtom.rdf, both valid RDF.
Note: this is not an official snapshot.
Sure, that's a valid RDF document. The document is:
<RDF></RDF>
The other elements are all in the Atom namespace, so an RDF-only parser shouldn't be touching them.
Just pointing out that I don't believe the rdf:RDF root element is required. Just throw the namespace declaration on "feed".
P.S. s!#!/!
Martin,
If you run the maximal feed through the RDF validator you will see that it picks up quite a few triples.
And I can't help but note that I can't just include a link here to the validator output since the RDF validator uses POST for its form instead of GET.
Simon, you're missing the secret sauce; the parseType attributes. These help morph the RDF parse model.
I suggest not wrapping the feed in rdf:RDF. That automatically rubs people the wrong way, plus has some technical issues; chief among them, IMO, being that the same bit-stream can't just use a different content-type to dispatch it to an RDF processor or an Atom processor (or an XML processor even).
Martin, putting elements into containing RDF tags isn't the only requirement for it being valid RDF/XML. But the other tweaks sure did make this valid RDF/XML, if a bit dependent on blank URIs, which actually lessens a bit of its usefulness in favor of 'hiding' the RDFness.
By adding the tweaks this way, you get a bonus -- you're bringing this syndication feed into format with a host of other vocabularies, leaving doors open rather than closing them.
This, to me, and the effort leading to this, is where the road forked, and you're now looking at the possibility of an RSS 2.0/RSS 1.0 'killer', because you're showing that the RDF overhead can be minimized, and both worlds can be satisfied. Heh -- Grand Unification theory at a micro level.
Additionally, you can then use other vocabularies -- replace vocabulary specific elements such as name with DC or even FOAF elements.
(It also looks a little like my simplified RSS at http://rdf.burningbird.net/archives/000516.htm , except without the content, different tag names, and using specific URIs to identify elements.)
The only thing -- drop the version. I'd post this on the mailing list in response to the back and forth between Danny and Aaron on this, but am not getting all the emails to respond to -- versioning because of breaks with backwards compatibility is antithetical to the use of namespaces within RDF/XML. Backwards compatibility breaks are antithetical to any good data scheme.
How much do blank URIs lessen the usefulness of the RDF?
Wouldn't you rather do a transformation on the plain XML version that gets the URIs right and at the same time translates the properties to the Dublin Core properties, and possibly other existing vocabularies?
Sjoerd, because the link is nothing more than a property -- adding a URI reference to the individual posted item adds in a unique identifier for that item, making it easier to work with in other vocabularies -- or in other feeds (it's 'identified').
For instance, if one pings an item, this ping has an associated URI of the item being referenced and the item doing the referencing -- you have the beginnings of a thread. Eventually, through the use of namespaces, we could add threading information to this feed about the item being pinged. Smarter aggregators could see the new RSS item coming in, with its ping information, and do interesting stuff with it.
You could say, well they can with link too, but there's no understanding of link within Pie/Echo/Atom being 'unique'.
Blank URIs don't allow us to combine information from different sources. Using URIs does.
As for transformation -- why? Why add this additional programmatic step, just to avoid a few tweaks in the syndication format? Why add the complexity and burden on every user of the syndication feed, when a couple of minor changes in the syndication format makes this valid RDF/XML? This won't make any difference in the URIs, so I'm not sure where you're coming from with this one.
Blank URIs will allow us to combine information from different sources, but I do have a similar concern - the use of :
<entry><id>URI</id></entry>
is a bit mangled.
The question is do we want this to actually be RDF, or close-to RDF?
As true RDF it would certainly make a lot of jobs a damn sight easier.
My guess is there'd be backlash to <rdf:RDF> and parseType attributes, otherwise why isn't everyone already using RSS 1.0? I guess a poll could settle this easily enough.
By making it RDF we would be giving ammunition to those who have been trying to undermine this project. "Right, over there that's complicated IBM terrorist-funded RDF stuff, but you don't want that, come and eat some of Uncle's plain XML cookies...".
Either way I do think it's close enough that we can use RDF's extension mechanism, the tastiest bit of RSS 1.0.
Shelley - re. versioning - do you have examples of how things have worked in the various approaches? I read somewhere recently Tim Bray pointing to hassles with changing namespaces to change version, but can't now find the ref. My own opinion is now 'undecided' ;-)
(I could've sworn I already posted this ten minutes ago, apologies if it turns up somewhere strange)
A transformation does make a difference in the URIs. It's not that the right identifier isn't there, it's just not recognizable to the RDF parser. A transformation written specifically for Pie/Echo/Atom can transform the id element to an rdf:about attribute.
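To make the id-to-rdf:about idea concrete, here is a minimal sketch in Python rather than XSLT. The Atom namespace URI below is a placeholder (the draft's real namespace was still being argued over), and `id_to_about` is a hypothetical helper name, not anything from the spec.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace for the draft format (an assumption, not the real URI).
ATOM = "http://example.org/atom#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def id_to_about(feed_xml):
    """Move each entry's <id> text into an rdf:about attribute on the entry."""
    root = ET.fromstring(feed_xml)
    # Materialize the list first so removing children does not disturb iteration.
    for entry in list(root.iter(f"{{{ATOM}}}entry")):
        id_el = entry.find(f"{{{ATOM}}}id")
        if id_el is not None and id_el.text:
            entry.set(f"{{{RDF}}}about", id_el.text.strip())
            entry.remove(id_el)
    return root

sample = (
    f'<feed xmlns="{ATOM}">'
    "<entry><id>http://example.org/2003/07/entry1</id><title>Hello</title></entry>"
    "</feed>"
)
converted = id_to_about(sample)
```

After this runs, the identifier is in a place an RDF parser would treat as the subject of the entry, and everything else in the entry is untouched.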
Why add the complexity and burden on every user of the syndication feed, when a couple of minor changes in the syndication format makes this valid RDF/XML? Because you should turn this question around: Why add the complexity and burden on every producer of a syndication feed, when a couple of minor changes to the aggregators makes this work?
btw, has the order of entries been worked out (by one of the dates?)? Otherwise the RDF will also need to express that
Why does it need to? It's not as if RDF databases can't sort.
Shelley,
The only thing -- drop the version. I'd post this on the mailing list in response to the back and forth between Danny and Aaron on this, but am not getting all the emails to respond to -- versioning because of breaks with backwards compatibility is antithetical to the use of namespaces within RDF/XML. Backwards compatibility breaks are antithetical to any good data scheme.
Why does a version number automatically translate to backwards incompatible breaks?
Sjoerd, I created a new MT template for this feed in five minutes -- well, it took 10 because my cat wanted a cuddle first. You can see the generated results here http://weblog.burningbird.net/atomfeed.rdf . The only reason it's not validating is that my content is not proper XHTML -- but normally I wouldn't include full content in my feeds anyway.
The saying of 'adding a burden' to the feed generators is a bird that just don't fly anymore. Tell the generators what they need to do and they can do it, about as easily as I did it. The architects work out the RDF/XML issues and provide the structure -- the rest is just pattern generation.
What do you get in return -- unification. After three bloody years, you get unification.
Dare, I'm reading the comments out on the threads, and what the folks are saying is that versions must be maintained so that breaks in backwards compatibility can be signaled. I'm just following along with the conversation.
As for sorting -- I never agreed with the use of Seq in RSS 1.0, and I've said this. Order should not be a part of this model -- information that allows people to order should. Name of contributor, posted date, title -- doesn't matter. Allow the tools and the people to choose, just supply the data.
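A rough sketch of "just supply the data and let the tool choose the order", assuming every entry carries a `modified` element in UTC `YYYY-MM-DDThh:mm:ssZ` form. The namespace URI and the function name are placeholders for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

ATOM = "http://example.org/atom#"  # placeholder namespace (assumption)

def entries_newest_first(feed_xml):
    """Return the feed's entries sorted by their modified date, newest first."""
    root = ET.fromstring(feed_xml)

    def modified(entry):
        text = entry.findtext(f"{{{ATOM}}}modified")
        return datetime.strptime(text.strip(), "%Y-%m-%dT%H:%M:%SZ")

    return sorted(root.findall(f"{{{ATOM}}}entry"), key=modified, reverse=True)

sample = (
    f'<feed xmlns="{ATOM}">'
    "<entry><title>Older</title><modified>2003-07-01T12:00:00Z</modified></entry>"
    "<entry><title>Newer</title><modified>2003-07-03T08:30:00Z</modified></entry>"
    "<entry><title>Middle</title><modified>2003-07-02T09:00:00Z</modified></entry>"
    "</feed>"
)
titles = [e.findtext(f"{{{ATOM}}}title") for e in entries_newest_first(sample)]
```

Swapping the key function for title or contributor name gives a different ordering from the same data, with no Seq in the feed.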
One does not use RDF/XML, or not use it, based on this as a criterion. Compatibility my friends -- keep your eyes on the big picture, not the minutiae.
@Joe: You can include a direct link to the validator output since the servlet understands GET as well (although that doesn't seem to be documented). Here comes the URL:
That's also nice as a bookmarklet.
re: "multipart should work without any additional changes"
How are you expressing the ordering? content elements within a multipart are published in order of increasing fidelity.
By making it RDF we would be giving ammunition to those who have been trying to undermine this project.
Since I can't comment usefully on the tech, I'll comment usefully on the procedure: Don't make decisions on a technical format based on personality or political conflicts that have nothing to do with features or implementation. Those conflicts won't be solved by making concessions to people who won't use the format regardless of what it looks like.
Quite interesting notion. Long ago when I was fool enough to try and convince the RSS 1.0 group to evolve that spec into something more appropriate, I made a post about an RSS with
like format with streamlined RDF.
My knowledge of RDF is quite limited and it really owes much of its brilliance to Sean Palmer who based his work off a proposal by Shelley.
As I understand it, the rdf:RDF element wrapper is really not necessary and is probably best avoided given past history.
http://www.w3.org/TR/rdf-syntax-grammar/#start
If the content is known to be RDF/XML by context, such as when RDF/XML is embedded inside other XML content, then the grammar can either start at Element Node, RDF, [...] or a production nodeElementList
So, why wouldn't this apply to the RDF Atom feeds?
+1 to Anil's suggestion.
I think pursuing RDF should absolutely be considered if it makes sense -- and I think it does. The presence of explicit RDF elements should be minimized if possible, not for reasons of politics or personality, but for streamlining and reducing overhead.
I'm just a "normal" user observing all this from the sidelines, but it seems that making it RDF ensures the "unity" of RSS types Shelley is mentioning + future-proofs it (if RDF grows as expected)!
Creating it with RDF or not in the structure does not make a difference to the masses (or publishers) that adopt it. What we need is ONE standard - people will follow it. While we don't have that standard, we only reduce the value of RSS as a whole (and Atom in particular).
Speaking purely from a developer POV, I just don't understand this desire to tack on "kinda" RDF. I don't see what's gained at all. Throwing RDF in will make the spec more complex and will necessarily require a higher investment since it seems like you'll have to grok RDF in order to safely extend and/or customize Atom (or risk nuking the RDF validity of Atom documents). Furthermore (though I'll probably be shouted down for raising this point) RDF just isn't a safe investment. There are still real problems with RDF and the TAG web architecture and then the whole 'Where's the Resource?' debacle that could easily be avoided by sticking to plain old XML.
So where are the use cases for this change? What are the motivations? Since Atom is such a strongly defined spec it seems like it'd be trivial to define a standard transform (via XSLT) that would allow the RDF people to gain all the benefit of a native RDF format without forcing the rest of us to eat the complexity penalty.
Don,
I definitely agree. I haven't seen anyone describe the concrete benefits of making Atom compliant with RDF so this just seems like an effort in Semantic Web buzzword compliance.
The one or two benefits I've heard outlined are either incorrect (RDF solves the mixing namespace vocabularies problem) or are just as possible with straight XML.
Avdi,
a.) How is this graph useful?
b.) Given that I could write an XSLT stylesheet that could graphically render an ATOM feed in any dozen possible ways why is this graph special?
Dare:
a) It's not especially, other than the fact that for some people a visualization is the best way to demonstrate a data structure. The importance of it is that it was created without any code.
b) You know XSLT. That's great. I understand its essentials and can write a stylesheet with my copy of "XML in a Nutshell" on my knee. But the question is, how much is J. Random Hacker, or even John Q. User, going to be willing to learn and/or code in order to implement his clever idea? We all know that with the right code we can transform Atom into just about anything we want; but even if the Atom project is good enough to provide canonical XSLT transforms for some formats the average weblog isn't going to provide those transformations for free.
This is an issue of lowering the bar to the most likely user. True, you might ask why it's important to lower the bar for RDF coders in particular. I happen to think that RDF is going to be the most requested transformation. There's a lot of RDF out there, and it's only going to grow.
You want a hypothetical example? Supposing it becomes popular to add example:bookRating elements to the Atom listings for blog entries which review a book. Supposing also that I can get an RDF feed from Amazon listing their current bestsellers. Now suppose I want to get a summary of the ratings people on my blogroll have given to bestsellers. If Atom isn't valid RDF, I have no choice but to write code. I have to understand XSLT, locate the right transform, and write a program which transforms Atom to RDF, funnels all the resulting RDF into a store, and then queries the store.
On the other hand, it is not at all hard to imagine a generalized tool appearing in the near future (if it doesn't exist already) that lets me specify my data sources (the Atom feeds and the bestseller feed) and then construct my query point-and-click without writing a single line of code. For a lot of transient uses this would be all I really needed; and for the cases where I wanted to do more with the data, or embed the query into my own blog, it would serve as a useful prototyping tool. In a lot of cases the ease of being able to try it out without going to a lot of trouble would be the difference between actually doing it and giving it a miss.
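For anyone who wants to see what such a query boils down to, here is a toy sketch with triples as plain Python tuples and no RDF library at all. Every URI, predicate name, and value below is made up for illustration; only the shape of the merge-then-join matters.

```python
# Triples are (subject, predicate, object) tuples. All names are hypothetical.
RATING = "http://example.org/ns#bookRating"
REVIEWS = "http://example.org/ns#reviews"
BESTSELLER = "http://example.org/ns#bestsellerRank"

# Statements that might come out of two blog feeds.
blog_triples = {
    ("http://blog-a.example/entry/1", REVIEWS, "urn:isbn:0000000001"),
    ("http://blog-a.example/entry/1", RATING, "4"),
    ("http://blog-b.example/entry/7", REVIEWS, "urn:isbn:0000000002"),
    ("http://blog-b.example/entry/7", RATING, "2"),
}

# Statements that might come out of a bestseller feed.
store_triples = {
    ("urn:isbn:0000000001", BESTSELLER, "3"),
}

def ratings_for_bestsellers(triples):
    """Join reviews to bestsellers on the shared book URI and return their ratings."""
    bestsellers = {s for s, p, o in triples if p == BESTSELLER}
    reviews = {s: o for s, p, o in triples if p == REVIEWS and o in bestsellers}
    return sorted(
        (reviews[s], int(o)) for s, p, o in triples if p == RATING and s in reviews
    )

# Merging sources is just a set union because both use the same URI for the book.
result = ratings_for_bestsellers(blog_triples | store_triples)
```

The join only works because both sources name the book with the same URI, which is the point made earlier in the thread about URIs versus blank nodes.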
re: "Supposing also that I can get an RDF feed from Amazon listing their current bestsellers."
This argument is somewhat undercut by the fact that you can already get this information from Amazon... as RSS 0.91. And their RSS feeds are automatically generated from their own proprietary DTD by means of... XSLT.
Furthermore, their engine is (currently) wide open, it accepts remote XSLT documents as query string parameters and can therefore perform any transformation you like. If you want the same data in RDF, there are absolutely no technical barriers stopping you from writing an appropriate transform and publishing the URL.
In other words, if you want RDF, you can pay the RDF tax. But please don't force it on the rest of us.
Mark:
I was unaware of that, and that's very cool. But honestly, how many websites do you think are ever going to "pay the XSLT tax", to use your terminology, and allow anyone to send an arbitrary transformation? I don't see that feature coming to gramma's muffin blog anytime soon.
What is the "RDF Tax"? The post that started this thread showed that Atom could be trivially made RDF/XML compatible, and as later commenters have pointed out, you don't even need the "rdf:RDF" container that some have such a strong reaction to. I've been trying to follow this thread both here and on the mailing list, but I've yet to see a clear example of this "RDF Tax".
Avdi,
All you've said is that someone has already thrown up a website that does RDF graph visualization while someone hasn't done the same using XSLT showing an XML node tree. I sincerely doubt that this is the case and even if it was it is a feeble argument. I can get home after work and throw up an ASPX page on my machine that accepts XML documents and draws a node tree using XSLT by the end of the evening.
So what?
Sam,
"Mark, if you are going to use terms like "RDF tax", can you do me a favor and quantify exactly what that tax is in the context of exhibit B?"
Are there any limitations on how I can add extension elements and attributes? If so, there's your tax.
Dare:
Great, so for the simplest possible application of RDF, raw XML is sufficient. What about the slightly more complex? What about that query tool I hypothesized for the bestsellers example? Similar tools already exist in the RDF world - can you quickly whip one up for raw XML which isn't specific to a single task but consists of more than just a list of inputs and a box to enter raw XQuery?
Dare, chaos is also a tax. A more subtle one, but a real one nevertheless. For example, choosing between attributes and elements capriciously increases the learning curve.
If we can come up with some simple guidelines that don't limit what can be expressed but tend to encourage similar things to be expressed in a similar manner, then we all benefit. Particularly if those guidelines are expressible in a manner that the validator can enforce.
I am by no means knowledgeable on RDF, but it appears that the limitation RDF imposes on attributes are that they must be in a namespace. This should not limit what can be expressed, or the extensibility of the model.
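If that reading is right, checking a feed for the offending attributes is a small job. A sketch, using only the standard library; the namespace URI in the sample is a placeholder and `unqualified_attributes` is a made-up name.

```python
import xml.etree.ElementTree as ET

def unqualified_attributes(xml_text):
    """Return (element tag, attribute name) pairs for attributes in no namespace."""
    root = ET.fromstring(xml_text)
    found = []
    for el in root.iter():
        for name in el.attrib:
            # ElementTree writes namespaced names as "{uri}local".
            if not name.startswith("{"):
                found.append((el.tag, name))
    return found

sample = (
    '<feed xmlns="http://example.org/atom#" version="0.1" xml:lang="en">'
    "<title>Demo</title>"
    "</feed>"
)
offenders = unqualified_attributes(sample)
```

Here only `version` is reported; `xml:lang` lives in the XML namespace and the `xmlns` declaration is not an ordinary attribute as far as the parser is concerned.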
I was very much surprised by the outcome of the exercise that Aaron initiated. Atom feeds seem very hierarchical to me, but with little or no effort this could be adapted into a format that can be readily digested into a relational model. And, quite frankly, the result - when expressed in a hierarchical format, appears more consistent.
Avdi,
Similar tools already exist in the RDF world - can you quickly whip one up for raw XML which isn't specific to a single task but consists of more than just a list of inputs and a box to enter raw XQuery?
How about a GUI tool that uses XQuery in the background? Such as allowing you to compose XQuery expressions in a drag and drop manner the same way that many tools allow you to do for SQL.
Given that XQuery isn't done yet I'm actually surprised by the number of folks who've announced that they plan to build such XQuery GUI tools. I believe BEA has screenshots of such a tool at http://edocs.bea.com/liquiddata/docs10/querybld/design.html#1104374 and I've heard others propose similar tools.
Re: RDF Tax
I know enough XML to get around. I could extend RSS by way of namespaces if I wanted to.
Like a lot of people, I know zero RDF. If one of the things to keep in mind in defining (and extending) Atom was RDF compatibility and validity, what would I have to learn?
Whatever that is, that's the RDF tax to me as a non-RDF savvy person.
Re: RDF Tax
If I understand this correctly, then all that is being proposed is structuring Atom in such a way that it will fit nicely into an <rdf:RDF> element.
Thus the "RDF Tax" would only apply to Atom itself. If you want to create a namespace for your Atom feed, you can create it without worrying about RDF since you don't have to use your namespace in RDF.
Of course, this may not be acceptable to the people involved in defining Atom's structure. But, the rest of us shouldn't have to worry about it.
Sam,
Before folks get carried away with trying to second guess what the "RDF tax" actually is, it'd be nice if someone familiar with RDF (hey Shelley!!!) could explain what the requirements of RDF-like XML are.
I remember reading an article about these restrictions a while ago. Lemme see if I can dig this up.
...
Here we go. Make Your XML RDF-Friendly
http://www.xml.com/pub/a/2002/10/30/rdf-friendly.html
OK, I read the "Make your XML RDF-Friendly" article Dare linked to. If I'm understanding it correctly, here's what we would need to do to make Atom "RDF-friendly" (# references refer to the numbered suggestions within the article):
1. (#1) Get rid of all attributes that are not in a namespace, starting with feed/@version. This has already led to the inevitable permathread on versioning (starting here: http://www.imc.org/atom-syntax/mail-archive/msg00224.html ).
2. (#2) Get rid of the feed/id element and put the id in the rdf:ID attribute of the feed.
3. Ditto entry/id.
4. (#3, #4) Change the feed/link element into an empty element with the value of the link in the rdf:resource attribute. Either that, or move it to the rdf:about attribute on feed itself. I'm not sure which, and I vaguely recall a permathread on RSS-DEV about this as well. No doubt we will get 3 different answers.
5. Ditto entry/link.
6. Ditto author/url.
7. Ditto contributor/url.
8. (#5) Replace feed/title with dc:title.
9. Ditto entry/title
10. (#5) Replace feed/modified with dcterms:modified.
11. Ditto entry/modified
12. (#5) Replace feed/tagline with dc:description.
13. Ditto entry/summary.
14. (#5) Replace author/name with dc:creator
15. Ditto contributor/name
16. (#5) Replace entry/issued with dcterms:issued.
17. Replace entry/created with dcterms:created.
18. (#6) Wrap contributor in an rdf:Bag, because RDF can not handle an element having multiple identical children without putting them in a bag. According to the article, "There's no limit to the level of nesting, as long as even-numbered elements in the line of descendants are resources and odd-numbered resources are predicates."
19. (#6) Wrap entry in an rdf:Seq, to make up for the fact that RDF has no implicit ordering.
20. (#6) Wrap the content elements within a multipart/alternative in an rdf:Alt.
21. (#7) Give up on the inline content model, which currently allows for well-formed "mixed content" embedded directly within a content element. Aaron's example declares content as rdf:parseType="Literal", which means the RDF parser will return the value as a single string (as if you were using mode="escaped" now in the current syntax). Since the content is escaped and not treated as XML, there's no way to do the equivalent of XQuery-ing your way into getting a list of all the outgoing links in your entries (citing Udell's example).
22. Mandate that all extension modules follow these same rules.
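To give a feel for the size of the renaming items in the list above (the #5 entries), here is a sketch that rewrites element names to Dublin Core. The DC and DCTERMS namespace URIs are the real ones; the Atom namespace is a placeholder and `rename_to_dublin_core` is a hypothetical helper. It covers only the renames, not the id, link, or container items.

```python
import xml.etree.ElementTree as ET

ATOM = "http://example.org/atom#"  # placeholder namespace (assumption)
DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"

# Local element name in the feed -> qualified Dublin Core name, per the list above.
RENAMES = {
    "title": f"{{{DC}}}title",
    "tagline": f"{{{DC}}}description",
    "summary": f"{{{DC}}}description",
    "name": f"{{{DC}}}creator",
    "modified": f"{{{DCTERMS}}}modified",
    "issued": f"{{{DCTERMS}}}issued",
    "created": f"{{{DCTERMS}}}created",
}

def rename_to_dublin_core(feed_xml):
    """Rename the listed feed elements to their Dublin Core equivalents."""
    root = ET.fromstring(feed_xml)
    prefix = f"{{{ATOM}}}"
    for el in root.iter():
        if el.tag.startswith(prefix):
            local = el.tag[len(prefix):]
            if local in RENAMES:
                el.tag = RENAMES[local]
    return root

sample = (
    f'<feed xmlns="{ATOM}">'
    "<title>Demo</title><tagline>Just a test</tagline>"
    "<entry><title>First</title><issued>2003-07-01T12:00:00Z</issued>"
    "<author><name>A. Writer</name></author></entry>"
    "</feed>"
)
dc_feed = rename_to_dublin_core(sample)
```

Whether the renaming should live in the format itself or in a step like this is exactly what the rest of the thread argues about.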
No doubt I have missed something, misunderstood something, used the wrong terminology somewhere, or misinterpreted the article completely.
Tim Bray said:
"And if you're going to do some dumbed-down subset instead of full RDF, you might as well just publish a standardized transformation from a nice idiomatic Atom feed to real
full first-class RDF."
I tend to agree.
Dare wrote:
"Sam, Before folks get carried away with trying to second guess what the "RDF tax" actually is it'd be nice of someone familiar with RDF (hey Shelley!!!) could explain what the requirements of RDF-like XML has to be."
I did write on this. It's called "Practical RDF".
Actually, there are misunderstandings in Mark's interpretation of what would need to change to be RDF/XML.
True, you have to use namespaces, but they don't have to be indicated -- you can use a default namespace for the primary Pie/Echo/Atom elements. The use of the namespace for elements from other vocabularies is there to prevent name collision, and would have to be used if you supported namespacing anyway -- so, I don't think this is a RDF/XML 'tax' or burden.
As for the version attribute -- well, I imagine you could find a way to keep this, it's just that some of us wonder why you'd want to do this. Different strokes, different folks.
Also, you can have repeating properties in RDF/XML -- you don't have to put them in a container, such as a Bag or a Seq or Alt. You do if you want to imply order in the sequence of elements, but I've never been fond of this concept and rarely use containers. I prefer my ordering coming in via the data -- such as the date or title. I use repeating properties all the time and my RDF/XML is valid.
You don't have to use DC elements in your vocabulary. The reason why people want to do this is the ability to combine data and perform useful extrapolations.
But if you don't want to do this combination of data from say, FOAF and Pie/Echo/Atom and trackback and threadsml, and making use of existing vocabularies and being able to extend the functionality beyond being a simple feed -- you don't have to. There's nothing in RDF that says, you must use DC. And since you're using XML DC for subject, ie. dc:subject for the non-RDF XML version, not sure about this seeming concern?
Also, you don't have to use rdf:ID -- you could use rdf:about, or you could continue with what you have right now in this prototype, which is the ID property (element). Using rdf:about again makes it easier to combine data, but if you all don't care about combining Pie/Echo/Atom with other vocabularies, and building a rich set of data, you don't have to.
Mark,
After a good night's sleep I also had some concerns on what making ATOM RDF-friendly would mean to embedded XHTML markup in <code>content</code> elements. Doesn't this mean that all embedded HTML markup would also have to be RDF-friendly?
RSS 1.0 didn't have this problem since there was no means of embedding XHTML just escaped HTML as character data.
re: "you don't have to put them in a container, such as a Bag or a Seq or Alt [unless] you want to imply order in the sequence of elements"
In the current snapshot, the order of content elements within a multipart is significant. The publisher puts the content alternatives in order of increasing fidelity. This is valuable information that can't be guessed. (For example, the alternatives might be the same content in different languages, with the original listed last.)
So you need to retain that order in RDF. There are at least 2 ways of doing that, one requires an extra container, one requires an extra element or attribute. I don't care which way you want to do it, but you can't simply do nothing.
I would also argue (and I know this is controversial) that the order of the entries could be significant to some applications. I know dates are required and most desktop news aggregators sort by date anyway and publishers usually add newest items to the top of the feed and all that. The point is not any particular use case; the point is that there is ordering information implicit in the XML (order-as-published) which you will lose if you treat the input as RDF. So again, either add a container, or add a Seq, or simply mandate that order-as-published doesn't matter and hope you're right.
Come to think of it, contributors may have the same problem, since the publisher may have put contributors in a specific order on purpose.
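A sketch of the "extra attribute" route, assuming a multipart is written as a content element whose children are the alternative content elements. The `position` attribute and both namespace URIs are invented for illustration; nothing like this is in the current snapshot.

```python
import xml.etree.ElementTree as ET

ATOM = "http://example.org/atom#"  # placeholder namespace (assumption)
EX = "http://example.org/order#"   # hypothetical extension namespace

def number_alternatives(feed_xml):
    """Stamp each alternative inside a multipart with its as-published position."""
    root = ET.fromstring(feed_xml)
    for multipart in root.iter(f"{{{ATOM}}}content"):
        # Only direct children count as alternatives of this multipart.
        parts = multipart.findall(f"{{{ATOM}}}content")
        for position, part in enumerate(parts, start=1):
            part.set(f"{{{EX}}}position", str(position))
    return root

sample = (
    f'<feed xmlns="{ATOM}"><entry>'
    '<content type="multipart/alternative">'
    '<content type="text/plain">plain version</content>'
    '<content type="text/html">richer version</content>'
    "</content>"
    "</entry></feed>"
)
numbered = number_alternatives(sample)
positions = [
    part.get(f"{{{EX}}}position")
    for part in numbered.iter(f"{{{ATOM}}}content")
    if part.get("type") != "multipart/alternative"
]
```

Once the position is carried as data, it survives a trip through a model that does not keep document order. The same trick would apply to entries or contributors if their order turns out to matter.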
re: "Using rdf:about again makes it easier to combine data, but if you all don't care about combining Pie/Echo/Atom with other vocabularies"
Shelley, you can't have it both ways. If we're talking about mandating that Atom is RDF, the only benefit I've ever heard from anyone about why we would want to force that on everyone is that you can combine RDF documents with other vocabularies. Was there some other benefit I missed? It's wickedly difficult to parse unless you have perfect tools; it has a steep learning curve; RDF/XML syntax is so complex it requires a whole book to explain, above and beyond the complex syntax rules of XML; and so forth. There are lots of downsides. And the big big big upside is that it's supposed to make it easy to combine documents from different vocabularies. So of course we'd be interested in doing that.
And, once you admit that that's the only reason for going RDF, the next logical question becomes: "why are you reinventing all this stuff in the Atom namespace when you could be directly reusing existing ontologies?" After all, redefining semantics that already exist elsewhere is like inventing your own spoken language -- everyone who wants to talk to you needs to figure out the mapping. So either we'd need to change everything in Atom to use existing ontologies, or we'd need to also maintain a semantic mapping between how we express things in Atom and how the rest of the world expresses things in existing ontologies (using DAML? or OWL? I've heard those acronyms bandied about in this context but I've never looked very hard at them myself -- but everyone I talk to who has looked at them says they're even more complex than RDF, but that's hearsay so take it with a grain of salt).
Semantics isn't syntax. If you want Atom to be semantically relevant, and you want to use RDF as the means to achieve that goal, then we need to talk about all of what's involved in making that happen. Slapping an rdf:RDF element around everything and saying "ooh, look at pretty graphs" isn't enough.
Shelley/others: can you produce a "maximally RDF friendly" document that corresponds to Mark's maximal feed? The hope is that by seeing what a maximally friendly feed would be, we can do the appropriate cost/benefit analysis and hopefully find some place to meet in the middle.
Dare: rdf:parseType="Literal" allows arbitrary XML to be treated as a property.
Mark (Pilgrim|Nottingham): todo item for the spec... identify all cases where order is significant. This should not be left ambiguous.
Good questions/responses.
True, if the physical order of the elements has significance, then within RDF/XML you do need to use a container. The reason I don't care for this is, being a data person, I believe that ordering should come from the data, not the physical construction of the data file. But that tends to be a left over from my relational database work.
True also -- most people such as myself look at the use of RDF primarily because we want to combine data and do things above and beyond what the data was originally intended. And if this is true, then it makes more sense to reuse data elements than use all new ones. Very true.
Dare -- XHTML in a feed. Hmm. Sam, did you have problems with the XHTML in the sample feed I created for you? I know that it validated in RDF/XML, but I think the problems we ran into were other elements and I wasn't following the Pie/Echo/Atom guidelines with some of the data.
Also, I believe you used non-RDF specific techniques with this validator, didn't you? Did you have problems with the feed? I know I've used the same XML processing with RSS 1.0 as I've used with RSS 2.0 and with RSS 0.9x, but I'm wondering if this still holds true with the new Pie/Echo/Atom stuff.
Namespaces: I don't claim to have a full understanding of DAML/OWL; but as I understand it (those who know better, correct me if I'm wrong) there is no either/or choice between utilizing existing vocabularies like DC and having a clean, prefix-free document. Let all the standard Atom elements be defined in a single atom namespace, let atom be set as the default namespace in the DTD, and have a reference in the DTD to an OWL ontology which maps atom-namespace elements to Dublin Core, etc. Those who don't want to think about obscure things like RDF and OWL don't have to - only the semantic web freaks will have to pay the "OWL Tax". Does anyone (Shelley?) know any reason this would not be feasible?
Sequences: Leaving aside the issue of RDF, significant ordering gives me the heebie-jeebies as a developer, and I hope that doesn't make it into the final standard. If I'm writing an aggregator it does me no good to know that the content elements are in "increasing fidelity" order - I don't care what order they are in, I want to know where to find an element of a specific fidelity or format. I don't care that the Elvish translation is "lower fidelity" than the Klingon; but I need to be able to specify which language I want. The same applies to levels of detail, levels of markup, etc.
Inline Content: I'm curious whether there's a way around this in RDF, but in any case it doesn't matter. Mark, why are you going to be parsing Atom with an RDF parser? Why don't you just parse it with an ordinary XML parser, preserving inline content as a node tree, and leave the annoyance of dealing with content-as-flat-string to those weirdos like me who want to parse it as RDF?
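As a quick illustration of the ordinary-XML-parser route, here is a sketch that pulls the outgoing links out of inline XHTML content. The Atom namespace URI is a placeholder and `outgoing_links` is a made-up helper; it assumes the inline markup is in the XHTML namespace and sits directly inside each entry's content element.

```python
import xml.etree.ElementTree as ET

ATOM = "http://example.org/atom#"  # placeholder namespace (assumption)
XHTML = "http://www.w3.org/1999/xhtml"

def outgoing_links(feed_xml):
    """List the href of every XHTML <a> found inside each entry's content."""
    root = ET.fromstring(feed_xml)
    links = []
    for content in root.findall(f".//{{{ATOM}}}entry/{{{ATOM}}}content"):
        for anchor in content.iter(f"{{{XHTML}}}a"):
            href = anchor.get("href")
            if href:
                links.append(href)
    return links

sample = (
    f'<feed xmlns="{ATOM}"><entry><content>'
    f'<p xmlns="{XHTML}">See <a href="http://example.org/a">this</a> '
    'and <a href="http://example.org/b">that</a>.</p>'
    "</content></entry></feed>"
)
links = outgoing_links(sample)
```

Because the content is still a node tree here, no unescaping or second parse is needed, which is the difference from the flat-string view an RDF parser would hand back.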
Taxation: There seems to be a lot of strawman-kicking going on as far as what's "necessary" to write/parse RDF/XML. I don't think anyone here is arguing that Atom should require an RDF library to parse or emit, or knowledge of RDF to read or write. What, exactly, do you need to know about RDF to understand/parse/emit the examples linked at the top of this page? As far as I can tell, nothing. This isn't about forcing everyone to grok RDF; it's about making a few small changes so that RDF-heads can source Atom feeds directly.
The biggest assumption I see here is that the data model as RDF sees it has to be the data model as everyone sees it. As I understand it from the Wiki, it was already decided that Atom was to be defined by its XML serialization, not by its abstract data model. So why the focus on a single data model? Let RDF users worry about element order, inline content, and ontology mapping. Nobody's asking you to change how you look at the data.
Avdi writes:
As I understand it from the Wiki, it was already decided that Atom was to be defined by its XML serialization, not by its abstract data model. So why the focus on a single data model? Let RDF users worry about element order, inline content, and ontology mapping. Nobody's asking you to change how you look at the data.
If RDF folks want to use Atom information, there are two choices:
1) They need to convert Atom markup into an RDF-processable form, probably using XSLT.
2) Atom needs to be stored directly in an RDF-processable form.
Choice (1) makes no impact on the XML serialization. Choice (2) has a serious impact on the XML serialization.
Choice (1) would definitely "let RDF users worry about element order, inline content, and ontology mapping," but it's not at all clear that Choice (1) is what we're discussing here.
With regard to choice 2, Dare noted this article on making XML RDF-friendly:
http://www.xml.com/pub/a/2002/10/30/rdf-friendly.html
And Mark's already outlined the impact of that above on Atom. It's significant. Making Atom RDF-friendly - the Choice 2 route - has a definite impact on the XML serialization even if you don't ask people "to change how you look at the data."
Simon & Dare: Yes, I've been following the conversation. The points I was making were in direct response to the earlier comments you directed me to. In short: I have read what is supposedly necessary to make Atom's serialization RDF-friendly. Everything I wrote was to point out that no, all those changes are not necessary. This thread started because someone pointed out that the vast majority of those changes are either unnecessary or trivial. Since then, Mark and others have brought up a list of "necessary changes" as if they are somehow new information. By and large they are not, and have already been addressed. I was addressing a few of the points that didn't seem to be fully dismissed by the preceding comments.
Just because someone says "change X is necessary" doesn't make it so. If that were the case, and all the changes Mark lists were necessary to make Atom "RDF friendly", then minAtom.rdf and maxAtom.rdf above either wouldn't be valid RDF or wouldn't be as readable as they are.
Well, there's a difference between "valid" and "friendly". Shelley has already stated (in this thread) that it would be better if Atom-as-RDF used existing ontologies. Otherwise why really bother with RDF? If you're going to define an RDF ontology that redefines a bunch of semantics defined elsewhere, anyone who reasonably wants to use it as RDF will have to use a transformation/mapping/DAML/OWL/whatever to understand the semantics anyway. So what does forced RDF syntax buy you? You'd do just as well to keep Atom as XML and use XSLT to transform it directly into "perfect RDF".
Simon is right: Atom should either be directly usable as RDF (and easily combinable with other documents and easily understood at a semantic level), or it should just stay XML (and someone can produce a normative XSLT transform to make it into perfect RDF). Aaron's example is neither. All of this "hey look, I put an rdf:RDF element around everything and now it's RDF, look at the pretty graphs" is interesting, in a hackish sort of way, but it doesn't really buy you anything, because all you've done is create your own private language with its own semantics.
Mark: It's simply the difference between a)
1) read this RDF document (which happens to be an ontology) into the RDF store.
2) read this document (which happens to be an RDF-parseable Atom document) into the RDF store.
3) Do app-specific magic...
and b)
1) read this ontology RDF into the RDF store
2) Read this Atom XML
3) Transform the Atom XML into RDF
4) Read the Atom RDF into the RDF store
5) Do app-specific magic
Why is the extra step a big deal? Because (a) can conceivably be done by a non-programmer using some as-yet-uninvented whiz-bang tool which works with generic RDF input. Whereas (b) almost certainly requires a programmer. I happen to be of the belief that that kind of RDF tool will come into increasingly common use in the next few years.
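The two workflows can be sketched with a toy store, under loud assumptions: the "RDF store" here is just a Python set of tuples, and the vocabulary URI, the sample feed, and the `load` and `atom_to_triples` helpers are all hypothetical. The point is only where the Atom-specific code sits, not how a real RDF library behaves.

```python
import xml.etree.ElementTree as ET

# Toy stand-in for an RDF store: a set of (subject, predicate, object) tuples.
def load(store, triples):
    store.update(triples)

# Hypothetical vocabulary URI and feed, made up for illustration.
ATOM = "http://example.org/atom#"
ONTOLOGY = {(ATOM + "title", "rdfs:label", "title")}
ATOM_XML = "<feed><entry><id>urn:example:1</id><title>Hello</title></entry></feed>"

def atom_to_triples(xml_text):
    """Workflow (b), steps 2-3: parse plain Atom XML and map each entry to triples."""
    triples = set()
    for entry in ET.fromstring(xml_text).findall("entry"):
        subject = entry.findtext("id")
        triples.add((subject, ATOM + "title", entry.findtext("title")))
    return triples

# Workflow (a): the feed is assumed to arrive already parseable as RDF,
# so a generic loader takes it as-is (simulated here with ready-made triples).
store_a = set()
load(store_a, ONTOLOGY)
load(store_a, {("urn:example:1", ATOM + "title", "Hello")})

# Workflow (b): the same data needs an Atom-specific transform before loading.
store_b = set()
load(store_b, ONTOLOGY)
load(store_b, atom_to_triples(ATOM_XML))
```

Both stores end up with the same statements; the difference is that (b) needs someone to write and maintain `atom_to_triples`.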
Simon:
I believe it because Aaron's examples and the subsequent discussion of the remaining warts in those examples have shown it to be true. I don't know how much clearer I can get.
Avdi: (a) is missing a step between 2 and 3:
2.5) Transform the ontologies used in Atom to the ontologies used in the rest of your RDF store.
Avdi,
What bothers me the most about these RDF arguments is that pro-RDF folks like you, Shelley and Danny are talking about complicating the workload of folks working with Pie/Echo/Atom/Whatever for unknown future gains. It's one thing to design for the future and another thing to design for a future without a goal in sight.
As far as I'm concerned there is no difference between your (a) and (b) because they can both be automated by an "as-yet-uninvented whiz-bang" tool. Sam already has an XSLT that turns Atom into RDF. So there is no difference between (a) and (b) from an end user perspective. Anyone can write a script, web service, or pretty GUI that does transformations of Atom to RDF without the user having to know diddly about XML, XSLT or RDF.
Well, it's more than obvious by now that between the likes of Mark, Dare, and Simon, more noise can be generated against RDF than can be quieted by even the most active pro-RDF voices.
What I would suggest is this: As appeared to be the popular (perhaps even the majority) viewpoint in the early days of the wiki, let Atom be pure XML.
Just let it go.
(And yes, I'm saying this as someone who is 100% behind RDF.)
Go back to the wiki, resurrect the pages related to creating a single normative set of transformations to and from "Atom-RDF", complete with reuse of other ontologies where possible (e.g. DC).
Then let the two ("Atom-XML" as it were vs. "Atom-RDF") duke it out in the community.
If you (the pro-RDFers) are as confident as I am that RDF offers serious benefits you should be willing in this case to step back from trying in vain to quiet the anti-RDF zealotry and allow the community the time to give them a good kick in the foaf.
Let the ontologies prove them wrong. Starting with foaf, most likely. Let the usage of Atom-RDF grow and grow and grow until the current anti-RDF zealots look positively silly.
(RSS 1.0 was too early in this regard, but ontologies that can show real benefit have finally started to appear of late, so now's the chance to really get things going)
If it doesn't happen, it doesn't happen, but if you (the pro-RDFers) are as confident in RDF as you are, and if these discussions are as wasteful of everyone's time as they obviously are, then show your confidence by allowing them this battle. The war (if I dare call it that) is far from over.
As a final note to those that think that RDF is "hard": Some things in this world are "easy". Some things are "hard". Some things which are "easy" may in fact only be "hard" for you.
Most critically, not all things which are important are "easy". Or "easy" for you.
The internet is full of plumbing which is considered "hard" by the vast majority of the population, and while those to which it is "hard" likely didn't understand it at the time of its development (and most likely still don't), that plumbing was and is important.
To use an example which is quite common but almost always incomplete and therefore misused: The HTML ViewSourceClan.
Many people argue that one of the main benefits of HTML is the ability to learn it via View Source. They highlight the fact that there are two groups of people. Those who learn from specifications and those who learn from View Source. While it is important to balance the needs of both groups in the way that best fits each situation, the example is missing a critical piece.
There is a third group.
Those who find BOTH specification AND View Source "hard".
This group of people vastly outnumbers those who learn and use HTML via either specification or View Source, to the point where specification-learners and "View Source"-learners fall into a percentile of the population that is hardly even worth mentioning. Standard error. Noise. Nothing.
Had those people (who now quite capably create and edit HTML emails, web pages, and the like via a whole host of tools) complained about HTML during its early days, had those people railed against it, calling it hard, calling it useless, had those people won: we wouldn't be here now.
We wouldn't have the very tools with which I am typing this entry.
We wouldn't have had a whole class of other tools that are now only a click away.
Let go of your ego.
Not everything you come across in your life will be instantly clear to you. Some things not clear to you now may only be clear to you some time in the future. Many things will never be clear to you at all. I'm not here to draw that line, however.
The RDF Working Group made RDF unnecessarily difficult to learn in earlier days due largely to the early versions of their documentation, and while that situation is an unfortunate one the working group has made definite progress. However, such unnecessary difficulty and subsequent improvements do not mean that RDF, in the end, will be "easy". Nor will it mean that because it's not "easy" it cannot be important.
Just because you didn't understand it initially doesn't mean that it isn't important. Or of value. Or that given a fresh look you still wouldn't understand it now.
I suggest you give it an honest try.
In the meantime, we'll let you win your little battle.
You can go off and ignore the prior art in existing ontologies. Ignore the abstract data model that is consistent, generalized, and flexible enough that it can be used to represent almost any information, all while freeing you from concerns regarding persistence and serialization for transport.
We'll work on Atom-RDF while you progress toward your eventual epiphany on RDF.
And then we'll welcome you in with open arms. And perhaps the occasional joke about what took you so long. And you'll laugh too.
I did. After my epiphany in early 2001.
Yeah, didn't you know? "As-yet-uninvented whiz-bang tools" can do anything. ;)
I make an analogy to web standards, HTML and XHTML and CSS and so forth. I used to make similar arguments for web standards to the ones I see here for RDF: "do things this weird new way now, and you'll see indescribable benefits down the road." Forward compatibility, future browsers, cell phone browsers, all that stuff! Well, I swallowed it, and of course it was crap. Safari came out and handled table-based layouts perfectly and CSS like crap. Hiptop came out and handled table-based layouts perfectly and CSS like crap. Newer cell phones run Opera which handles everything extremely well, tag soup and all.
So I switched my argument to "it makes my own life easier". Which is still true. Validation makes my own debugging easier -- the validator can catch stupid mistakes like typos in attributes or missing end tags, but only if you have so few other errors that you can spot your mistakes in the error report. CSS makes my own development easier and means I can do more with less -- sure I could do dynamic style switching with a complicated server-side tool, but with CSS I can do it with a hand-coded stylesheet and a few lines of Javascript. Yeah, accessibility is a good idea, but it also overlaps a lot with search engine optimization, which directly benefits me (and my wallet, now that I'm running ads in my archive pages). And so forth.
I would implore the RDF advocates in this discussion to stop saying "suffer now, and you'll be showered with untold riches later", because no one here believes you. We've all been burned by that schtick before, not necessarily with RDF specifically, maybe something else, but we all certainly recognize the pattern.
You can't use the validator argument, because we already have an app-specific validator for Atom (I helped write it). I don't see how you can use the "makes my own development easier", but you're welcome to give that argument a try. Or try something else; we really are open to a good persuasive argument. But the "untold riches" argument is getting old.
"suffer now, and you'll be showered with untold riches later"
I presume that this is not a direct quote.
Suffer? There must be something that triggered this level of visceral response. Can you point at something in minAtom.rdf or maxAtom.rdf which led you to use this particular word in this context?
Mark:
"So I switched my argument to "it makes my own life easier"."
"I don't see how you can use the "makes my own development easier" [argument]"
Thanks for making your view so clear. It's nice to know that the things that make development easier for other people are only valid if they also make development easier for you.
As for untold riches, I haven't yet seen anyone make such a promise at any point during this thread. If words like "inferencing" threaten you and read as "untold riches" then there's not much we can really do to help until you back off your defensive position and open your mind.
We'd like to help, but you won't let us. Let me know when you actually want to learn. Until then I'm going to refrain from making an effort that is surely wasted.
Just to point out, that while nice, there's really no need for Atom to reuse existing ontologies directly.
As long as the semantics of various classes and properties are identical to others, this can be stated formally using RDFS/OWL.
In short, there could well be an RDF version of Atom that didn't use a single extra vocabulary, and it could still be put to great use.
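As a rough illustration of what such a formal statement buys you, here is a toy sketch. The Atom property URI is hypothetical, the store is a plain set of tuples, and `apply_equivalences` is a single-pass stand-in for what an OWL-aware tool would do (it does not compute a full closure).

```python
OWL_EQUIVALENT_PROPERTY = "http://www.w3.org/2002/07/owl#equivalentProperty"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
ATOM_TITLE = "http://example.org/atom#title"  # hypothetical URI, for illustration only

triples = {
    # The mapping itself is just another statement in the store.
    (ATOM_TITLE, OWL_EQUIVALENT_PROPERTY, DC_TITLE),
    ("urn:example:1", ATOM_TITLE, "Hello"),
}

def apply_equivalences(triples):
    """Add a copy of each statement under every property declared equivalent to its own."""
    equivalents = {}
    for s, p, o in triples:
        if p == OWL_EQUIVALENT_PROPERTY:
            # Equivalence is symmetric, so record it in both directions.
            equivalents.setdefault(s, set()).add(o)
            equivalents.setdefault(o, set()).add(s)
    inferred = set(triples)
    for s, p, o in triples:
        for other in equivalents.get(p, ()):
            inferred.add((s, other, o))
    return inferred

expanded = apply_equivalences(triples)
```

After the expansion, a consumer that only knows Dublin Core can find the title without ever having heard of the Atom vocabulary.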
Jeremy -
Relax. I've already suggested that it's a perfectly good thing for RDF folks to have a transformation of Atom into RDF that suits their purposes. While I have some serious problems with RDF, I also recognize that it isn't just noise.
So long as the RDF folks can get their benefits without imposing costs on XML folks, I have no problem with RDF. When RDF folks start trying to impose their notions of what looks good syntactically, I have lots of problems with RDF. I don't think that's a matter of generating noise.
As far as "let go of your ego", I thought this was a technical discussion. My doubts about your seriousness in suggesting that are painfully reinforced by this:
We'd like to help, but you won't let us. Let me know when you actually want to learn.
Thanks for knowing so much that the rest of us are merely blocked on. Right. Maybe it's the past trauma I've had from these same trenches that is blocking me?
As for what the costs look like, I ask that Avdi please look past Aaron's original demo and read up on the rest of what's involved. It's not really a one-liner.