Mark,
I assume he means we can post comments to his blog using that SOAP format. Will you also be supporting this?
Sam,
RSS Bandit will support this once I get back from FL unless there has been some change by the time I get back.
PS: Did y'all ever agree on what the item in the RSS feed that indicates where to post comments should be?
Dare, you are assuming a verb. You probably are also assuming a transfer protocol. And you are probably assuming a particular message exchange pattern.
This should probably be online by the time you get back from FL.
Dare: suppose the message exchange pattern is this: people who are registered can subscribe to my blog and I will POST updates via HTTP to the location of their choice?
They can treat this as a ping. They can aggregate it. They can route it to their in box. They can archive it. They can validate it.
Or perhaps I should also support SMTP. That's easy to do.
As you point out, I should probably also accept datagrams using the same format and interpret them as pingbacks or trackbacks or comments or...
As to the RSS element, I was hoping to use Joe's wfw:comment, but first I have a giant hamster to deal with...
Ted:
1) the namespace is declared in the document, thus: xmlns:content="http://purl.org/rss/1.0/modules/content/". By any chance are you using Mozilla 1.3? If so, view source to see the namespaces.
2) description is meant to be an abstract or excerpt. Descriptions should not contain html. content:encoded is intended to be the full content. Mark's example above illustrates this well.
Sure Sam you can use wfw:comment, on two conditions:
1. Remove the SOAP envelope.
2. Take the remaining 'item' out of the RSS 1.0 namespace.
Then it will taste right yummy.
What should I return if I don't feel like giving you the SOAP document you requested?
An HTTP error code? Which one?
A SOAP fault? What would it look like?
Mark: if doing SOAP over HTTP, the answer is yes to both. Take a look at this template.
To help decipher this: HTTP status is 500 on failures. <code> is meant for programs and is namespace qualified. <string> is meant for humans and is simple text. <detail> is whatever you want, as long as all child elements are namespace qualified.
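For concreteness, here's a minimal sketch (in Python, using the standard library's ElementTree) of building a fault along those lines. The element names follow SOAP 1.1 (faultcode/faultstring/detail); the application namespace and the reason element are hypothetical, not part of any spec or of the template mentioned above.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

def build_fault(code, string, detail_child=None):
    """Build a minimal SOAP 1.1 fault: faultcode for programs (a qualified
    name), faultstring for humans, detail for namespace-qualified app data."""
    ET.register_namespace("soap", SOAP_ENV)
    env = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(env, f"{{{SOAP_ENV}}}Body")
    fault = ET.SubElement(body, f"{{{SOAP_ENV}}}Fault")
    ET.SubElement(fault, "faultcode").text = code
    ET.SubElement(fault, "faultstring").text = string
    if detail_child is not None:
        ET.SubElement(fault, "detail").append(detail_child)
    return ET.tostring(env, encoding="unicode")

# Hypothetical application namespace, purely for illustration:
reason = ET.Element("{http://example.org/comments}reason")
reason.text = "Comments are closed on this item."
print(build_fault("soap:Client", "Comment rejected", reason))
```

Serve that body with an HTTP 500 and both humans and programs have something to work with.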
Just for reference, I note the W3C RDF validator doesn't see any of the RDF within that instance, possibly negating part of the benefit of using RSS/RDF items over non-RDF RSS.
To answer the obvious question arising from that ;-), yes, adding an rdf:RDF parent element to the item makes the RDF visible.
First, to verify an assumption: that the purpose of using an RSS/RDF <item> element isn't solely because it has a defined namespace. That is to say, "ya, if it's RDF compatible too, that's a bonus."
RDF is designed to allow embedding in other document types. Just put in an <rdf:RDF> element, and everything within is parsed as RDF.
An <rdf:RDF> element does for RDF readers what an encodingStyle="http://schemas.xmlsoap.org/soap/encoding" does for SOAP readers.
Ken, I will confirm that I did this because I wanted a defined namespace. That being said, it looks like this document is not valid RDF, so this would be an inappropriate use of that namespace.
I've tried adding an <rdf:RDF> element in various places and have yet to produce meaningful results. I said I was RDF illiterate; you will have to spell this out more clearly for me.
Your analogy doesn't work for me. I see no need for soap encoding.
http://bitsko.slc.ut.us/~ken/1290.soap
Making <item> the child of an <rdf:RDF> element makes this instance valid RDF.
What I meant by comparison is that RDF, itself, is an encoding. Using <rdf:RDF> signals that to an RDF reader.
Keep prodding me until you get the answer you need, please.
OK, Ken. I got it now.
Now to turn this question around: if this were RDF compatible, what additional applications would it enable?
Sam, in retrospect, the answer is: none -- merely by adding an rdf:RDF element.
RDF only works if the people creating it recognize that there is an underlying data model. In this context, even though the namespace is being borrowed from RSS/RDF, the proposal is one-step removed from any concern for an underlying data model.
The contexts where applications are enabled by RDF are those where it makes sense to link basic data structures (like sequences, mappings, and values) across sites and across the web using global identifiers (URIs) -- but with an emphasis on letting a library resolve references so each application doesn't have to.
Thanks Mark for breaking the ice on the Semantic Web. I'll be certain to sleep well tonight knowing I don't need to mention it.
From the RDF spec at http://www.w3c.org/RDF:
"The RDF element is optional if the content can be known to be RDF from the application context."
http://www.w3.org/RDF/Validator/ validates the message as ok.
Phil and Drew, attempting to parse the entire instance (including the SOAP envelope) as RDF results in bogus data in the model. There is a case where an application could, knowing that the child of <soap:Body> was an <rss:item>, pass that context to an RDF parser to extract the RDF.
I think the question here is more along the lines of whether simply adding an <rdf:RDF> element benefits clients, robots, or auto-discoverers without being application specific.
Sam, I think you already stated clearly the immediate value of adding the <rdf:RDF> element: it would be inappropriate to use the RSS/RDF namespace if the content weren't readily parsable as RDF.
The potential value requires 1) knowing where RDF is immediately practical (no semantic hand waving), and 2) ubiquity.
As I mentioned earlier, RDF is an encoding style logically equivalent to SOAP encoding (that is, one can serialize objects to and from that encoding without the use of external schemas or going thru a DOM to access the information). One additional feature that RDF has is that one can make loosely-coupled references to external objects. That is to say, at one point a reference (pointer) can be dangling and uninstantiated, and later become resolved and instantiated. This is, in effect, what RDF crawlers do.
Where does that become practical in weblogs? Increasingly, in comments, RSS, trackbacks, and the like. Current practice has each client or aggregator author parsing XML and extracting links, following links, ad infinitum. Each application has to know where to find those links and has to parse and retain each XML instance separately. RDF libraries, on the other hand, do all of the grunt work, provide a consistent API over the whole data set, and lazily instantiate linked data as needed.
Maybe we should have a thought-experiment: suppose RDF is persona non grata, yet we wanted to factor the reading, linking, and traversing of all multiple-weblog information into a single library and API. What would it look like? One solution springs to mind: create a single local XML instance into which all information is merged, then use XPath to provide both context and linked information ("from item 1090 of Intertwingly give me all trackbacks") and lazy instantiation ("merge items and comments from those trackbacks now").
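A toy version of that thought-experiment, with invented element names and Python's standard ElementTree standing in for the single-library API:

```python
import xml.etree.ElementTree as ET

# One local XML instance into which everything fetched is merged.
store = ET.Element("store")

def merge(doc_xml):
    """Append a fetched weblog document (item, trackback, comment) to the store."""
    store.append(ET.fromstring(doc_xml))

merge("""<item id="1090" site="intertwingly">
           <title>Sam (I am)</title>
         </item>""")
merge("""<trackback about="1090" site="bitworking">
           <title>Well Formed Web</title>
         </trackback>""")

# "From item 1090 of Intertwingly, give me all trackbacks":
trackbacks = store.findall(".//trackback[@about='1090']")
print([tb.findtext("title") for tb in trackbacks])  # ['Well Formed Web']
```

The RDF counterpart would get the merging, the cross-document links, and the lazy fetch-on-dereference behavior from the library instead of from hand-written code like this.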
"but first I have a giant hamster to deal with..."
I am large. I am fuzzy. I contain multitudes. :)
I think you can keep RDF parsers happy without breaking the structure of the RSS item that Sam presented here.
The RDF document (the one with a root element of rdf:RDF) can be external and reference each item through the use of rdf:resource. The RDF document for a site would consist of a container and a list of items for the site. Each item would point at the original item, wherever it's located, using the rdf:resource attribute for that item. Good RDF parsers should be fine with such a structure. Here's a rough (untested) example just to show the idea.
<rdf:RDF>
<rdf:Seq rdf:ID="foo">
<rdf:li rdf:resource="http://www.intertwingly.net/blog/1290.soap" />
</rdf:Seq>
</rdf:RDF>
Building the RDF index this way provides a sufficient amount of flexibility. It will also make it easier to transition RSS 2.0 items, which is really the de facto standard at this point.
Most importantly, the rss item that Sam suggests has a namespace declaration.
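For what it's worth, an index like that can be consumed even without a full RDF library. Here's a rough Python/ElementTree sketch that pulls the URLs out of the Seq (written with rdf:resource rather than a bare resource attribute, which is what RDF/XML expects):

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

index = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Seq rdf:ID="foo">
    <rdf:li rdf:resource="http://www.intertwingly.net/blog/1290.soap" />
  </rdf:Seq>
</rdf:RDF>"""

root = ET.fromstring(index)
# Each rdf:li points at an item wherever it lives, via rdf:resource.
urls = [li.get(f"{{{RDF}}}resource")
        for li in root.findall(f".//{{{RDF}}}Seq/{{{RDF}}}li")]
print(urls)  # ['http://www.intertwingly.net/blog/1290.soap']
```

A real RDF library would additionally know that those URLs are references it can resolve and merge into the model on demand.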
Drew, from the perspective of building an "RDF Site" what you describe makes sense, but there's a small error in the mechanics that is specifically related to the issue we're talking about here: does the format of the item benefit if it can be read as RDF?
In the example you provide, there's no way for the RDF library to know that (or where) 1290.soap contains RDF. When an RDF parser parses all of 1290.soap "as if it were RDF", bogus data is read, as indicated by the validator. It's possible, given more extensive use of header properties, that it may choke an RDF parser altogether. That's where <rdf:RDF> comes in (and it doesn't have to be at the root, as I show in my example above): it tells the parser where the RDF is.
Where's the simplicity?
Umm... all of this SOAP and RDF stuff just seems like a lot of additional layers for... what? Joe's CommentAPI is simple and clean. Is there actually a need for all this other stuff?
For those who are RDF/SOAP gurus, I apologize if I offend. I am a near illiterate in both technologies...
Seairth, one of the major benefits of RDF comes at the next level up, when you read in the XML and convert it into objects your program can use. With an RDF library, reading the XML and creating objects (data structures) is automatic, regardless of the RDF you're reading.
Since Joe's "Well Formed Web" and RESTLog makes good use of "plain" XML, I went looking for an example of what it takes to work with "plain" XML. I got a bit lost and couldn't find the kind of example I was looking for (maybe Joe can work with me to find one), but I did find one instance in Aggie.
In RssUtils, "RssDocument is the class that takes an RSS document, reads it, and turns it into programmatic representation." (setting aside for now the part about ironing out the differences between RSS variants). With RDF, that's built in to the library. No application author has to do that, and specifically does not have to do it for each kind of RDF out there.
In addition to that, RDF knows about the connections between the files. For example, when a trackback to Intertwingly from Bitworking occurs, and the comments and the channels are loaded into memory, the connections between the "programmatic representations" of each are also made by the RDF library.
Hmmm... the argument that RDF is easier than working with "plain" XML sounds to me like an argument about using an existing tool/API versus rolling your own. If I were to use a tool that automatically read and translated RSS, I don't see that I would gain anything by using a tool that automatically read and translated RDF.
I admit that RDF is more generic than RSS. As a result, I can likely create richer and more varied RDF documents than I can with RSS. As we are seeing, it is even possible to create an RDF "equivalent" of an RSS item. But if all we are trying to do is pass around RSS items, doesn't RDF become overkill? It's like always hauling around and using a 100-function swiss army knife when all you ever need is a can opener...
As for the connection bit, I don't quite follow...
Ken,
Setting aside the part about the differences between RSS variants is ignoring the No 1 reason why RssUtils is written the way it is (I'm its author, BTW).
RssUtils does what the usual mechanisms of RDF and RDF Schema don't do: it allows separation of the document from its schema, w/o requiring the document schema be forced into the striped form endorsed by RDF/XML encoding rules. (See what LoadExtractors() does and how this information is used to later parse the different variants of RSS; how would that be done with an RDF parser?)
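LoadExtractors() itself isn't shown here, but the idea (a per-variant table of element paths consulted at parse time, rather than forcing documents into RDF's striped form) might be sketched like this in Python. The table contents and field names are my own invention, not RssUtils' actual data:

```python
import xml.etree.ElementTree as ET

# Hypothetical per-variant extractor tables, in the spirit of LoadExtractors():
# each RSS variant maps the same logical fields to different element paths,
# so the parsing code stays the same and only the table varies.
EXTRACTORS = {
    "rss20": {"item": "channel/item",
              "title": "title"},
    "rss10": {"item": "{http://purl.org/rss/1.0/}item",
              "title": "{http://purl.org/rss/1.0/}title"},
}

def titles(doc_xml, variant):
    """Extract item titles using the path table for the given variant."""
    paths = EXTRACTORS[variant]
    root = ET.fromstring(doc_xml)
    return [item.findtext(paths["title"])
            for item in root.findall(paths["item"])]

rss20 = "<rss><channel><item><title>hello</title></item></channel></rss>"
print(titles(rss20, "rss20"))  # ['hello']
```

The document's structure is untouched; the "schema" lives entirely in the table, which is the separation the comment above is describing.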
RDF/XML is, IMHO, the largest obstacle for adoption of RDF.
I hate it. Well, just the example actually. The "blogging community" seems to me to have long been enamoured of what I can only describe as "the absolute worst practices", which is to say shoving escaped HTML markup in every which way and place it's possible to shove it. Case in point:
content:encoded had the following: Before reading <a href="http://www.intertwingly.net/blog/1290.soap">this</a>, please read <a href="http://enquirer.com/editions/2002/01/27/tem_sam_(i_am)_has.html"> this</a>.
With escaped HTML links. The argument most often brought up for why one should want to make escaped markup a communal practice (that it is somehow a "worse is better" scenario that allows one to add to an XML instance without knowing much about XML) seems illogical to me. What is so much easier about writing <a href="some url here">here's a great escaped link</a> than <h:a href="some url here">here's a namespaced link, html namespace defined in soap envelope</h:a>?
Sorry, I just don't understand it. Why is just about every XML instance I see coming out of this community an occasion for me to pull out text from specific nodes and run tidy against said text in order to get markup I can work with?
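The round-trip bryan is complaining about looks like this in Python (escape/unescape are real stdlib functions; the sample string echoes the content:encoded example above):

```python
from xml.sax.saxutils import escape, unescape

html = 'Before reading <a href="http://www.intertwingly.net/blog/1290.soap">this</a>, please read this.'

# Going out: escape the markup so it can be dropped into any element as text.
wire = escape(html)
assert "&lt;a href=" in wire  # the <a> is now character data, not markup

# Coming in: unescape... and then, as bryan notes, the consumer still has to
# tidy/clean the result before it is markup it can actually work with.
assert unescape(wire) == html
print(wire)
```

The escape/unescape pair itself is lossless for well-formed input; the pain bryan describes is everything that happens after unescaping, when the recovered text turns out not to be well-formed markup.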
Re: bryan hates it
The obvious reason to me is that not all bloggers are creating strictly xhtml content. If they were, this would be much less of an issue, I think. Instead, the attitude taken is "always escape it going out and unescape it coming in". That way, html and xhtml can be handled nearly identically. Sure, it's messy. But it also seems wildly successful.
I wonder just how prolific blogs and RSS feeds would now be if everyone was required to blog in strict xhtml in the first place. My guess is not nearly as much as it is now...
re: strict xhtml and html,
I don't much care if it were strict xhtml with namespaces etc. or just well-formed html. I have a suspicion that the reason for this practice has something to do with bad dtd comprehension on the part of major blogging companies, but as I don't know anything about their background technologies etc. this is as I said just a suspicion.
I don't think that there would be a major difference in how prolific RSS and blogs are if one were required to write well-formed HTML; the well-formed web is in many ways easier to manage and understand than the malformed.
Perhaps if someone could do a study getting non-technical users to write escaped HTML and well-formed HTML, comparing errors and user satisfaction across the two methodologies, then one could start to have a reasonable discussion as to which is the more successful.
At any rate, I am just not interested in unescaping HTML on the way back in; it seems a process too likely to be error prone, in that one has two sources of possible error: 1. the escaping process itself, and 2. the matching of HTML fed in against the HTML requirements of a site, requirements often painstakingly worked up to match the idiosyncratic display behaviors of various browsers.
The arguments I have seen that address number 2 depend on the social controls of a community ostracising members who persist in posting escaped HTML that, on display, would tamper with or even impact the security of a consuming site. I don't think social controls scale very well.
I also think one day someone's gonna hack a blog whose feed gets consumed far and wide by everyone, put a bunch of escaped HTML in it with some nasty scripts or worse, and on that day the whole just-escape-it/unescape-it methodology is gonna have to change to: if it's escaped, clean it.
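A deliberately minimal sketch of "if it's escaped, clean it" in Python. This only drops script elements and is nowhere near a real sanitizer (which would whitelist tags and attributes rather than blacklist); the feed entry is invented:

```python
from html.parser import HTMLParser
from xml.sax.saxutils import unescape

class ScriptStripper(HTMLParser):
    """Copy markup through, skipping <script> elements and their contents."""
    def __init__(self):
        super().__init__()
        self.out, self.skipping = [], 0
    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.skipping += 1
        elif not self.skipping:
            self.out.append(self.get_starttag_text())
    def handle_endtag(self, tag):
        if tag == "script":
            self.skipping = max(0, self.skipping - 1)
        elif not self.skipping:
            self.out.append(f"</{tag}>")
    def handle_data(self, data):
        if not self.skipping:
            self.out.append(data)

def clean(escaped_html):
    p = ScriptStripper()
    p.feed(unescape(escaped_html))
    p.close()
    return "".join(p.out)

feed_entry = "Nice post! &lt;script&gt;alert('gotcha')&lt;/script&gt;&lt;b&gt;really&lt;/b&gt;"
print(clean(feed_entry))  # Nice post! <b>really</b>
```

Even this toy version shows the cost bryan predicts: once you can't trust escaped content, every consumer pays for a parse-and-filter pass on the way in.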
I agree with bryan.
As far as a comment API goes: nobody has a right to comment on my blog, so if I were to require well-formed XHTML in order to comment, then that would be my prerogative.
The reason why I don't (as of yet) is merely a question of how many simultaneous battles I choose to fight. If such a standard were to exist, I would immediately support and endorse it. Until then, getting people to support RSS as a uniform "API" is a big enough challenge for me at the moment.
I will note, however, that it is still quite possible to harbor nasty scripts inside of well-formed XHTML.