Abstractions leak.
Technorati's
API returns well-formed XML unless any excerpt contains an
ampersand, in which case it doesn't. If this was buried under
layers of XML-RPC or SOAP or content:encoding, this would be hard
to find and even harder to deal with.
As it is, I can just cut and paste the URL into either Mozilla
or IE, and I am told immediately what is wrong with the
response. Given this data, I can switch from
Mark
Pilgrim's pyTechnorati to
Phillip Pearson's
and be back up and running in minutes...
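A minimal Python sketch of the failure described above, assuming a made-up excerpt and made-up element names (this is not Technorati's actual response format): a raw ampersand pasted into the markup makes the document ill-formed, and escaping it makes the same text parse.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical excerpt text; the element names are invented for illustration.
excerpt = "Q&A with the author"

# Built by string pasting, the way a server without an XML library might do it.
broken = f"<item><excerpt>{excerpt}</excerpt></item>"
# Same document with the text escaped before it is pasted in.
fixed = f"<item><excerpt>{escape(excerpt)}</excerpt></item>"


def check(document):
    """Return None if the document parses, else the parser's error message."""
    try:
        ET.fromstring(document)
    except ET.ParseError as err:
        return str(err)
    return None


print("broken:", check(broken))
print("fixed: ", check(fixed))
```

The parser's complaint plays the same role as the browser's error page: it points at the offending spot in the response.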
Don't let the
experts fool you. You won't find the real reason to use
HTTP GET
here.
The real reason why you want to use HTTP GET is found
here.
This being said, when you blow past any reasonable length of a
URL or want to do more than simple information retrieval, it makes
sense to introduce back in XML and HTTP POST. But my advice
is to do so in
this way
instead of
that way. That way you are better prepared the next time
abstractions leak.
"If this was buried under layers of XML RPC or SOAP or content:encoding, this would be hard to find and even harder to deal with"
FYI -- it wouldn't have happened with any of my implementations of XML-RPC or SOAP, because they encode ampersands for you.
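For comparison, here is a sketch using Python's standard xmlrpc.client marshaller (not any of the implementations mentioned above); the method name demo.echo is made up. The marshaller escapes the ampersand on the way out and the decoder restores it.

```python
import xmlrpc.client

# Hypothetical parameter containing characters that are special in XML.
params = ("Tom & Jerry <draft>",)

# Marshal the call to its wire format; "demo.echo" is an invented method name.
wire = xmlrpc.client.dumps(params, methodname="demo.echo")

# Decode the wire format back into native values.
decoded, method = xmlrpc.client.loads(wire)

print(wire)
print(decoded, method)
```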
BTW, there's generally a hyphen in the name of XML-RPC.
I can show plenty of examples where abstractions have leaked with XML-RPC (Apple, Apache, Array, ...). I could do the same with SOAP. Heck, I can do the same with RSS 2.0.
In my experience, when you need structure, the closer you are to pure, clean, and simple XML, the better off you are.
Wow, I hate to say it, but since the Technorati server is obviously not using a real XML library, we probably would have been better served by a simple non-XML-based plaintext data format. It still would have required a custom data unmarshaller, but we wouldn't have had to deal with XML's crazy formatting rules.
Re. plaintext: maybe, maybe not. Even people who use RFC-822 header format (much less MIME header encoding) get hit by stray newlines if they don't encode them properly.
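A small Python sketch of the stray-newline problem, assuming a hand-rolled serializer (naive_headers and safer_headers are hypothetical helpers, not part of any library): a newline inside a value turns into an extra header once the text is parsed back.

```python
from email import message_from_string


def naive_headers(fields):
    """Serialize fields as 'Name: value' lines with no encoding at all."""
    return "".join(f"{name}: {value}\n" for name, value in fields.items()) + "\n"


def safer_headers(fields):
    """Same output, but refuse values containing CR or LF."""
    for name, value in fields.items():
        if "\r" in value or "\n" in value:
            raise ValueError(f"newline in header {name!r}")
    return naive_headers(fields)


# Hypothetical field whose value contains a stray newline.
fields = {"Subject": "hello\nX-Injected: yes"}

# The parser sees two headers where the writer meant one.
parsed = message_from_string(naive_headers(fields))
print(parsed["Subject"], "|", parsed["X-Injected"])
```

Rejecting the value is only one option; folding or encoding it would also work, but something has to be done at the point where text becomes wire format.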
Sam, your slide show is interesting, and yes of course I learned HTML by view source and please I don't want to talk about HTML today.
But the view source of XML-RPC is in your favorite scripting language. That's where you're supposed to look to make sense of it. So many people make the mistake of looking at what goes over the wire and miss the point completely. I know you didn't, but when they see it in their favorite language their eyes start shining, bright.
BTW, the same thing applies to OPML. People who look at the XML miss the point. Look at it in Omni or Radio or some other OPML-compatible outliner and it immediately makes sense. But most programmers aren't outliner people, so it's over their heads.
BTW, these are called opinions. Hopefully that's not a problem. I just noticed that the people here are the people who tend to have strong emotions about my opinions. Don Box admonished me for not posting here. So let's see if I can express an opinion or two without getting flamed. Thanks.
Dave, is XML-RPC just for scripting languages? As you know, there is an Apache Java implementation. If you simply view source of Java, you will see HashTables and will come to the conclusion that one can pass the full range of the Unicode character set, structs can have keys of any data type, etc., etc., etc. And it will just work... Java to Java. But it will break down when you try to interoperate with other implementations.
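The struct-key case can be sketched in Python rather than Java. A dict with integer keys is perfectly ordinary in program source, but Python's standard xmlrpc.client refuses to marshal it, and converting the keys to strings changes what arrives on the other side. This shows one implementation's behaviour only; it says nothing about what the Apache Java library does.

```python
import xmlrpc.client

# Legal in the program source: a dict keyed by integers.
native = {1: "one", 2: "two"}

# Marshalling fails because XML-RPC struct member names are strings.
try:
    xmlrpc.client.dumps((native,))
    error = None
except TypeError as err:
    error = str(err)

# Converting the keys makes it marshal, but the data is no longer the same.
portable = {str(key): value for key, value in native.items()}
wire = xmlrpc.client.dumps((portable,))
(roundtrip,), _ = xmlrpc.client.loads(wire)

print("error:", error)
print("roundtrip:", roundtrip)
```

Either way the wire format shows through: the program has to know what a struct may contain before it can rely on the call.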
Having debugged my share of such interop issues (hash tables are a problem with SOAP too), I have come to the conclusion that what matters is the wire format. And RSS 2.0 is a prime example to me of how successful this can be, even though it too is rife with encoding issues (example: I have even seen Scripting News's RSS feed have encoding issues from time to time). But life goes on, and the ultimate authority is the specs for XML and RSS and what goes across the wire.
That's my opinion of what works and what is most successful.
And Dave, your opinions on technical matters are welcome here.
I've updated the REST wiki RestAndStructuredData page with the flurry of recent tools that let one access XML using native syntax immediately following a GET (or preceding a POST, for that matter).
"This being said, when you blow past any reasonable length of a URL or want to do more than simple information retrieval, it makes sense to introduce back in XML and HTTP POST."
That may be true, but for any application that required such a large block of data in a POST that would otherwise be performed using a GET, it would be a nice feature to return a 201 (Created) with the URL of a new resource in the Location header, a la tinyurl.com, instead of perhaps a 200 (OK). This new resource would be more easily shareable, etc.
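A rough sketch of that 201 idea using Python's standard http.server. The /q/ path, the in-memory SAVED store, and the hash-derived short id are all assumptions made for illustration, not part of any real API.

```python
import hashlib
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory store: short id -> posted query body.
SAVED = {}


class QueryHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the (possibly large) query document from the request body.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        # Derive a short id from the body and remember the query under it.
        key = hashlib.sha1(body).hexdigest()[:8]
        SAVED[key] = body
        # Answer 201 Created and point at the new, shareable resource.
        self.send_response(201)
        self.send_header("Location", f"/q/{key}")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        # A plain GET on the short URL returns the stored query.
        body = SAVED.get(self.path.rsplit("/", 1)[-1])
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "application/xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the sketch quiet


if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8000), QueryHandler).serve_forever()
```

A real service would return the query's results rather than echo the query, and would need to think about what ends up in a shareable URL.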
Posted by anonymous at
Sorry Sam, I tend to say scripting when I mean programming. Of course Java is an interesting way of writing XML-RPC apps. By view-source I don't mean view-bytecode, I mean source.
Ken: sneaky way to introduce RDF into this conversation. I'm not biting. ;-)
Jeffrey/anonymous: if the "resource" includes a private key (such as the technorati API does), I'm not sure that's appropriate.
Dave: I too was talking about program source. Strings in Java are Unicode. Such strings arrive intact in an XML-RPC call when both sides are Java, but such applications may not interoperate with other XML-RPC implementations. I'm not saying that's wrong or needs to be fixed, merely pointing out that that is an aspect of the XML-RPC protocol that "leaks" or "shows through" to the application. I wrote an entire essay on this subject dealing with SOAP.
Sam, believe it or not, I was referring to just the "Native XML APIs" updates, which goes to your comment (11:02) of what matters is the wire format. Those APIs give access directly to the wire format in a way that only SOAP or XML-RPC encoding have done in the past, thus removing a layer of abstraction, or a very large part of it.
If only I had a pound for every time I had suffered this problem with XML - almost every system I have used that works with XML ends up suffering it at some point.
Does this just indicate that most languages don't have good enough tools for dealing with XML yet? That would explain why people insist on crafting it by hand and we continue to see broken XML.
Do the Python XML wrappers deal with this for you?
Hmm. I create the feed for my blog (http://www.cincomsmalltalk.com/rssBlog/rssBlogView.xml) by using a SAX Driver. VisualWorks Smalltalk has had good XML support at this level for years. Some people just like to work harder, not smarter :)
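On the Python side, a SAX-style writer from the standard library does the escaping for you. A minimal sketch follows; write_item is a hypothetical helper and the element names are only loosely modelled on an RSS item.

```python
import io
from xml.sax.saxutils import XMLGenerator


def write_item(title, description):
    """Emit one <item> through SAX events instead of pasting strings together."""
    out = io.StringIO()
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()
    gen.startElement("item", {})
    for name, text in (("title", title), ("description", description)):
        gen.startElement(name, {})
        gen.characters(text)  # special characters are escaped here
        gen.endElement(name)
    gen.endElement("item")
    gen.endDocument()
    return out.getvalue()


doc = write_item("Q&A", "x < y & z")
print(doc)
```

The caller never sees an entity reference; the text goes in raw and comes out well-formed.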
REST is hard to understand and implement. If it was easy, people like Tim Bray wouldn't write multiple essays about it and smart people like Dave Winer and Sam Ruby wouldn't be endlessly debating it. I personally think REST is a better philosophy...
Steven is triggering me again... can't say I relate to the conclusions he's making: there _are_ libraries out there to handle both HTTP and XML binding, but I do agree that ReST seems to remain a grass-root movement, which quite naturally fosters a...