It’s just data

Targeting Non-Elite Programmers

James Clark: I see the real pain-point for distributed computing at the moment as not the messaging framework but the handling of the payload ... This pain is experienced most sharply at the moment in the SOAP world, because the big commercial players have made a serious investment in trying to produce tools that work for the average developer. But I believe the REST world has basically the same problem: it’s not really feeling the pain at the moment because REST solutions are mostly created by relatively elite developers who are comfortable dealing with XML directly.

I’m convinced that few in the SOAP world (not exactly a category I would put James into) have an appreciation for what “suitable for average, relatively low-skill programmers” really means.  Conspicuously absent from James’ post is any mention of either urlencoding or XPath.

While James quickly dismisses JSON and YAML, he omits entirely another mechanism commonly used to encode request parameters: application/x-www-form-urlencoded.  This mechanism is quite suitable for expressing a small number of parameters.  James’ first TEDI example involves two strings and two integers.  If the request is safe and the strings are not unwieldy in length, requests involving urlencoding can be made via HTTP GET, and thereby potentially accrue benefits such as cacheability and bookmarkability.  Otherwise, the request can be made with HTTP POST, opening up the potential for file uploads.
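As a rough illustration of how little machinery this takes, here is a minimal Python sketch that round-trips two strings and two integers through urlencoding using only the standard library.  The parameter names (`author`, `title`, `page`, `count`) and the helper names are hypothetical, chosen for this example; they are not taken from James’ TEDI example.

```python
from urllib.parse import urlencode, parse_qs


def encode_request(author, title, page, count):
    """Encode two strings and two integers as a urlencoded query string.

    The parameter names are hypothetical, chosen only for illustration.
    """
    return urlencode({"author": author, "title": title,
                      "page": page, "count": count})


def decode_request(query):
    """Decode the query string back into typed values.

    urlencoding carries no type information, so the receiver has to
    convert the integer fields itself.
    """
    fields = parse_qs(query, strict_parsing=True)
    return {
        "author": fields["author"][0],
        "title": fields["title"][0],
        "page": int(fields["page"][0]),
        "count": int(fields["count"][0]),
    }


if __name__ == "__main__":
    query = encode_request("Sam Ruby", "Targeting Non-Elite Programmers", 2, 10)
    # The same string can be appended to a URI for a GET request
    # or sent as the body of a POST request.
    print(query)
    print(decode_request(query))
```

Note that everything arrives as a string: converting `page` and `count` back to integers is the receiver’s job, which is part of what keeps the format simple.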

James argues that XML has to be the mainstay, due to the extraordinary breadth of adoption; by this measure, urlencoding is certainly a contender.

Responses are a different beast entirely.  Responses tend to be considerably larger and considerably more deeply nested.  For these reasons, an approach more general than urlencoding is warranted; but this does not necessarily imply XML.  XML is but one format that fits in the general category of “data formats that have the potential of being expressed as a DOM”.  Such a category would also include HTML documents.

Again, in terms of adoption, HTML certainly ranks up there.

Let’s take a look at a real use case.  Visit James’ post using a modern browser, and you will see an icon that indicates that you can subscribe to his posts.  The request made to fetch his page consists solely of the name of the endpoint and zero parameters; and subsequent subscription polls likewise only require a URI and zero parameters.  Requests with zero parameters are a common use case in REST-based systems.

The page containing James’ post is not well-formed XML, but can be rendered into a DOM nevertheless.  Once that is accomplished, retrieving the subscription information is a simple matter of traversal, and this can be expressed as an XPath expression:

/xhtml:html/xhtml:head/xhtml:link[@rel='alternate' and (@type='application/atom+xml' or @type='application/rss+xml')]/@href

Programmers using E4X or SimpleXML have an equivalent or better mechanism to express such data mining operations, in ways that are accessible to even “average, relatively low-skill programmers”.  But no matter how this traversal is done, I know of no browser which relies on a formal XML schema in order to place a subscription icon in your window.
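To make the traversal concrete, here is a minimal Python sketch that finds subscription links in a page that is not well-formed XML, with no schema involved.  It assumes the common autodiscovery convention, in which a feed is advertised by a `link` element in the `head` whose `rel` includes `alternate` and whose `type` is `application/atom+xml` or `application/rss+xml`.  The class and function names are hypothetical, and this is not a description of how any particular browser does it.

```python
from html.parser import HTMLParser

# MIME types that conventionally identify a feed (an assumption of this sketch).
FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect the href of every feed <link> found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:
            attributes = dict(attrs)
            rels = (attributes.get("rel") or "").lower().split()
            mime = (attributes.get("type") or "").strip().lower()
            href = attributes.get("href")
            if "alternate" in rels and mime in FEED_TYPES and href:
                self.feeds.append(href)

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_feeds(html):
    """Return the feed URIs advertised by a (possibly sloppy) HTML page."""
    finder = FeedLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.feeds


if __name__ == "__main__":
    # Unquoted attributes, unclosed elements, and mixed case: not XML.
    page = """<html><head><title>A blog</title>
    <link rel="alternate" type="application/atom+xml" href="/feed.atom">
    <link rel=stylesheet href="/style.css">
    <LINK REL="alternate" TYPE="application/rss+xml" HREF="/feed.rss">
    </head><body><p>An unclosed paragraph<br></body>"""
    print(find_feeds(page))
```

The parser here is event-based rather than a full DOM, but the point is the same: the data is pulled out by looking for a handful of elements and attributes, not by validating the page against a grammar.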

Similar observations can be made for feeds.  RSS does not have a formal grammar.  While Atom has a non-normative grammar, I’m unaware of anyone using it to assist with parsing activities.

Now, let’s take a look at a recent, relevant, and very concrete example: Google Maps support for GeoRSS.  In order to deal with feeds produced by “average, relatively low-skill programmers”, Google must deal with feeds that are not well-formed, multiple incompatible formats (+ Atom, optionally including the declining but not quite gone Atom 0.3 variant), multiple ways to express GeoRSS data, and even applications which don’t follow these specs either.  (Note: both slashgeo and flickr now produce valid GeoRSS feeds!  Thanks guys!)

In fact, such feed-based applications are positively flourishing.

Would an alternate mechanism for expressing schemas help this situation?  My assessment: not so much.  Anybody care to do the work to express a schema for the set of all possible permutations of GeoRSS-augmented feeds?

I suggest that yet another grammar for expressing concepts related to validation would not be nearly as helpful as a collection of tools, tips, and techniques that deal with topics such as normalization.
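As one hedged example of what such a normalization technique might look like, the Python sketch below reduces three common ways of expressing a point (GeoRSS Simple, GeoRSS GML, and the W3C Basic Geo vocabulary) to a single `(latitude, longitude)` pair, regardless of whether the surrounding feed uses `entry` or `item` elements.  The function names are hypothetical, the tolerance for comma-separated coordinates is an assumption about sloppy producers rather than anything the specs allow, and the sketch assumes the feed is well-formed enough for an XML parser; feeds that are not would need a more forgiving parser in front of it.

```python
import xml.etree.ElementTree as ET

NS = {
    "georss": "http://www.georss.org/georss",
    "gml": "http://www.opengis.net/gml",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
}


def _pair(lat_text, lon_text):
    """Convert two strings to a (lat, lon) tuple, or None if not numeric."""
    try:
        return float(lat_text), float(lon_text)
    except (TypeError, ValueError):
        return None


def extract_point(entry):
    """Return (lat, lon) for one entry/item, or None if no point is found."""
    # GeoRSS Simple first, then GeoRSS GML; both hold "lat lon" text.
    node = entry.find("georss:point", NS)
    if node is None:
        node = entry.find("georss:where/gml:Point/gml:pos", NS)
    if node is not None and node.text:
        # Assumption: tolerate commas, which some producers emit.
        parts = node.text.replace(",", " ").split()
        if len(parts) == 2:
            return _pair(parts[0], parts[1])
    # W3C Basic Geo: separate geo:lat and geo:long elements.
    lat = entry.find(".//geo:lat", NS)
    lon = entry.find(".//geo:long", NS)
    if lat is not None and lon is not None:
        return _pair(lat.text, lon.text)
    return None


def extract_points(feed_xml):
    """Return the points found in a feed, ignoring the feed format."""
    root = ET.fromstring(feed_xml)
    points = []
    for element in root.iter():
        if not isinstance(element.tag, str):
            continue
        # Strip the namespace so RSS items and Atom entries look alike.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name in ("entry", "item"):
            point = extract_point(element)
            if point is not None:
                points.append(point)
    return points
```

Nothing here validates anything; it simply looks in the places where location data tends to turn up and converts whatever it finds to one shape.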

I question whether these so-called “non-elite programmers” care about the transport layer.  I count myself among them, if not by skill then by choice.  I strive every day to be a less and less elite programmer.  Do we not care more about a clear API on both sides of the transport than about what the transport is?  I mean, who really cares how tasty our XML is?  I care about some characteristics, but more from a network engineering standpoint (how well it will load balance, support stickiness, etc.).  Nah, I care that it is reasonably fast, has relatively low overhead, and has as simple an API as possible on both sides.  Implicit is that I care that there IS an API on both sides.  Increasingly I’m concerned with encoding, but increasingly I expect the API and transport layer to handle it rather than make me deal with more than saying that it needs to be done.

Posted by Andy


That’s just my perspective, the perspective of the host of the first few SOAPBuilders interoperability sessions (which ultimately morphed into WS-I), and one of the original authors (and current primary maintainer) of the Feed Validator.

Posted by Sam Ruby

Sam Ruby: Targeting Non-Elite Programmers

Sam Ruby’s skeptical response to James Clark’s blog post about the need for a new schema language. Very interesting stuff. Must learn more about XPath and the application/x-www-form-urlencoded MIME type....

My point is that the “non-elite” programmers care about the API, not the transfer format.  The performance of RMI:JRMP is negligibly better than SOAP, depending on implementation and usage.  Complexity of the solution was traditionally much higher...needlessly so.  It is like Ruby vs. Java.  Ruby APIs aren’t a horrible chain of locator to factory to locator to factory to abstract implementation to implementation, like Java’s.  Traditionally many of the APIs for WS implementations have been Java-like.  This is changing in Java EE at least...right about the time no one cares about Java EE anymore ;-).  Anyhow, my point is that the non-elite programmer doesn’t care about the transport at all...more the API.  Spec-wonk XML spouters can wax supreme about XML spec A vs. B vs. C...but it does no good if the API for each language (esp. Java, due to market location) requires way more work to use than other solutions.  The hard work of transport integration will by definition be done by the few and consumed via APIs by the masses...I hope.

Posted by Andy

Anyhow my point is that the non-elite programmer doesn’t care about the transport at all...more the API.

Until it comes time for debugging and what you thought you sent and what the other side thought it received are two different things, and you are left staring at a black box wondering what you could possibly have done wrong.

Posted by Sam Ruby

Non elite programmers debug the transport layer?

Posted by Andy

Non elite programmers debug the transport layer?

Depending on their skills, they very well may not be able to debug such problems.  At which point, they are stuck.  Broken and stuck.  Broken, stuck, and pointing fingers.

Again, this rarely happens Java-to-Java, but there are more efficient ways to transport such requests.  And Java-to-.Net mostly works.

Posted by Sam Ruby

Once they do debug the transport layer are they really still "non-elite"?

Posted by Andy

GeoRSS: Worse is Better

That the Geo-Web is forming out of GeoRSS and KML rather than GML is yet another indictment against XSD.

Excerpt from Entries for import cartography