Too Many Alternatives

The situation

[HenriSivonen] The AtomEntry proposal together with the DifferentlyAbledClients proposal creates ten alternative ways of POSTing a new HTML-like entry using the Atom API. In contrast, the metaWeblog API, with which the Atom API is supposed to compete, requires the server to support only one way. It should be obvious that one way is easier to support than ten.

The one way supported by the metaWeblog API is serializing the HTML tag soup body of the entry as a sequence of Unicode characters and then transferring that string over XML-RPC as XML character data. The metaWeblog.newPost method doesn't seem to include the data type of the payload, so even if you posted XHTML, it would go down the tag soup code path. (Media objects are explicitly typed in the metaWeblog API, but it seems to me that the data passed to the CMS as a media object is expected to be treated as an opaque typed sequence of bytes and not as something the CMS looks inside.)

There are two ways of putting text/html in an AtomEntry:

  1. serialize as a sequence of Unicode characters and transfer the characters as mode="escaped"

  2. serialize as a sequence of bytes and transfer the bytes as mode="base64"
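
To make the difference between the two options concrete, here is a minimal Python sketch. Only the mode values come from the text above; the element name `content`, the `type` attribute, the helper names, and the choice of UTF-8 for the byte serialization are assumptions made for illustration.

```python
import base64
from xml.sax.saxutils import escape


def content_escaped(html_text: str) -> str:
    """Option 1: treat the HTML as characters and escape them as XML character data."""
    # escape() replaces &, < and > so the markup survives as plain text.
    return '<content type="text/html" mode="escaped">' + escape(html_text) + "</content>"


def content_base64(html_text: str, encoding: str = "utf-8") -> str:
    """Option 2: treat the HTML as bytes and carry them base64-encoded."""
    # The encoding is an assumption; the receiver has to learn it out of band.
    data = html_text.encode(encoding)
    return ('<content type="text/html" mode="base64">'
            + base64.b64encode(data).decode("ascii")
            + "</content>")


if __name__ == "__main__":
    sample = "<p>Hi & bye</p>"
    print(content_escaped(sample))
    print(content_base64(sample))
```

Both calls carry the same tag soup, but a server has to implement a separate decoding step for each before it reaches the HTML parser.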

Atom allows the transfer of application/xhtml+xml in addition to transferring text/html, so now there are four ways of transferring HTML-like content: two ways of transferring text/html and two ways of transferring application/xhtml+xml. These cases are distinct, because text/html requires tag soup parsing but application/xhtml+xml requires XML parsing.

Unlike text/html, application/xhtml+xml can also be put inside an AtomEntry envelope as mode="xml". That makes a fifth way of doing it.

[SamRuby] IMHO, the above is based on a fundamental misunderstanding. The metaWeblog API is based on RSS, which doesn't specify one way to do things; it simply doesn't specify how things are to be done. Most weblogs have titles which are intended to be pure text, others embed markup directly in the title, and still others escape the markup. Net: with Atom, there aren't any options that didn't exist with RSS before, merely more information so that the recipient can make intelligent decisions.

But then the API comes in RESTful and SOAPy flavors, doubling the number of cases to ten. (Also, if posting a body fragment of an (X)HTML document and posting an entire (X)HTML document with <html>, <head> and <title> are counted as distinct cases, that makes twenty!)
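
The arithmetic above can be checked by enumerating the combinations. This is only a counting sketch; the variable names and the labels "fragment" and "full document" are made up for illustration.

```python
from itertools import product

# Modes available per media type, as described above:
# text/html cannot be carried as mode="xml", application/xhtml+xml can.
MODES = {
    "text/html": ["escaped", "base64"],
    "application/xhtml+xml": ["escaped", "base64", "xml"],
}
TRANSPORTS = ["REST", "SOAP"]
SCOPES = ["fragment", "full document"]

# Every (media type, mode) pair is one payload variant.
payload_variants = [(mtype, mode) for mtype, modes in MODES.items() for mode in modes]

# Each payload variant can travel over either transport.
with_transport = list(product(payload_variants, TRANSPORTS))

# Optionally distinguish body fragments from entire documents.
with_scope = list(product(with_transport, SCOPES))

print(len(payload_variants), len(with_transport), len(with_scope))  # 5 10 20
```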

[SamRuby] The essence of DifferentlyAbledClients is that there exists a small subset of clients which cannot do specific methods. This would imply that the most general mechanism with the widest applicability would be the SOAP interface. However, it really doesn't make sense to 'impose' this on all clients merely because a few clients cannot access the full range of methods. There is an additional benefit of optionally allowing SOAP clients in that it enables those that wish to use existing toolkits to do so, without requiring all clients to standardize on SOAP. Finally, those that have actually implemented both (on the server side only) seem to have found the task fairly easy.

The problem

When there are too many ways to do it, implementors are likely to subset the spec on their own. Whenever that happens, you can't build an interoperable implementation just by reading the spec; you also have to know which subset is really supported by others. That would be uncool. It is better to prune the spec before implementors subset it on their own.


I suggest pruning the number of alternatives as follows:

  1. No SOAP; only REST

  2. If the payload format is an application of XML, MUST use mode="xml". That is, for application/xhtml+xml mode="xml" would be the only alternative.

  3. If the payload format is not an application of XML but is defined in terms of characters and all those characters are XML characters (Unicode characters that can occur as XML character data), MUST use mode="escaped". That is, for text/html mode="escaped" would be the only alternative.

  4. Otherwise, fall back to mode="base64"
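
Rules 2 to 4 amount to a small decision function. The sketch below is one possible reading, not spec text: the function names are hypothetical, "an application of XML" is approximated by looking at the media type, and "defined in terms of characters" is modelled by the caller passing the payload as a string (or None for byte-oriented formats).

```python
from typing import Optional


def is_xml_char(ch: str) -> bool:
    """True if ch matches the Char production of XML 1.0."""
    cp = ord(ch)
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def is_xml_media_type(media_type: str) -> bool:
    """Assumption: XML applications are text/xml, application/xml or */*+xml."""
    return media_type in ("text/xml", "application/xml") or media_type.endswith("+xml")


def choose_mode(media_type: str, text: Optional[str] = None) -> str:
    """Pick the single allowed mode for a payload.

    text is the payload as characters when the format is defined in terms
    of characters, and None when it is defined in terms of bytes.
    """
    if is_xml_media_type(media_type):
        return "xml"          # rule 2
    if text is not None and all(is_xml_char(ch) for ch in text):
        return "escaped"      # rule 3
    return "base64"           # rule 4


if __name__ == "__main__":
    print(choose_mode("application/xhtml+xml", "<p/>"))
    print(choose_mode("text/html", "<p>tag soup"))
    print(choose_mode("image/png"))
```

With this pruning, each media type maps to exactly one mode, so a server needs a single code path per payload format.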

(Personally, I'd prefer POSTing an XHTML document over HTTP without an AtomEntry envelope in between and with possible metadata in <head>. However, that idea didn't get a warm reception on the mailing list.)

Alternate Suggestion

[SamRuby] Define, develop, and deploy a compliance test suite that allows vendors to verify that they properly accept the various possible combinations of options.