Sam Ruby

Pulling on a string

2004-05-06T12:56:39-04:00

Mark Nottingham: In a nutshell, XOP is an alternate serialisation of XML that just happens to look like a MIME multipart/related package, with an XML document as the root part. That root part is very similar to the XML serialisation of the document, except that base64-encoded data is replaced by a reference to one of the MIME parts, which isn’t base64 encoded.

I guess I am supposed to take comfort in the assertions that the serializations look like and are very similar to XML and MIME respectively. Hopefully, I'll find more rigor in the working draft.

Looking at the examples in the working draft, it appears that the crux of the issue is the use of a Content-Transfer-Encoding: binary header. Unfortunately this header isn't defined — or even mentioned! — in any of the normative text of this working draft.

Or in any of the nine documents referenced by this draft.

However, the ninth document itself references five more documents, and in the fifth of those is the explanation:

 encoding := "Content-Transfer-Encoding" ":" mechanism
 
 mechanism := "7bit" / "8bit" / "binary" /
              "quoted-printable" / "base64" /
              ietf-token / x-token
 
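 ietf-token := <An extension token defined by a
                standards-track RFC and registered
                with IANA.>
 
 x-token := <The two characters "X-" or "x-" followed, with
             no intervening white space, by any token>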
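Read as code, that grammar amounts to a small classifier. What follows is only a sketch in present-day Python 3: the function name classify_mechanism and its category labels are made up for illustration, and the token pattern is my reading of RFC 2045's "token" rule (printable ASCII minus space and the tspecials).

```python
import re

# RFC 2045 "token": printable ASCII except SPACE and tspecials ()<>@,;:\"/[]?=
_TOKEN = re.compile(r"[!#$%&'*+\-.0-9A-Z^_`a-z{|}~]+\Z")

# The five mechanisms the grammar names outright (values are case-insensitive).
NAMED = {"7bit", "8bit", "binary", "quoted-printable", "base64"}


def classify_mechanism(value):
    """Classify a Content-Transfer-Encoding value against the grammar.

    Returns one of the hypothetical labels 'named', 'x-token',
    'ietf-token?' or 'invalid'.
    """
    v = value.strip()
    if not _TOKEN.match(v):
        return "invalid"
    low = v.lower()
    if low in NAMED:
        return "named"
    if low.startswith("x-") and len(low) > 2:
        return "x-token"
    # Syntactically a token, but only legal if registered with IANA.
    return "ietf-token?"
```

By this reading, "binary" is as much a first-class mechanism as "base64", while a bare "uuencode" is only valid if it has been registered.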

Checking IANA, it seems that no more have been officially registered, though realistically, there may be some experimental ones in popular use that need to be supported. Figuring that others may have been down this road before, I check the Python libraries (as they happen to be handy and open source), and I find that they support the following:

base64
quoted-printable
uuencode
x-uuencode
uue
x-uue
7bit
8bit

Can you spot the one that is missing? You guessed it, binary.

Exhausting. After all of this, I am left with a few questions.

Exactly what content transfer encodings does an application have to support in order to be XOP compliant?
Is there a list of existing software libraries that are known to contain adequate support for MIME, such that they can serve as a base for XOP support?
In the case where such a library is not present, is there a known subset of MIME that is the minimum necessary to support XOP?

The more time I spend with specs, the more appreciation I get for Test-Driven Development.

2004-05-06T14:47:20-04:00

Sam, please dig a little further before passing judgment. Protocols aren't layered that way. Does HTTP tell you which TCP options to use? Does SOAP tell you how to use persistent HTTP connections?

XOP (not XOM) relies upon MIME multipart/related as a packaging mechanism. Although the primary use case for XOP in Web services is across HTTP transports (which, by necessity, would be encoded with a binary content-transfer-encoding), it's conceivable that a message will need to transit a hop (e.g., SMTP) that isn't binary-friendly, and therefore will need MIME content-transfer encoding of (for example) base64. Rather than force such gateways to transcode these messages, it's prudent to allow them to decide.
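To make the gateway case concrete, here is a rough sketch, in present-day Python 3, of what such a hop might do with a part that arrives marked binary. The helper names make_binary_part and transcode_for_7bit_hop are hypothetical, and nothing here comes from the XOP draft itself; it only uses the standard email package.

```python
from email import encoders
from email.message import Message


def make_binary_part(data, content_type="application/octet-stream"):
    """Build a MIME part whose body is carried untransformed ('binary')."""
    part = Message()
    part["Content-Type"] = content_type
    part["Content-Transfer-Encoding"] = "binary"
    part.set_payload(data)  # the bytes are stored without any transformation
    return part


def transcode_for_7bit_hop(part):
    """Re-encode an 8bit or binary part as base64, in place.

    Parts that are already 7bit, quoted-printable or base64 are left alone.
    """
    cte = str(part.get("Content-Transfer-Encoding", "7bit")).strip().lower()
    if cte in ("8bit", "binary"):
        # encode_base64() appends a new Content-Transfer-Encoding header,
        # so remove the old one first. With no such header present,
        # get_payload(decode=True) hands back the bytes unchanged.
        del part["Content-Transfer-Encoding"]
        encoders.encode_base64(part)
    return part
```

The content is the same before and after; only the label and the wire form change, which is why the choice can be left to whichever hop needs it.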

Python does indeed support binary CTE, in the email library. The relevant code (in email.Message.Message.get_payload()) is:

# Everything else, including encodings with 8bit or 7bit are returned
# unchanged.
return payload

Note that 'binary' is NOT an encoding as such, but an indication of the domain of the content; from RFC 2045:

The Content-Transfer-Encoding values "7bit", "8bit", and "binary" all
mean that the identity (i.e. NO) encoding transformation has been
performed. As such, they serve simply as indicators of the domain of
the body data, and provide useful information about the sort of
encoding that might be needed for transmission in a given transport
system. The terms "7bit data", "8bit data", and "binary data" are
all defined in Section 2.
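A quick way to see the identity behaviour is to parse the same bytes under different labels. This is a sketch against the current Python 3 email package rather than the 2004 library quoted above; the helper name payload_for and the sample bytes are invented for the demonstration.

```python
import base64
from email import message_from_bytes

RAW = b"\x00\x01\xfe\xff"  # arbitrary sample bytes, including non-ASCII ones


def payload_for(cte, body):
    """Parse a one-part message labelled with `cte` and return its decoded body."""
    wire = (
        b"Content-Type: application/octet-stream\r\n"
        b"Content-Transfer-Encoding: " + cte.encode("ascii") + b"\r\n"
        b"\r\n" + body
    )
    return message_from_bytes(wire).get_payload(decode=True)


if __name__ == "__main__":
    # Identity labels: the body comes back exactly as it went in.
    for label in ("7bit", "8bit", "binary"):
        print(label, payload_for(label, RAW) == RAW)
    # A real transformation: the body on the wire differs from the content.
    print("base64", payload_for("base64", base64.b64encode(RAW)) == RAW)
```

For the three identity labels the library has nothing to undo, so the lack of a dedicated 'binary' branch in the decoder is expected.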

It's important to understand that specifications that are too constraining, while improving short-term interoperability, limit long-term functionality too much, and therefore have little value as standards. Constraining those choices is the place of a profile, whether it's explicit or in the market. For more on this, see my blog a while back [link].

2004-05-06T15:00:35-04:00

And BTW, pretty much nobody does experimental CTEs; we briefly considered making XOP a CTE (or the corresponding mechanism in HTTP, content codings), but the bar for new CTEs is VERY high, and the mechanism is very specific to MIME.

2004-05-06T15:59:37-04:00

"Protocols aren't layered that way."

I think that gets to the root of my concern. Presumptions about how things are layered seem to be endemic and problematic. As an example, I don't see how XOP would fit into a SAX- or pull-based parser model, as XOP radically affects the order in which data becomes available.

I'm starting to make progress prototyping a XOP client in Python:

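from email.MIMEImage import MIMEImage
from email.MIMEMultipart import MIMEMultipart
import urllib

# Encoder callback: leave the payload untouched and just label it as binary
def encode_binary(msg):
  msg['Content-Transfer-Encoding'] = 'binary'

# Fetch an image to use as the binary part
resource='http://moinmoin.wikiwikiweb.de/wiki/classic/img/moin-info.png'
image=urllib.urlopen(resource).read()

# Wrap the image in a MIME multipart message and print the serialization
msg=MIMEMultipart()
msg.attach(MIMEImage(image,'png',encode_binary))
print msg.as_string()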

Hopefully I will shortly have example code showing what a multi-part Atom POST would look like if the protocol were XOP-based.

P.S. The XOM => XOP typos have been fixed. Thanks!

2004-05-06T17:20:16-04:00

Re SAX/pull pipeline, in a scenario that doesn't involve threading, I'd expect the XOP head-end to have the whole MIME body available, then create a SAX/pull pipeline with the XML parser parsing the MIME root, passing events to an XOP filter that replaces XOP elements with MIME parts, which then passes events on to "my" SAX/pull module. As far as "my" module is concerned, the order the data becomes available is already resolved.
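A sketch of that pipeline in present-day Python 3, under a few loudly stated assumptions: the namespace URI is the one from the later XOP Recommendation (the working draft under discussion may well use a different one), parts is a plain dict mapping Content-ID to bytes that stands in for the already-parsed MIME body, and the names XOPFilter, Collector and parse_xop are invented here.

```python
import base64
import io
import xml.sax
from urllib.parse import unquote
from xml.sax.handler import ContentHandler, feature_namespaces
from xml.sax.saxutils import XMLFilterBase

# Assumption: element name as defined in the later XOP Recommendation.
XOP_INCLUDE = ("http://www.w3.org/2004/08/xop/include", "Include")


class XOPFilter(XMLFilterBase):
    """Replace each xop:Include element with the base64 text of its MIME part."""

    def __init__(self, parent, parts):
        super().__init__(parent)
        self._parts = parts  # Content-ID (no angle brackets) -> bytes

    def startElementNS(self, name, qname, attrs):
        if name == XOP_INCLUDE:
            href = attrs[(None, "href")]
            if not href.startswith("cid:"):
                raise ValueError("unsupported href: %r" % href)
            data = self._parts[unquote(href[len("cid:"):])]
            # Downstream sees ordinary character data, in document order.
            self.characters(base64.b64encode(data).decode("ascii"))
            return
        super().startElementNS(name, qname, attrs)

    def endElementNS(self, name, qname):
        if name != XOP_INCLUDE:  # swallow the end of the replaced element
            super().endElementNS(name, qname)


class Collector(ContentHandler):
    """Stand-in for "my" module: records the events it is handed."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElementNS(self, name, qname, attrs):
        self.events.append(("start", name[1]))

    def endElementNS(self, name, qname):
        self.events.append(("end", name[1]))

    def characters(self, content):
        if content.strip():
            self.events.append(("text", content))


def parse_xop(root_xml, parts, handler):
    """Parse the MIME root part, feeding XOP-resolved events to `handler`."""
    xop = XOPFilter(xml.sax.make_parser(), parts)
    xop.setFeature(feature_namespaces, True)
    xop.setContentHandler(handler)
    xop.parse(io.BytesIO(root_xml))
```

Because every part is already in the dict before parsing starts, the downstream handler never has to wait on data that appears later in the package, which is the point being made about ordering.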

2004-05-07T17:08:49-04:00

Implementation in Python here: [link]