Mark Nottingham: In a nutshell, XOP is an alternate
serialisation of XML that just happens to look like a MIME
multipart/related package, with an XML document as the root part.
That root part is very similar to the XML serialisation of the
document, except that base64-encoded data is replaced by a
reference to one of the MIME parts, which isn’t base64
encoded.
I guess I am supposed to take comfort in the assertions that the
serializations "look like" MIME and are "very similar to" XML.
Hopefully, I'll find more rigor in the working draft.
Looking at the examples in the working draft, it appears that
the crux of the issue is the use of a
Content-Transfer-Encoding: binary header.
Unfortunately this header isn't defined — or even mentioned!
— in any of the normative text of this working draft.
Or in any of the nine documents referenced by this
draft.
However, the ninth document references five more documents,
and the fifth of those references contains the explanation:
encoding := "Content-Transfer-Encoding" ":" mechanism
mechanism := "7bit" / "8bit" / "binary" /
"quoted-printable" / "base64" /
ietf-token / x-token
ietf-token := <An extension token defined by a
standards-track RFC and registered
with IANA.>
x-token := <The two characters "X-" or "x-" followed, with
no intervening white space, by any token>
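As a minimal sketch of what that grammar means for code that has to accept the header, the Python below classifies a Content-Transfer-Encoding value. The function names and the "unregistered" label are my own, and the token character class is my reading of the RFC 2045 token rule (printable US-ASCII minus SPACE, CTLs and tspecials), so treat it as an illustration rather than a reference implementation.

```python
import re

# The five mechanisms named directly in the grammar.
STANDARD = {"7bit", "8bit", "binary", "quoted-printable", "base64"}
# The three that mean "no encoding transformation was applied".
IDENTITY = {"7bit", "8bit", "binary"}
# Assumed reading of the RFC 2045 "token" rule: printable US-ASCII
# except SPACE, CTLs and the tspecials ()<>@,;:\"/[]?=
TOKEN = re.compile(r"[!#$%&'*+\-.0-9A-Z^_`a-z{|}~]+")


def classify_cte(value):
    """Return 'standard', 'x-token' or 'unregistered' (hypothetical labels)."""
    mechanism = value.strip().lower()  # mechanism names are case-insensitive
    if mechanism in STANDARD:
        return "standard"
    if mechanism.startswith("x-") and TOKEN.fullmatch(mechanism[2:]):
        return "x-token"
    # An ietf-token would need a lookup against the IANA registry.
    return "unregistered"


def is_identity(value):
    """True when the body can be used as-is, with no decoding step."""
    return value.strip().lower() in IDENTITY
```

With this, classify_cte("Binary") falls into the standard set, while a bare "uuencode" (no "x-" prefix) is neither standard nor an x-token.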
Checking
IANA,
it seems that no more have been officially registered, though
realistically, there may be some
experimental ones in
popular use that need to be supported. Figuring that others
may have been down this road before, I check the Python libraries
(as they happen to be handy and open source), and I find that they
support the following:
base64
quoted-printable
uuencode
x-uuencode
uue
x-uue
7bit
8bit
Can you spot the one that is missing? You guessed it,
binary.
Exhausting. After all of this, I am left with a few
questions.
Exactly what content transfer encodings does an application
have to support in order to be XOP compliant?
Is there a list of existing software libraries that are known
to contain adequate support for MIME such that they can be used
as a base upon which XOP support can be based?
In the case where such a library is not present, is there a
known subset of MIME that is the minimum necessary to support
XOP?
Sam, please dig a little further before passing judgment. Protocols aren't layered that way. Does HTTP tell you which TCP options to use? Does SOAP tell you how to use persistent HTTP connections?
XOP (not XOM) relies upon MIME multipart/related as a packaging mechanism. Although the primary use case for XOP in Web services is across HTTP transports (which, by necessity, would be encoded with a binary content-transfer-encoding), it's conceivable that a message will need to transit a hop (e.g., SMTP) that isn't binary-friendly, and therefore will need MIME content-transfer encoding of (for example) base64. Rather than force such gateways to transcode these messages, it's prudent to allow them to decide.
Python does indeed support binary CTE, in the email library. The relevant code (in email.Message.Message.get_payload()) is:
# Everything else, including encodings with 8bit or 7bit are returned
# unchanged.
return payload
Note that 'binary' is NOT an encoding as such, but an indication of the domain of the content; from RFC 2045:
The Content-Transfer-Encoding values "7bit", "8bit", and "binary" all
mean that the identity (i.e. NO) encoding transformation has been
performed. As such, they serve simply as indicators of the domain of
the body data, and provide useful information about the sort of
encoding that might be needed for transmission in a given transport
system. The terms "7bit data", "8bit data", and "binary data" are
all defined in Section 2.
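To make the identity behaviour concrete, here is a minimal sketch against the Python 3 email package (the module quoted above is the older email.Message spelling). The boundary string, Content-IDs and payload bytes are made up for illustration, and the package is only loosely modelled on the XOP examples; it is not a conformance test.

```python
import email

# Made-up binary payload; it deliberately includes bytes above 0x7f.
raw_bytes = b"\x00\x01\xfe\xff"

# A hypothetical multipart/related package with an XML root part
# and one part labelled Content-Transfer-Encoding: binary.
package = b"\r\n".join([
    b"MIME-Version: 1.0",
    b'Content-Type: multipart/related; boundary="b"; type="application/xop+xml"',
    b"",
    b"--b",
    b"Content-Type: application/xop+xml; charset=UTF-8",
    b"Content-Transfer-Encoding: 8bit",
    b"Content-ID: <root@example.org>",
    b"",
    b"<doc/>",
    b"--b",
    b"Content-Type: application/octet-stream",
    b"Content-Transfer-Encoding: binary",
    b"Content-ID: <blob@example.org>",
    b"",
    raw_bytes,
    b"--b--",
    b"",
])

msg = email.message_from_bytes(package)
root, blob = msg.get_payload()  # list of the two sub-parts

# decode=True only transforms base64, quoted-printable and the
# uuencode variants; for "binary" the bytes come back unchanged.
payload = blob.get_payload(decode=True)
print(blob["Content-Transfer-Encoding"], payload == raw_bytes)
```

The point of the sketch is that no special "binary decoder" is needed: the label only tells the receiver that the bytes were never transformed.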
It's important to understand that specifications that are too constraining, while improving short-term interoperability, limit long-term functionality too much, and therefore have little value as standards. That's the place of a profile, whether it's explicit or in the market. For more on this, see my blog a while back [link]
And BTW, pretty much nobody does experimental CTEs; we briefly considered making XOP a CTE (or the corresponding mechanism in HTTP, content codings), but the bar for new CTEs is VERY high, and the mechanism is very specific to MIME.
I think that gets to the root of my concern. Presumptions of how things are layered seem to be endemic and problematic. As an example, I don't see how XOP would fit into a SAX or pull-based parser model, as XOP radically affects the order in which data becomes available.
I'm starting to make progress prototyping a XOP client in Python:
Re SAX/pull pipeline, in a scenario that doesn't involve threading, I'd expect the XOP head-end to have the whole MIME body available, then create a SAX/pull pipeline with the XML parser parsing the MIME root, passing events to an XOP filter that replaces XOP elements with MIME parts, which then passes events on to "my" SAX/pull module. As far as "my" module is concerned, the order the data becomes available is already resolved.
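A rough sketch of that pipeline in Python, assuming the whole MIME body has already been read and the non-root parts are held in a dictionary keyed by Content-ID. The class names, the dictionary, and the sample document are hypothetical; the namespace URI is the one I believe the XOP Recommendation uses, so check it against whichever draft you target.

```python
import base64
import io
import xml.sax
from urllib.parse import unquote
from xml.sax.handler import ContentHandler, feature_namespaces
from xml.sax.saxutils import XMLFilterBase

# Assumed XOP namespace (from the W3C Recommendation).
XOP_NS = "http://www.w3.org/2004/08/xop/include"


class XOPFilter(XMLFilterBase):
    """Replace each xop:Include element with base64 character data."""

    def __init__(self, parent, parts):
        super().__init__(parent)
        self._parts = parts  # hypothetical mapping: Content-ID -> bytes

    def startElementNS(self, name, qname, attrs):
        if name == (XOP_NS, "Include"):
            href = attrs.getValue((None, "href"))
            cid = unquote(href[len("cid:"):])
            text = base64.b64encode(self._parts[cid]).decode("ascii")
            super().characters(text)  # downstream sees plain text
            return
        super().startElementNS(name, qname, attrs)

    def endElementNS(self, name, qname):
        if name != (XOP_NS, "Include"):
            super().endElementNS(name, qname)


class Collector(ContentHandler):
    """Stand-in for "my" SAX module: records the events it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElementNS(self, name, qname, attrs):
        self.events.append(("start", name[1]))

    def endElementNS(self, name, qname):
        self.events.append(("end", name[1]))

    def characters(self, content):
        self.events.append(("text", content))


parts = {"blob@example.org": b"\x00\x01\xfe\xff"}  # made-up MIME part
root_xml = (b'<doc xmlns:xop="' + XOP_NS.encode("ascii") + b'">'
            b'<photo><xop:Include href="cid:blob@example.org"/></photo></doc>')

parser = xml.sax.make_parser()
parser.setFeature(feature_namespaces, True)
collector = Collector()
xop = XOPFilter(parser, parts)
xop.setContentHandler(collector)
xop.parse(io.BytesIO(root_xml))
print(collector.events)
```

The downstream handler never sees an xop:Include element, only character data in document order. The cost is the one raised above: every referenced part must be available before the filter reaches its reference, so this sketch does not stream.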