It’s just data

Atom + MIME?

I've mocked up what I think an Atom POST request for Tim Bray's How Fast is This Thing Growing? blog entry would look like using the MIME Multipart/Related Content-type.  I chose this particular entry as it has a title, a summary, full content, and two pictures - one displayed inline, and one by reference.

I've posted two versions: a 7 bit safe version using Quoted-Printable and Base64 transfer encodings, and a 24% smaller 8 bit version using 8-bit and binary encodings.  Clients would be permitted to transmit any combination of these.  Note: what you are seeing are the actual bits that would be transmitted - HTTP + MIME + XML + Atom + XHTML + PNG, complete with authentication.
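To make the packaging concrete, here is a minimal sketch of the MIME body only (no HTTP request line, headers, or authentication), built with Python's standard email package. The entry XML, the image bytes, and the chart@example.org Content-ID are placeholders invented for illustration, and both parts use Base64, which is the package default, rather than the Quoted-Printable/Base64 mix of the 7 bit version.

```python
from email import message_from_bytes
from email.mime.application import MIMEApplication
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart

# Placeholder payloads, invented for illustration (not the actual entry).
ENTRY_XML = b'<entry><content><img src="cid:chart@example.org"/></content></entry>'
PNG_BYTES = b"\x89PNG\r\n\x1a\n" + bytes(range(16))


def build_package(entry_xml: bytes, png: bytes, cid: str) -> MIMEMultipart:
    """Wrap an Atom entry and one image in a multipart/related body."""
    package = MIMEMultipart("related", type="application/atom+xml")
    # The first part carries the entry itself (Base64 encoded by default).
    package.attach(MIMEApplication(entry_xml, "atom+xml"))
    # The image is a sibling part, addressable through its Content-ID.
    image = MIMEImage(png, "png")
    image.add_header("Content-ID", f"<{cid}>")
    package.attach(image)
    return package


# Serialize to the bytes that would follow the HTTP headers, then parse back.
wire = build_package(ENTRY_XML, PNG_BYTES, "chart@example.org").as_bytes()
parsed = message_from_bytes(wire)
entry_part, image_part = parsed.get_payload()
```

Parsing the serialized bytes back yields the same two parts, which is roughly what a receiving server would have to do before storing the entry and its image.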

Unless somebody identifies a showstopper, I'll submit a proposal.


1) how are Content-IDs transformed or mapped into server URLs, and is the server expected to rewrite @href/@src if a transform is used?

2) how are the auxiliary resources managed after creation?  is the "entry container", the entry and its related content parts, always managed as a unit from the perspective of the Atom API?

Tim's recent atom-syntax post, Two problems with the protocol, touches on (1).  PaceDontSyndicate and PaceNonEntryResources address (2) by making auxiliary resources separately addressable, the former by creating an "unsyndicated entry" and the latter by treating non-entry resources as "ordinary" web-accessible resources.  PaceResource and PaceObjectModule may also have trouble with (2), but being siblings of an entry they may be discoverable in some other manner.

Posted by Ken MacLeod at

This may be overly nitpicky but: Why HTTP/1.0 for the POST instead of HTTP/1.1?  Also, you need a Content-Length: header

Posted by ed costello at

There probably should be some kind of URI scheme on the Content-ID, like cid or mid, like Usenet messages do.

You might also provide an option to specify a URL with Content-Location, as Microsoft does with their single file html feature. Try the "Save As...Web Archive" feature in IE if you want to see what I'm talking about.
They don't need to modify the html source because it maps the URL to the data associated with the Content-Location header within the file. In this way, an Atom + MIME file could be opened in IE without any modification and be viewed accurately.

Content-Length is optional, but you might state in your specification that, if it is provided, it's up to the content generator to ensure its accuracy. It's very useful to have Content-Length, but only if it is accurate.

Posted by Andy at

Might there be an option (attribute, maybe) for some sort of hash, say MD5 or CRC?  Might be "nice to have" for longer posts over less robust transmission media.

Posted by Dave Walker at

Ken: my real goal here is to inject some metrics into the decision making process.

Ed: please be nit picky.  I chose 1.0 because while 1.1 should be supported, I don't believe that 1.1 should be required.  I don't believe a Content-Length header is required, but I could be wrong.

Andy: good ideas.

Dave: odd, I've never seen an existing header for such a thing.  It would seem to be a common requirement.
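For reference, RFC 1864 does define a Content-MD5 header for MIME entities, and RFC 2616 (section 14.15) carries it over to HTTP/1.1. Its value is the Base64 encoding of the 128-bit MD5 digest of the body. A minimal sketch, with content_md5 being a helper name made up for this example:

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return a Content-MD5 header value: Base64 of the raw MD5 digest."""
    digest = hashlib.md5(body).digest()
    return base64.b64encode(digest).decode("ascii")
```

A receiver could recompute the value over the bytes it actually received and reject the post on a mismatch. This only guards against accidental corruption, not tampering.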

Posted by Sam Ruby at

what happens if the upload is prematurely terminated?

If it happens part way through one of the parts it would be easy to see that the delivered payload is corrupted, and the server could then toss the lot. But what happens if the drop off occurs at one of the boundaries, and the server thinks it successfully received two parts (not three)? Meanwhile, the sending agent knows the connection dropped and tries again ... if the server is set to assign its own blog IDs then you get the double posting problem, no?

Then again, maybe I need to closely read RFC 2387 ... is there something in the format of the payload envelope which means it is easy to spot incomplete deliveries?

Posted by eric scheid at

Some random thoughts:  This allows multipart/related type content to be transferred in a way that is both efficient (the binary version) and based on existing field tested standards, which I think is worth a lot.  A minor issue might be the need for a MIME parser/encoder in addition to an XML library.  It's hard to imagine a web server without this capability, though; is this a serious issue for clients? 

Compared to Tim's PaceNonEntryResources proposal, this does not obviously allow for additional per-resource Atom-based XML metadata.  Not sure if this is a bug or a feature.

I'd propose the use of the existing MIME standard cid: URL scheme for cross-referencing URLs.  Why not?
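As a rough illustration of how that cross-referencing could work: RFC 2392 defines a cid: URL as the Content-ID value without its angle brackets, URL-escaped where needed. The sketch below uses Python's standard email package; the sample message, the photo1@example.org identifier, and both helper names are invented for this example.

```python
import email
from urllib.parse import unquote

# Invented sample: an Atom part plus one Base64-encoded image part.
RAW = (
    b'Content-Type: multipart/related; boundary="b1"\r\n'
    b"\r\n"
    b"--b1\r\n"
    b"Content-Type: application/atom+xml\r\n"
    b"\r\n"
    b'<entry><img src="cid:photo1@example.org"/></entry>\r\n'
    b"--b1\r\n"
    b"Content-Type: image/png\r\n"
    b"Content-ID: <photo1@example.org>\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"iVBORw0KGgo=\r\n"
    b"--b1--\r\n"
)


def index_parts_by_cid(raw: bytes) -> dict:
    """Map each Content-ID (angle brackets stripped) to its MIME part."""
    parts = {}
    for part in email.message_from_bytes(raw).walk():
        cid = part.get("Content-ID")
        if cid:
            parts[cid.strip().strip("<>")] = part
    return parts


def resolve_cid(url: str, parts: dict):
    """Return the part a cid: URL points at, or None if there is none."""
    if not url.lower().startswith("cid:"):
        return None
    return parts.get(unquote(url[4:]))
```

A server that rewrites @src or @href could use such a lookup to find the part behind each reference, store it, and substitute the final URL.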

One could perhaps use the Content-Disposition: MIME header to specify the desired semantics of an associated resource: whether it is a totally independent entity after uploading, and should be managed separately (e.g., a wedding photograph), or is more tightly bound to the content and should be treated as a mere subpart (e.g., a custom smiley graphic).

The examples don't specify whether nested MIME parts are allowed (can Content-Type: be multipart/mixed?), and that seems to be where things get confusing.  Should nested MIME be disallowed for simplicity?

Posted by John Panzer at

Sam,

I posted this to the atom list, and then I saw your proposal.  In particular, note that HTTP/1.1 (which is all that I looked at) does NOT support the Content-Transfer-Encoding MIME header.  HTTP is 8-bit clean.  The various sections that I reference below lay out the thinking in RFC 2616 with respect to MIME interoperability.  So, I think that the ONLY option here is 8-bit.

My original message was:

A few points:

- HTTP is 8-bit clean and does not support the MIME indicators for
  Content-Transfer-Encoding.

- You can use regular MIME with HTTP.  This is a cool option, but
  the support is not as pervasive as I would like.  (I've been
  looking at this in the context of the Java Mail API, which provides
  MIME support).  Among other things, this lets you get digitally
  signed messages (S/MIME) pretty much for free.  And, of course,
  you get the multipart MIME support as well.

- HTTP does provide for the client to negotiate compression of the
  entity body (Content-Encoding).  The client can request, e.g.,
  a gzip encoding of the entity using "Accept-Encoding: gzip".  See
  sections 3.5, 14.3, and 14.11 of RFC 2616 (HTTP/1.1).

See section 19.4 of RFC 2616 for an overview of the differences between
HTTP entities and MIME entities.
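A small sketch of the gzip content coding applied to an entity body, using Python's standard gzip module. The gzip_entity helper is a made-up name, and whether a given server accepts a compressed request body is an assumption that would have to be checked or negotiated.

```python
import gzip


def gzip_entity(body: bytes):
    """Compress an entity body and return the matching headers with it."""
    compressed = gzip.compress(body)
    headers = {
        "Content-Encoding": "gzip",
        # Content-Length describes the bytes on the wire, i.e. after coding.
        "Content-Length": str(len(compressed)),
    }
    return headers, compressed
```

The receiver reverses the coding with gzip.decompress before handing the body to whatever parses the declared Content-Type.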

-bryan

Posted by Bryan Thompson at

eric scheid:

what happens if the upload is prematurely terminated?

Eric, I can interpret your question in two ways: how does the server know?  What should the server do?  I'll choose to answer the first by quoting from RFC 2046, section 5.1.1:

The boundary delimiter line following the last body part is a distinguished delimiter that indicates that no further body parts will follow.  Such a delimiter line is identical to the previous delimiter lines, with the addition of two more hyphens after the boundary parameter value.
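A minimal sketch of that check: scan the received body for a line consisting of the boundary wrapped in two hyphens on each side. The function name and the b1 boundary are invented for this example, and a real MIME parser would track part structure rather than scanning lines.

```python
def has_closing_delimiter(body: bytes, boundary: str) -> bool:
    """Report whether the body contains the final '--boundary--' line."""
    closing = b"--" + boundary.encode("ascii") + b"--"
    for line in body.splitlines():
        # RFC 2046 permits trailing whitespace after the delimiter.
        if line.rstrip(b" \t") == closing:
            return True
    return False


# A complete two-part body, and one cut off exactly at a part boundary.
COMPLETE = b"--b1\r\n\r\none\r\n--b1\r\n\r\ntwo\r\n--b1--\r\n"
TRUNCATED = b"--b1\r\n\r\none\r\n--b1\r\n\r\ntwo\r\n--b1\r\n"
```

An upload that drops exactly at a part boundary ends with an ordinary delimiter line, not the closing one, so the server can tell that more parts were expected.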

John Panzer:

this does not obviously allow for additional per-resource Atom-based XML metadata.

In theory, this proposal and Tim's are orthogonal.  Everything that can be included in an application/atom+xml body can be included in a MIME body.  Presumably, even multiple application/atom+xml messages can be included.  From this perspective, this proposal is simply a way of packaging up multiple requests.

Bryan Thompson:

In particular, note that HTTP/1.1 (which is all that I looked at) does NOT support the Content-Transfer-Encoding MIME header.

HTTP/1.1 headers look a lot like MIME headers by design.  However, HTTP/1.1 headers are to be found before the first blank line; everything after that line is defined by the content-type.  In other words, transferring MIME messages (including Content-Transfer-Encoding) over HTTP is allowed.  In fact, SOAP Messages with Attachments does exactly that.

Posted by Sam Ruby at

I think this is a valid example, but I'll note that both images are displayed by reference. This doesn't really address how one would transfer an entry with a content type of image/jpeg.

Posted by Robert Sayre at

Also, Content-Length is required in HTTP 1.0, but not HTTP 1.1.

Posted by Robert Sayre at

I thought Content-Length was required unless you are using a transfer-coding.

Posted by ed costello at

Which, as I re-read the examples, is covered by using Content-Transfer-Encoding.

Posted by ed costello at

A minor issue might be the need for a MIME parser/encoder in addition to an XML library

I think this is an über-major issue. I think this proposal is several orders of magnitude more difficult to comprehend than the original pure XML version, and I can't say I like it much. Sorry.

I thought Atom was going to be a pure XML application in all aspects, mostly because XML is easy. MIME is an incredibly difficult standard in comparison, and has only a fraction of the tools that XML has. It's much easier for people to build their own pseudo-MIME handler on top of XML than it is for them to understand the basic (and ugly) MIME standard.

There might be libraries for MIME handling on most web server platforms, but they are not as easily accessible as XML libraries.

Posted by Asbjørn Ulsberg at
