Should we extend HTTP?

In summary: probably not (at least right now). Instead, let's look at an AggregatorApi to do the same job.


[MartinAtkins : RefactorOk] Can we perhaps make an optional addition to the HTTP requests made by aggregators to indicate that an aggregator has current data as of a particular date? Then only new/changed entries can be delivered.

The header, which could be called something like 'X-Last-Polled', would be optional both for the client to send and for the server to honour. Small sites may wish to trade bandwidth for the reduced CPU utilization of serving the feed as a static file directly from disk or memory.

X-Last-Polled is the same as If-Modified-Since in syntax, but is a request rather than a question. Hopefully everyone can see why If-Modified-Since is not appropriate for this purpose.
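To make the proposal concrete, here is a minimal sketch of how a server might honour the header. Everything in it is an assumption for illustration: the function name `entries_since`, the plain dict of request headers, and the `updated` field on each entry are hypothetical, and the header is parsed as an ordinary HTTP date because it is meant to share If-Modified-Since syntax. A server that does not want to bother simply ignores the header and sends the whole feed.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def entries_since(entries, headers):
    """Return only entries updated after X-Last-Polled.

    Falls back to the full list when the header is absent or unparseable,
    since honouring it is optional for the server.
    """
    raw = headers.get("X-Last-Polled")  # hypothetical header from this proposal
    if raw is None:
        return list(entries)
    try:
        last_polled = parsedate_to_datetime(raw)
    except (TypeError, ValueError):
        # Unusable date: behave as if the header had not been sent.
        return list(entries)
    if last_polled.tzinfo is None:
        # Treat a date without a usable zone as UTC (an assumption).
        last_polled = last_polled.replace(tzinfo=timezone.utc)
    return [e for e in entries if e["updated"] > last_polled]


feed = [
    {"id": "a", "updated": datetime(2003, 10, 1, tzinfo=timezone.utc)},
    {"id": "b", "updated": datetime(2003, 11, 5, tzinfo=timezone.utc)},
]
changed = entries_since(feed, {"X-Last-Polled": "Sat, 01 Nov 2003 12:00:00 GMT"})
```

With the sample data above, only entry "b" is newer than the polled date, so it is the only one delivered.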

Something like this will have to be standardized, even if only as a "best practice", or else aggregators will start trying to do it themselves in incompatible ways and we'll end up having to send five different headers.


[MartinAtkins : RefactorOk] I originally considered that If-Modified-Since would work for this, but then realised that there are some users of RSS (and thus, ultimately, Atom) feeds which don't make any effort to cache individual entries locally. Instead, they pull down the feed, transform it into something else (usually HTML) and that's the only data they keep. When they request again, they politely use the last-modified time on their HTML file in the If-Modified-Since header, and if they get nothing new back (a 304 Not Modified) they just leave the file as it is and wait until next time. If they do get a feed back, they replace their HTML file with the new data. If the server treats If-Modified-Since as though it were X-Last-Polled, that data will at worst be blank and at best contain only new stuff, so anything delivered in earlier polls is lost from the page.
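The failure described here can be sketched in a few lines. This is a toy model, not any real aggregator: `render_html` and `naive_refresh` are hypothetical names, entries are just title strings, and `None` stands in for a 304 Not Modified response.

```python
def render_html(entries):
    """Turn a list of entry titles into the only thing the client keeps."""
    return "\n".join(f"<li>{title}</li>" for title in entries)


def naive_refresh(current_html, response_entries):
    """Client that stores rendered output only, never individual entries."""
    if response_entries is None:  # stands in for 304 Not Modified
        return current_html
    # Any feed body replaces the old page wholesale.
    return render_html(response_entries)


# First poll: the server sends the full feed.
page = naive_refresh("", ["Entry 1", "Entry 2"])

# Second poll: the server treats If-Modified-Since as "send only what is new".
page = naive_refresh(page, ["Entry 3"])
```

After the second poll the page contains only "Entry 3"; the two earlier entries are gone, which is why the delta behaviour needs its own opt-in header rather than piggybacking on If-Modified-Since.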

This may well cause problems for some proxies. However, some sites are already beefed up enough to be able to deal with bypassing proxies; some, for example, always bypass proxies because the responses they generate are dependent on who is making the request. Assuming my implementation were to be used, it would have to be specified that servers MUST use Cache-Control: private when honouring X-Last-Polled. That is, unless it's valid to put X-Last-Polled in the Vary header -- I can't remember how exactly Vary is specified. (It's probably best not to rely on it anyway, as there are plenty of dodgy proxies out there.)
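As a rough sketch of the caching rule suggested above, a server could mark only the filtered responses as private. The function name `response_headers` and the dict-based headers are hypothetical. Vary takes a list of request header names, so naming X-Last-Polled there should be syntactically acceptable, but it is included here only as an extra hint alongside Cache-Control: private, not as a replacement for it.

```python
def response_headers(request_headers):
    """Build response headers, marking filtered responses as uncacheable
    by shared caches."""
    headers = {"Content-Type": "application/xml"}
    if "X-Last-Polled" in request_headers:  # hypothetical header from this proposal
        # The body depends on the client's polled date, so shared proxies
        # must not serve it to anyone else.
        headers["Cache-Control"] = "private"
        # Extra hint for caches that understand Vary; not relied upon.
        headers["Vary"] = "X-Last-Polled"
    return headers


filtered = response_headers({"X-Last-Polled": "Sat, 01 Nov 2003 12:00:00 GMT"})
plain = response_headers({})
```

Requests without the header get the ordinary, cacheable response, so small sites serving a static file are unaffected.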

This is not really in the spirit of HTTP, but the benefits of including this functionality in some form are at least twofold:

[JamesAylett RefactorOk DeleteOk] (Paraphrased from discussion now moved to AggregatorApi) Are you happy to drop the proposal to extend HTTP for the purpose of feed transfer in favour of concentrating on an AggregatorApi?

[AsbjornUlsberg] Why not just use PUSH instead of PULL?

CategoryApi, CategoryArchitecture