Abstract
This Pace outlines a module or extension for handling versions of entries, layered over the current core model, which presumes that a new entry with the same <id> [typically] replaces earlier versions. The only change to the current specification, which is why the proposal is "minimal", is to emphasize that a feed, index, or archive may contain more than one entry instance with the same <id>, and that consumers may choose to present or record only the latest instance with that <id>.
Status
Open
Authors: KenMacLeod, AsbjornUlsberg, ArveBersvendsen
Alternate proposals:
- PaceIssuedAndRelation -- Proposes changing the atom:entry/atom:id to indicate versions, and adding atom:entry/atom:version-of and atom:entry/atom:has-version for relating the entries.
- PaceSupersede -- Proposes changing the atom:entry/atom:id to indicate versions, and adding an atom:supersedes element for indicating the version being replaced.
Rationale
There are two views of what an entry is (from Norm Walsh):
- Entries have immutable identity. That is, an atom:id identifies an atom:entry that cannot change. Publishing a different atom:entry with the same atom:id is logically an error. In this view, an aggregator that notices an atom:entry with the same atom:id as one it's already seen can disregard it without another thought. It's either exactly the same as the one that's already been seen, or it has changes that are erroneous and it can recover from this error by ignoring the new entry.
- Entries are snapshots of an abstraction that has identity. That is, an atom:id identifies an abstraction that might or might not change. Publishing a different atom:entry with the same atom:id represents some sort of change in the underlying abstraction that's being reflected in the Atom world by posting a new atom:entry. In this view, an aggregator that notices an atom:entry with the same atom:id as one it's already seen might choose to merge them, or pick one, or pick the one with the most recent atom:modified date, or do something to "harmonize" the two entries. The only thing it would never do is present both the atom:entries to the user as if they were two different entries.
The current Atom model (format-01), as well as RSS 1.0 and RSS 2.0, is the latter.
-
There is no reason why both models cannot be satisfied with the current format.
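As a rough illustration of how one stream of entries can serve both views, the following Python sketch applies two consumer policies to the same input. The `Entry` record and the function names `first_seen_wins` and `latest_wins` are hypothetical and appear in no Atom specification; the dates are assumed to be ISO 8601 strings, so plain string comparison matches chronological order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    id: str
    modified: str  # assumed ISO 8601 date, so string order equals date order
    content: str


def first_seen_wins(entries):
    """View 1: an id is immutable, so a later entry with a known id is ignored."""
    seen = {}
    for entry in entries:
        seen.setdefault(entry.id, entry)
    return list(seen.values())


def latest_wins(entries):
    """View 2: an id names an abstraction; keep its most recently modified snapshot."""
    latest = {}
    for entry in entries:
        current = latest.get(entry.id)
        if current is None or entry.modified > current.modified:
            latest[entry.id] = entry
    return list(latest.values())


# Entries in arrival order: the older instance is seen first.
stream = [
    Entry("urn:myentry", "2004-07-30", "Hello"),
    Entry("urn:myentry", "2004-07-31", "Hello There"),
]

print(first_seen_wins(stream)[0].content)
print(latest_wins(stream)[0].content)
```

Neither policy requires the publisher to change what it emits; the difference lies entirely in how the consumer treats a repeated <id>.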
The crux of the issue appears to be a combination of what <id> is identifying and whether or not an entry (or resource in general) can have more than one identifier.
The current model, and that of RSS 1.0 and RSS 2.0, is Norm's latter: <id> identifies the abstraction of "an entry that may have multiple versions". It's not really an abstraction, because aggregators concretely replace all or part of the entry's previously stored contents when a new entry with the same <id> arrives.
There may exist another identifier that can identify a specific instance or version of "an entry that may have multiple versions". Atom does not specify where this other identifier is stored, and given all of the possible means for associating these other identifiers with "replaces" or "updates", etc., it makes sense for this other identifier and its corresponding relations to be specified in a "module" or "extension". This works exceedingly well over the core Atom model because the clients not interested in "managed versions" see only streams of new versions of the same entry.
There is a lesser issue of whether "the identifier" can be "parsed" so that it can serve both purposes simultaneously, both the identifier for "an entry that may have multiple versions" and an instance or version of that entry. Since only a small number of URI schemes have explicit support for versioning, it would be preferable for the format to have two separate locations for the two different identifiers.
Therefore, this Pace proposes using an Atom module or extension to provide versioning that layers over the core model of a stream of entries with the same <id> that are different instances or versions of the same entry.
The FeedValidator currently reports a feed with multiple entries with the same <id> as invalid, but the format-01 spec contains no assertion to that effect; it does say clearly that "same <id> is same entry". This proposal clarifies the assertion that should be tested.
<feed>
  <entry>
    <id>urn:myentry</id>
    <version:id>urn:myentry:2</version:id>
    <modified>2004-07-31</modified>
    <content>Hello There</content>
  </entry>
  <entry>
    <id>urn:myentry</id>
    <version:id>urn:myentry:1</version:id>
    <modified>2004-07-30</modified>
    <content>Hello</content>
  </entry>
</feed>
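A version-aware consumer might read such a feed roughly as follows. This is a minimal sketch, not a definitive implementation: the example above does not declare the `version` prefix, so the namespace URI `http://example.org/ns/atom-version` is a placeholder assumed here only to make the XML parseable, and the function name `versions_by_entry` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical namespace URI; the Pace does not define one for the module.
VERSION_NS = "http://example.org/ns/atom-version"

FEED = f"""
<feed xmlns:version="{VERSION_NS}">
  <entry>
    <id>urn:myentry</id>
    <version:id>urn:myentry:2</version:id>
    <modified>2004-07-31</modified>
    <content>Hello There</content>
  </entry>
  <entry>
    <id>urn:myentry</id>
    <version:id>urn:myentry:1</version:id>
    <modified>2004-07-30</modified>
    <content>Hello</content>
  </entry>
</feed>
"""


def versions_by_entry(feed_xml):
    """Group entry instances by <id>, oldest first within each group."""
    root = ET.fromstring(feed_xml)
    history = defaultdict(list)
    for entry in root.findall("entry"):
        entry_id = entry.findtext("id")
        version_id = entry.findtext(f"{{{VERSION_NS}}}id")
        modified = entry.findtext("modified")
        history[entry_id].append((modified, version_id))
    # Sorting the (modified, version id) pairs orders each history by date.
    return {eid: sorted(versions) for eid, versions in history.items()}


history = versions_by_entry(FEED)
for entry_id, versions in history.items():
    print(entry_id, versions)
```

A consumer that ignores the module never looks up the `version:id` element and sees two instances of `urn:myentry`, which it can collapse to the latest one.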
Proposal
In Section 5.5, atom:id Element, add:
-
Multiple instances or versions of the same entry MAY appear in a feed, index, or archive. Consumers MAY choose to present or record only the instance or version of the entry with the latest [[date that indicates change per other discussion, which is <modified> in format-01]].
This Pace is consistent with PaceBasicAtomID.
Impacts
Consumers must recognize multiple entries with the same ID within the same feed instance and properly process them (such as picking the latest).
Notes
- atom-syntax, Identity conundrum
- atom-syntax, PaceBasicAtomID -- EliasTorres
- atom-syntax, PaceBasicAtomID -- GrahamParks
For some reason, the modeling of versions seems also to be tightly bound to concurrent date discussions,
- atom-syntax, Propose partial consensus on dates -- KenMacLeod, see earlier messages in this thread for discussion of revisioning.
and URIs,
- atom-syntax, (followup thread:) URI scheme delusions -- EliasTorres