Every Echo entry needs two identifiers, which we'll call, for lack of better names, 'post-id' and 'perma-link'. They need to be separate, and they need to be required.
Before we start justifying, we need some definitions:
perma-link: A URI that points to the post on the web. That needs some clarification. First, [URI] is a big concept that subsumes many other things. For example, all URLs are URIs, which means links of the form http:, ftp:, mailto:, and freenet: are all URIs. URNs are URIs as well. Second, the perma-link should point to the story, not the source. For example, if you write a weblog entry about a story in the NYTimes, the perma-link needs to point to that entry on your weblog and not to the story in the NYTimes. The perma-link should be dereferenceable (for example, an http: URI). It may be non-dereferenceable, though that is strongly discouraged.
post-id: An identifier that uniquely identifies the post on the web. Again, that needs some clarification. If you write a weblog entry about a story in the NYTimes, and post it to your weblog under two categories, the post-id will be the same regardless of which category it is published in. Also, the post-id is unique among all the Echo entries ever published, by anyone on the web, for all time. Once an item is published, its post-id never changes. If you edit your entry, the post-id does not change. If you re-categorize your post, it does not change. Unique across space and time. What if you want to include a link to the source material? That is another Echo tag, in the optional Echo module Related, which allows for citing multiple sources.
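As a rough illustration of the post-id definition, here is a minimal Python sketch of a CMS that mints the identifier once at publication and never touches it again. The `Entry` class, its field names, and the example.org URLs are hypothetical, and the choice of a random `urn:uuid:` value is an assumption made for the example. Echo does not prescribe it, and a random UUID is unique only with overwhelming probability, not by guarantee.

```python
import uuid
from dataclasses import dataclass, field


def mint_post_id() -> str:
    """Return a fresh identifier in URN form, e.g. 'urn:uuid:...'."""
    return uuid.uuid4().urn


@dataclass
class Entry:
    """Hypothetical CMS record; not part of any Echo specification."""
    perma_link: str
    title: str
    # Assigned exactly once, when the entry is created.
    post_id: str = field(default_factory=mint_post_id)

    def edit(self, new_title: str) -> None:
        # Editing changes content only; the post-id is left alone.
        self.title = new_title

    def recategorize(self, new_perma_link: str) -> None:
        # A new category may mean a new perma-link, but the same post-id.
        self.perma_link = new_perma_link


entry = Entry("http://example.org/science/42", "First draft")
original_id = entry.post_id
entry.edit("Second draft")
entry.recategorize("http://example.org/technology/42")
print(entry.post_id == original_id)
```

The point of the sketch is only that nothing after creation writes to `post_id`, while both the content and the perma-link remain free to change.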
A required perma-link
Perma-link should be required. This is a syndication format, and the perma-link points back to the thing you are really interested in. The only excuse for not being able to supply a perma-link is that the resource you are describing is not on the web. That's a pretty thin excuse, but for those extremely rare cases, you can stuff a URN or some other non-dereferenceable URI in this field. But really, if you can generate an XML Echo file that lives on the web and describes your resource, do you have any excuse for not providing an HTML view of that same data?
[GrantCarpenter] Just for random expansion, this is a plausible edge case that illustrates a type of scenario where a resource may not be url accessible. Not necessarily a web log, maybe out of scope, but if it's a good use of RSS, maybe that (in and of itself) means something. Greg Reinacker doing VSS feeds
[GregReinacker] Totally agree - I brought this up on Sam's weblog, but Sam thought the permalink should just be a non-resolvable URN in this case. I think doing that greatly increases difficulty for the feed consumer. My current thinking is that I'd rather see permalink be optional, but postId remain required.
A required post-id
Now that you have a required perma-link, do you really need a post-id? This is where I need to show two things.
While a perma-link is a URI, it may not uniquely identify a weblog entry.
A method to uniquely identify a weblog entry is necessary.
The first one is easy if you consider categories. For example, I subscribe to the NYTimes RSS feeds, both science and technology. There is overlap: some stories appear in both feeds, which means that they show up twice in my aggregator. Similarly, MT users can turn on multiple archiving methods, which means that the same story can have multiple URIs. For each archiving method, the story is the same but sits in a different context. It can sit in a weekly archive, a monthly archive, or in multiple category archives.
But if they are the same story, won't they have the same perma-link? No, the perma-link may point back to the story based on the context. For example, if you are subscribed to an Echo feed that contains just posts from a certain category, the perma-link could bring you to a page that contains just posts from that category, and that's what you want to happen. So it is possible that the same story could have multiple perma-links and that those perma-links show up in different Echo feeds.
So that leaves the last question: do you really need a unique identifier? Yes, because this will allow aggregator builders to track posts and allow the end-user to control whether they see the same item if it appears in multiple contexts. It will also allow aggregators to implement new functionality more easily and consistently. For example, with a guaranteed unique id I can track changes to an entry, possibly highlighting differences between versions. I can also do threading more easily and consistently if each entry has a unique id. I can group Echo entries that are all about the same thing.
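To make the aggregator argument concrete, here is a minimal Python sketch of grouping items from several feeds by post-id, so that a story published in two categories is recognized as one entry with two contexts. The `FeedItem` type, the `merge_feeds` helper, and the example identifiers and URLs are all hypothetical; a real aggregator would parse them out of the feed documents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedItem:
    """Hypothetical parsed entry; field names are illustrative only."""
    post_id: str
    perma_link: str
    title: str


def merge_feeds(*feeds):
    """Group items from all feeds by post-id.

    Returns a dict mapping each post-id to the list of items that
    carried it, one per context (feed) the story appeared in.
    """
    by_id = {}
    for feed in feeds:
        for item in feed:
            by_id.setdefault(item.post_id, []).append(item)
    return by_id


science = [
    FeedItem("urn:example:story-1", "http://example.org/science/1", "Shared story"),
    FeedItem("urn:example:story-2", "http://example.org/science/2", "Science only"),
]
technology = [
    FeedItem("urn:example:story-1", "http://example.org/technology/1", "Shared story"),
]

merged = merge_feeds(science, technology)
for post_id, items in merged.items():
    # Show each story once, with every perma-link it was seen under.
    print(post_id, [item.perma_link for item in items])
```

Keying on the perma-link instead would treat the shared story as two unrelated items, which is exactly the duplicate described above. Whether to display one copy or all of them then becomes a choice the end-user can make.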
On the CMS vendor side, some need a unique id to track items, and the post-id, particularly in the form of a URN, gives them a place to store that information in an easy-to-parse format.
Echo needs both a required perma-link and a required post-id. Since both are URIs, if the posts from your CMS only have one URL, then just set post-id = perma-link. Sure, it's a little redundant, but it's easy to implement. If you have content that isn't on the web, then use a URN for perma-link, but think long and hard about justifying what should be an extremely rare situation. Each identifier supplies potentially unique information: the perma-link preserves the context of the weblog entry, while the post-id is the same regardless of the context.
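The "just set post-id = perma-link" rule for a CMS with a single URL per post can be sketched in a few lines of Python. The function name `complete_identifiers` and the example values are hypothetical, not part of Echo.

```python
def complete_identifiers(perma_link, post_id=None):
    """Return a (perma_link, post_id) pair with both values present.

    If the CMS has no separate identifier for the post, the perma-link
    doubles as the post-id. A missing perma-link is treated as an error,
    since the proposal makes it required.
    """
    if not perma_link:
        raise ValueError("perma-link is required")
    return perma_link, post_id or perma_link


# A CMS with only one URL per post: the two identifiers coincide.
print(complete_identifiers("http://example.org/archive/42"))

# A CMS that tracks its own id keeps the two distinct.
print(complete_identifiers("http://example.org/science/42", "urn:example:42"))
```

This fallback is only safe when the perma-link itself never changes and never varies by context; otherwise a separately minted post-id is needed.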
[MichaelBernstein] Joe, it seems to me that you could also cover all the edge cases by making both optional, but requiring an entry to have at a minimum a permalink or a unique identifier, and allowing an entry to have both if necessary. IOW, an entry can have:
only a permalink
only a unique identifier
both a unique identifier and a permalink
[MichaelBernstein] This assumes that in the absence of a unique identifier, the permalink can be used for this purpose.
[ChrisDent] I think it's bad news to conflate addresses with identifiers. It's relatively simple to create a meaningless, unique, persistent identifier for an entry and that returns large value. Making clients have to think about whether a permalink "can be used for" a purpose introduces some guesswork that breaks some of that value. Consider Message-IDs in emails or news postings, especially news postings. With those simple bits of info we get functional crossposting and, thank the good lord Pete, easily avoid re-reading crap we've seen before.
[KenMacLeod] I believe a large part of the confusion is that "permalink" is defined as "A URI that points to the post on the web" (emphasis added). As in, "there can be many permalinks" such that "a permalink" can't be used as an identifier. If, instead, we were to say that there was one "canonical reference" (identifier), then we could have "can also be found at" for categories and "internal identifier" for post-ids. See EntryIdentifier. Precedence rules would also allow for the principal identifier to actually not be retrievable, for those instances where there is no "canonical reference", in which case any one of the listed "also-ats" would be used.