It’s just data

Trackback authentication

Jacques Distler: The anonymous nature of the internet makes the problem of “identity” a hard one. In physics, when we encounter an intractably-hard problem, our most frequent dodge is to redefine the problem to one which admits a solution, and hope that the result is a “good-enough” stand-in for the original problem. In that spirit, I (re)defined the problem as reliably associating comments posted with the websites of the commenters.

Just a suggestion: a lesser, but very much related and much more tractable, problem is trackbacks.  The reason why it is more tractable is that the trackbacks are issued by software which could reasonably be expected to have direct access to your weblog's private keys.  This could make signing totally automatic - simply check a box once, and your template could be updated and all future trackbacks would be automatically signed.

The signatures could be passed as a new CGI parameter or as an HTTP header.  Neither would be likely to affect existing software that wasn't expecting this information.
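As a rough illustration of the header option, here is a minimal Python sketch of a Trackback ping that carries a signature alongside the usual form-encoded fields. The header name `X-Trackback-Signature`, the endpoint URL, and the `placeholder_signer` function are all assumptions made up for this example; the placeholder produces a bare digest rather than a real signature, and a real deployment would swap in a detached PGP signature made with the weblog's dedicated key.

```python
import hashlib
from typing import Callable, Dict
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical header name; nothing in the Trackback spec defines it.
SIGNATURE_HEADER = "X-Trackback-Signature"


def build_signed_ping(trackback_url: str, fields: Dict[str, str],
                      sign: Callable[[bytes], str]) -> Request:
    """Build a Trackback POST whose body is signed by the caller's signer."""
    # Sort the fields so sender and receiver sign and verify identical bytes.
    body = urlencode(sorted(fields.items())).encode("utf-8")
    request = Request(trackback_url, data=body, method="POST")
    request.add_header("Content-Type",
                       "application/x-www-form-urlencoded; charset=utf-8")
    # Receivers that do not know this header can simply ignore it.
    request.add_header(SIGNATURE_HEADER, sign(body))
    return request


def placeholder_signer(body: bytes) -> str:
    """Stand-in only: a bare digest, NOT a signature. A real signer would
    return a detached PGP signature over the same bytes."""
    return hashlib.sha256(body).hexdigest()


ping = build_signed_ping(
    "http://example.org/trackback/42",  # hypothetical endpoint
    {"url": "http://example.com/post", "title": "A reply",
     "excerpt": "Some thoughts on signing.", "blog_name": "Example"},
    placeholder_signer,
)
```

The request is only built here, not sent. Signing the exact bytes of the body keeps the existing Trackback parameters untouched, which is what lets software that ignores the extra header keep working.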

Once trackback signing is widely enough adopted, people may feel comfortable turning off the ability to accept unsigned trackbacks.  And then much of the infrastructure will be in place to tackle the harder, and more important problem, of comment signing.

The key nut to crack there is to make it easy and painless to sign a comment.


I use Movable Type for my blog.  As I'm sure everybody here knows, it's a 3rd party app.  And while I do have the source, I don't have time to review it all.  So there's no way I'm giving it access to my private keys.

This seems like a good idea, but I don't trust my weblog software.  Perhaps if I had written it myself, I'd feel differently.

Posted by Scott Johnson at

Like Scott, I am a bit wary about letting MT have my private key. But my objection is not that MT is a third-party tool. Even the GPG and PGP tools are third party. I am pretty sure not many people have gone through their code. That doesn't prevent one from trusting them.

My objection is with the practice of putting any of my private keys on a publicly accessible server.

I know someone will jump up and say that I could create a new key pair just for use by MT. That still doesn't solve the problem that a private key is residing on a very insecure (by nature) system.

Posted by Srijith at

C'mon guys! You don't use your personal key for this purpose.

You create a new key-pair with the express purpose of authenticating trackbacks.

Indeed, since the secret key has to be left on the server un-password-protected, you'd bloody-well better not use an "important" key for the purpose.

As to the "it's on a public webserver, ergo it's insecure." Have you thought about how HTTPS works? The SSL private key is unencrypted on the server. Have you thought about how SSH works? The server's private host key is unencrypted on the server.

This is a ubiquitous situation. Proper care needs to be taken, but having the private key unencrypted on the server is not a-priori insecure.

Posted by Jacques Distler at

Of course, if you think about it for a second, there's no reason the secret key could not reside, encrypted, on the server. Your weblogging software could prompt you for the password to unlock the secret key, and create the trackback signature when you prepare your post.

Of course, the irony of this discussion is that 99% of MovableType users  log into their blog by sending a password unencrypted over the 'net. Worrying about whether the server's secret key is encrypted or not seems pretty silly when you're sending the administrative password in the clear over the network.

Posted by Jacques Distler at

The key nut to crack there is to make it easy and painless to sign a comment.

It's not particularly hard, using GPGDropThing (for MacOSX) or GPGShell (for Windows).

More relevant, I think, is that a very small proportion of people have PGP keys, and the rest are not about to go to the trouble of installing PGP/GnuPG, and creating themselves a key-pair, just so they can sign their comments on my, and Phil Ringnalda's and Krishnan Srijith's and Urs Schreiber et al's blogs.

After all, the protection that PGP-signing affords the commenter only obtains if most of the blogs he comments on allow PGP-signed comments.

We've ... umh ... got a ways to go before that happens.

Posted by Jacques Distler at

Trackback authentication based on PGP signatures has more potential, and is more feasible, than comment authentication, because,

[link]

Posted by Zhang Yining at

Coverage of PGP commenting idea

Some good coverage and discussion on the idea of PGP signing comment posts: PGP-Signed Comments - A good introduction by Jacques Distler on why comments should be signed. Notes...... [more]

Trackback from TriNetre - The Third Eye

at

Signing Trackbacks isn’t any good unless the signatures are verified. In order to verify the signatures using Distler’s method, the server has to initiate HTTP GETs to arbitrary servers. And still, if the signer hasn’t been seen before, all you know is that the Trackback is signed and the signature is related to a site. You still don’t know that the alleged linking resource links to you, is not spam, has the alleged title and contains text that resembles the extracted quote.

But once you accept that you have to make a connection when you receive a Trackback, you might as well accept Pingback and retrieve the alleged linking resource in order to check that it actually links to your page, to find out the real title of the linking page, and perhaps even to extract a quote. Pingback “only” provides a URL, but what good is the additional Trackback payload when it can’t be trusted and you don’t know the character encoding of the text? With Pingback the autodiscovery is much cleaner than with Trackback, so if the protocol is changed, it makes more sense to build on Pingback.

It does not really matter whether the Pingback originated from the linking content management system or was sent by a third party. If the other page links to yours, what you have is a more useful connection than what you get from a Trackback that is signed but contains a fake title and extract and refers to a resource that does not actually link to your page.
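The check described here, that the alleged linking resource really links to your page, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `LinkCollector` and `page_links_to` are invented for the example, fetching the page is left out, and URLs are compared after resolving relative links and dropping fragments, with no further normalization.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute target of every <a href> in a page."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links and drop any #fragment.
            absolute, _fragment = urldefrag(urljoin(self.base_url, href))
            self.links.add(absolute)


def page_links_to(html: str, source_url: str, target_url: str) -> bool:
    """Return True if the page at source_url contains a link to target_url."""
    collector = LinkCollector(source_url)
    collector.feed(html)
    return urldefrag(target_url)[0] in collector.links
```

A passing check says only that a link exists in the markup that was served; it says nothing about whether the linking page is relevant.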

Posted by Henri Sivonen at

I'm not sure I understand Henri's objection, but, then, I'm not sure I understand the point of signed trackbacks. Let me posit 4 scenarios, and then we can ask which of pingbacks, conventional (unsigned) trackbacks and signed trackbacks are applicable.

1) X links to your post, and wishes to inform you of that.

2) X links to your post, and a 3rd party wishes to inform you of that.

3) X has a discussion that is relevant to your post, but no explicit link, and wishes to inform you of that.

4) X has a discussion that is relevant to your post, but no explicit link, and a 3rd party wishes to inform you of that.

All three protocols would work in case 1). Pingbacks and conventional trackbacks would work in case 2), but signed trackbacks would not. Pingbacks would not work in cases 3,4). Signed trackbacks would work in case 3); conventional trackbacks would work in either.
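The paragraph above can be written down as a small lookup table. This Python sketch only restates it; the one entry not spelled out in the text is scenario 4 for signed trackbacks, which is marked as not applicable on the assumption that a third party cannot sign on X's behalf.

```python
# Rows are the four scenarios; each value says whether the protocol applies.
APPLICABLE = {
    1: {"pingback": True, "trackback": True, "signed trackback": True},
    2: {"pingback": True, "trackback": True, "signed trackback": False},
    3: {"pingback": False, "trackback": True, "signed trackback": True},
    # Assumption: a third party cannot produce X's signature.
    4: {"pingback": False, "trackback": True, "signed trackback": False},
}


def protocols_for(scenario: int) -> list:
    """Return the names of the protocols that apply to a scenario."""
    return sorted(name for name, ok in APPLICABLE[scenario].items() if ok)


for scenario in sorted(APPLICABLE):
    print(scenario, ", ".join(protocols_for(scenario)))
```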

What signed trackbacks tell you is that "X sent the trackback". But, presumably, what you really want to know is whether the discussion at X is relevant, and whether the trackback "excerpt" is representative of that discussion. Neither of these is guaranteed by the fact that X signed the trackback.

If, for instance, you are worried about trackback spam ("X" is the spammer's website), then just because the spammer signed the trackback does not make it any more relevant.

Posted by Jacques Distler at

Jacques: (3) is isomorphic to somebody placing a comment on your blog, with the exception that you are provided with the blog's name instead of the author's name.  Here's a mapping (look for table 1).

Henri apparently is choosing to focus on (1), and noting that if you need to fetch the page anyway in order to verify the link, why bother passing that information on the trackback?  However, it is worth noting that another piece of information, namely the comment itself (a.k.a., excerpt), is also missing.  Excerpts are extremely difficult to reverse engineer from an HTML page.

Posted by Sam Ruby at

It is always possible for someone to leave a comment on your blog, saying "X discusses this, too." and provide a link.
[That's assuming you haven't disabled comments (on that entry). I've seen many a blog entry with the statement: "Comments closed; if you want to comment, send me a Trackback."]

Cases 3,4) are really no different from 1,2) in this regard. Either the author, or a third party could leave such a comment.

What distinguishes case 1) is Trackback autodiscovery, which (as Phil Ringnalda has often complained) turns Trackbacks into something closely resembling Pingbacks.

What makes a difference though, is, as you say, the Trackback excerpt.

The only scenario I can see where signed trackbacks would be useful is in preventing a 3rd party from sending a trackback with an obnoxious "excerpt". The author of X can, of course, do whatever he wants, signatures or no signatures.

Posted by Jacques Distler at

I was indeed focusing on scenario (1), because I think (3) and (4) are more theoretical than practical and significantly complicate the problem space. Scenario (3) can easily be reduced to scenario (1) if X can come up with a pretext for linking to your page. When the Trackback is legitimate, finding such a pretext to link isn’t too hard. It’s even easier to link and let the CMS ping automatically than not to link and ping manually. I think scenario (4) isn’t particularly likely to come up in practice and I think scenario (4) is too prone to spam. The rest of my reasoning led to a situation where the recipient doesn’t need to distinguish between scenarios (1) and (2).

But, presumably, what you really want to know is whether the discussion at X is relevant, and whether the trackback "excerpt" is representative of that discussion. Neither of these is guaranteed by the fact that X signed the trackback.

Exactly. And when the title and excerpt are not trusted, it makes more sense to use Pingback, because Pingback has better autodiscovery.

However, it is worth noting that another piece of information, namely the comment itself (a.k.a., excerpt), is also missing.  Excerpts are extremely difficult to reverse engineer from an HTML page.

Getting an excerpt of similar quality to the excerpts sent by MT isn’t that hard if the goal posts are adjusted a little so that the excerpt is centered around the link to your page instead of being anchored to the beginning of the post. (The beginning of the post is harder to locate than the link to your page.)
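To make the idea of a link-centered excerpt concrete, here is a minimal Python sketch. It is an illustration, not anyone's actual implementation: the names `TextWithLinkMarker` and `excerpt_around_link` and the `radius` parameter are invented, the link is matched by exact `href` string only, and text inside script or style elements is not filtered out.

```python
from html.parser import HTMLParser


class TextWithLinkMarker(HTMLParser):
    """Flatten a page to text, remembering where the link to the target starts."""

    def __init__(self, target_url):
        super().__init__()
        self.target_url = target_url
        self.chunks = []
        self.length = 0
        self.link_offset = None

    def handle_starttag(self, tag, attrs):
        # Record the text offset of the first <a> whose href is the target.
        if tag == "a" and dict(attrs).get("href") == self.target_url:
            if self.link_offset is None:
                self.link_offset = self.length

    def handle_data(self, data):
        self.chunks.append(data)
        self.length += len(data)


def excerpt_around_link(html, target_url, radius=60):
    """Return up to `radius` characters of text on each side of the link,
    or None if the page does not link to target_url."""
    parser = TextWithLinkMarker(target_url)
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    if parser.link_offset is None:
        return None
    text = "".join(parser.chunks)
    start = max(0, parser.link_offset - radius)
    end = min(len(text), parser.link_offset + radius)
    # Collapse runs of whitespace left over from the markup.
    return " ".join(text[start:end].split())
```

Because the recipient picks `radius`, it also controls the excerpt length rather than settling for whatever the sender provides.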

The excerpts provided by Trackback aren’t that good. First of all, the byte sequence comes without character encoding information. Secondly, the excerpt provided by Trackback is a plain string with no markup. Thirdly, the recipient cannot control the quality of the excerpts. You have to settle for whatever excerpt length and quality the sender cares to provide.

From a Java-centric point of view, I suggest the following:

(As an improvement, instead of extracting only text content, the extraction could be taken beyond Trackback and the marked up structure could be preserved.)

Posted by Henri Sivonen at

Whenever possible, I much prefer human authored excerpts over machine authored ones.  Trackback will provide them directly (admittedly with an ambiguous encoding and format).  With Pingback, I have to rely on a more indirect means...  my approach to date has been to locate the feed associated with the page, and then identify the entry associated with that particular pingback.  I do a similar thing with referrers.

This works best if the weblog has an autodiscovery link to their feed, and if the feed provides both a summary and full content.
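The first step of that approach, finding the feed through an autodiscovery link, might look something like the following Python sketch. It is not the code linked from this post; `FeedLinkFinder` and `discover_feeds` are names made up for the example, and it assumes feeds are advertised with `rel="alternate"` and an Atom or RSS media type.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Media types treated as feeds in this sketch.
FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect the href of every <link rel="alternate"> that points at a feed."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        mime = (attributes.get("type") or "").lower()
        if "alternate" in rels and mime in FEED_TYPES and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def discover_feeds(html, page_url):
    """Return absolute feed URLs advertised by the page, in document order."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return [urljoin(page_url, href) for href in finder.hrefs]
```

Matching the pinged URL to a particular entry in the discovered feed is a separate step that this sketch does not attempt.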

Code: pingback, extractor.

Posted by Sam Ruby at

Haven't thought about this in depth, but wouldn't it be adequate for practical purposes if the target simply verified that a link was indeed made from the other end before accepting a trackback? Am I missing something?

Posted by Seb at

Two things, no, three.

One, there are perfectly legitimate uses for pings where the pinger doesn't link to the pingee. Two, now that I've seen a lot more comment spam than I had the last time we discussed this, I would get around a requirement to link by simply linking from a text-decoration: none period, or some other way of hiding a link in plain sight if you became too cunning for links from just a piece of punctuation. And third, if you want to be reassured that I link to you, well, since I control when you get the ping and what I return to you, I'll just tell the script behind my page that for the next five minutes it needs to include a link to you.

Pleeeese can I turn evil? Comment spam, crapflooding, page-widening, trackback spam, RSS aggregator exploding, it's always so much easier and more fun to be the one doing the evil, not the one trying to block it.

Posted by Phil Ringnalda at

I would get around a requirement to link by...

Which is exactly why I don't think an automated link check is going to suffice for me. I'm still going to click through to see if your page is actually relevant.
(Or, more likely, summarily delete the trackback if the URL is livenudeanything.com .)

Trackback spam is, I think, intrinsically less attractive to the spammers. But it's also harder for the blog-owner to combat.

I know it's retrograde, but separating trackbacks to a separate page (still the default in MovableType) with a nofollow directive for spiders would essentially render them ineffective.

Pleeeese can I turn evil? Comment spam, crapflooding, page-widening, trackback spam, RSS aggregator exploding, it's always so much easier and more fun to be the one doing the evil, not the one trying to block it.

Ah, but the challenge is far greater for the Good Guy. The spammers and crapflooders and other lowlifes have the advantage of playing the white pieces.

You'd bore quickly of the Dark Side.

Posted by Jacques Distler at

I know it's retrograde, but separating trackbacks to a separate page (still the default in MovableType) with a nofollow directive for spiders would essentially render them ineffective.

Rendering them ineffective does not stop them.

Compare referers and robots.txt

Posted by Sam Ruby at

Compare referers and robots.txt

I was merely suggesting that there's no pagerank boost from a link on a page with a nofollow directive (hmmm. Maybe one needs a noindex directive too.)

Rendering them ineffective does not stop them.

In the sense that spammers don't know or care whether spamming your particular blog will boost their pagerank? True, as long as spamming many or most blogs will be effective.

It would only be a deterrent if the big players like Six Apart made that the default configuration.

Posted by Jacques Distler at
