Joi Ito: Comment spam is becoming more "sophisticated".
Originally, my policy was to erase stuff that linked to commercial
sites if they didn't add to the dialog in the comments. Now comment
spammers are actually trying to contribute to the discussion, but
still leaving links to their commercial sites. It is much harder to
identify as spam. Only by looking at the site that is linked do you
realize that it's probably spam.
My suggestion: take a look at the referrer. If they came
in from google, and via a query such as "blog comments" or
"remember info" or "posted by" (all of which I have received within
the past 72 hours), then they are most likely a spammer.
Unfortunately, few blogging tools present referrer information at
this granularity.
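
A minimal sketch of that referrer check, assuming the blogging tool hands you the raw Referer header as a string. The function name, the phrase list, and the assumption that the search terms arrive in a "q" query parameter are illustrative choices, not something any particular blogging tool provides.

```python
from urllib.parse import urlparse, parse_qs

# Phrases taken from the queries mentioned above; extend as needed.
SUSPICIOUS_QUERIES = ("blog comments", "remember info", "posted by")


def is_suspicious_referrer(referrer, phrases=SUSPICIOUS_QUERIES):
    """Return True if the referrer looks like a Google search for comment forms.

    Assumption: the search terms are in the "q" query parameter.
    """
    parsed = urlparse(referrer or "")
    host = (parsed.hostname or "").lower()
    # Accept www.google.com, google.co.uk, etc. by looking for a "google" label.
    if "google" not in host.split("."):
        return False
    # parse_qs decodes "+" and percent-escapes for us.
    query = " ".join(parse_qs(parsed.query).get("q", [])).lower()
    return any(phrase in query for phrase in phrases)
```

A hit here is only a hint that the visitor went looking for comment forms. It would need to be weighed against the comment itself before deleting anything.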
Such spammers are what I will call "hit and run artists".
This means that a real live person is at a keyboard. Generally
their aim is to game google. Some actually spend
the time to appear on topic. I am trying to work out the
heuristics for identifying such people.
Anybody want to contribute some anecdotes? How would you
complete this sentence...
You post template-looking messages, neutral to the actual content, such as "Great site, thanks" or "I was looking for such info long time!" That's hard to detect, but such comments usually don't come alone, so several template-looking comments from the same IP/name in a short time are definitely spam.
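
A rough sketch of that idea, assuming comments are available as (ip, timestamp in seconds, text) tuples. The phrase list, the 12-word cutoff, the 10-minute window, and the threshold of three are all made-up values for illustration and would need tuning against real comments.

```python
from collections import defaultdict

# Hypothetical examples of content-neutral phrases; not an exhaustive list.
TEMPLATE_PHRASES = ("great site", "thanks", "i was looking for such info")


def looks_templated(text, phrases=TEMPLATE_PHRASES, max_words=12):
    """Short comment containing a stock phrase (a crude heuristic)."""
    lowered = text.lower()
    return len(lowered.split()) <= max_words and any(p in lowered for p in phrases)


def flag_bursts(comments, window_seconds=600, threshold=3):
    """Return the set of IPs that posted `threshold` or more templated
    comments within any `window_seconds` span."""
    times_by_ip = defaultdict(list)
    for ip, timestamp, text in comments:
        if looks_templated(text):
            times_by_ip[ip].append(timestamp)

    flagged = set()
    for ip, times in times_by_ip.items():
        times.sort()
        start = 0
        # Sliding window over the sorted timestamps.
        for end in range(len(times)):
            while times[end] - times[start] > window_seconds:
                start += 1
            if end - start + 1 >= threshold:
                flagged.add(ip)
                break
    return flagged
```

A single "Great site, thanks" is left alone. Only the repetition from one address in a short time trips the flag, which matches the observation that such comments rarely come alone.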
It's the balance between a poster's intent and their actions...
A spammer's intent is obviously to get as many links to their Website as possible.
Now, unless you have extremely advanced mind reading technology attached to your Blog, you won't be able to pick up on this directly (and wearing a tinfoil hat may thwart this anyway).
It then comes down to trying to judge the poster's intent from their actions, and their URL.
If someone posts an off topic message, and it looks like it was written before they even read your Blog - with a link to a commercial service - then the intent is obviously to spam.
If the post is relevant, but has obviously not had more than 5 seconds' thought put into it, and links to a commercial site, then this might lead you to believe that the poster is hitting as many Blogs as possible and leaving spam messages that link back to their own site.
Where it gets interesting is when someone leaves a relevant and interesting comment on your Blog, which has a link to a commercial site. This might be spam or it might not.
Think about this though - if it is spam, someone is being paid to add interesting and relevant content to your site!
So if the comment is good enough, we should leave the spam-link? By the way, the comment you left on my blog? You messed up the link, and I had to go back and correct it, so I pulled your URL. Had it linked to your blog, two posts or not, I would have left it.
As to the question: beyond just the "six or eight words or less", I'm not seeing anything useful. A fair percentage of mine are coming in from links from other blogs, I assume previous victims. So the only difference I've seen is that the URL in a human-entered spam comment leads to a site that doesn't have a syndication autodiscovery tag :)
If they don't come back, it is not possible to have a two-way conversation, is it? Robert Castelo: Um, the fact that you are getting paid is supposed to make me feel better? I don't think so. And I have to agree here with what Doc said about conten...
I've often found that the spammers use a keyword for the "Name" field when spamming my blog. One such spammer liked to use the name Valerian. While this might truly be someone's name, his site sold Valerian-based products. That was a dead giveaway.
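
One way to approximate that check is to see whether a word from the "Name" field also shows up in the URL the commenter left. This is only a sketch: the function name and the four-letter minimum are assumptions, and it looks at the URL rather than at what the site sells, so someone whose real name is in their own domain would trip it too.

```python
import re
from urllib.parse import urlparse


def name_matches_site(name, url, min_len=4):
    """Return True if any word of the commenter's name appears in the
    host or path of the URL they supplied."""
    # Ignore very short words to cut down on accidental matches.
    words = [w for w in re.findall(r"[a-z0-9]+", name.lower()) if len(w) >= min_len]
    parsed = urlparse(url or "")
    haystack = ((parsed.hostname or "") + parsed.path).lower()
    return any(word in haystack for word in words)
```

For example, a commenter named "Valerian" linking to a page with "valerian" in its address would be flagged for a closer look, not deleted automatically.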