Wiki Spam Update
In the past 72 hours, over two hundred updates to the Atom wiki have been turned away as spam.
There are a number of different types of spammers. Of little concern are the curious (is it true that anybody can update a page? yes). Nor are the defacers (let's update the pages to call everybody "gay". hehehe) much of a problem.
The overwhelming majority of spammers are the cropdusters:
sprinkling wide areas with links to gambling, porn, and
pharmaceutical sites. Due to the addition of
nofollow attributes on the links, these provide no
benefit to the perpetrator, but there is increasing evidence that
most of these spammers are not literate, at least not in the
English language.
One such spammer periodically comes in from a private page on
this site and one by one
edits a number of pages, apparently unable to read the English
message text that accompanies the 403 Forbidden status
code returned in response to each POST. To reduce
effort for both sides, I'm now blocking GETs from that site.
I now employ a number of blocking techniques: requiring login on a number of pages; blocking based on IP address, user agent, or referer; blacklists on words in the content of the update; and a throttle on the rate of updates.
But the most effective is a relatively recent addition that relies on the greed of the cropduster: any update which adds more than ten external links to a page is rejected. Only a handful of existing pages contain that many external links in total, so any attempt to add that many links all at once is very suspect.
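The link-count rule described above can be sketched roughly as follows. This is a minimal illustration, not the code running on this wiki: the regular expression, the function names, and the `wiki.example.org` host are all assumptions made for the example, and a real wiki would rely on its own link parser.

```python
import re

# Threshold from the post: an update adding more than ten external links is rejected.
MAX_NEW_EXTERNAL_LINKS = 10

# Rough pattern for absolute http(s) URLs (an assumption for this sketch).
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+", re.IGNORECASE)


def external_links(text, own_host="wiki.example.org"):
    """Return the set of URLs in text that do not point at own_host (a placeholder)."""
    links = set()
    for url in URL_PATTERN.findall(text):
        # "http://host/path".split("/") -> ["http:", "", "host", "path"]
        host = url.split("/")[2].lower()
        if host != own_host:
            links.add(url)
    return links


def adds_too_many_links(old_text, new_text, limit=MAX_NEW_EXTERNAL_LINKS):
    """True when the update introduces more than `limit` external links not already on the page."""
    added = external_links(new_text) - external_links(old_text)
    return len(added) > limit
```

Comparing against the previous revision, rather than counting links on the page as a whole, means the few existing pages that already carry many external links remain editable.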
Re: Wiki Spam Update
The real question is whether nofollow should be applied at all. Using nofollow seems like a scorched-earth approach: it says "Even if you manage to sneak some spam by me you don't get any PageRank(tm). Nyah, Nyah!". Unfortunately, the fact would remain that the wiki would still have been spammed.
Message from Dare Obasanjo at
Ross, probably even with more reason: wikis mostly contain internal references, especially wikis such as Sam's that concentrate on a single subject.
A wiki is introverted, a blog is extroverted. Using nofollow on links in a wiki probably does not have much impact on the world...
Posted by Janne Jalkanen at
Dare, while I remain skeptical overall about the spam-reducing value of nofollow, I turned it on for two reasons: primarily to give it a chance to prove itself, but also because spam breeds more spam. A fair number of spammers search for search terms (mostly gambling, porn, or pharmaceutical terms), and correctly conclude that the sites they find are worth targeting.
Perhaps, someday, I'll try removing the nofollow attributes (it is just one line of code, and takes effect immediately). But just not yet.
Dare, is that because spammers don't care, or because |rel="nofollow"| isn't implemented widely enough for spammers to care? The experiment with |rel="nofollow"| has only just started; it might be too soon to draw conclusions.
Posted by Anne at
Sam, the "spam breeds more spam" argument does not make much sense here, does it? The text of the links will still be indexed, as will the text surrounding them. I do not see how |rel="nofollow"| has any bearing on that.
Posted by Anne at
Spam breeds more spam because spammers use the "link:domainname.com" syntax to search Google for pages that have already been hit. Example: [link]
I don't know if rel="nofollow" prevents this, but the situation has certainly gotten out of control.
Sam, could you add "spammers" to your spell-checker?
Posted by Mark at
Sam may be correct that spammers may not be English-literate (or fluent, or techspeak-fluent), but I think a big part of it is that they just don't care enough to read....
Trackback from The 80/20 Solution at
Simon Willison : Wiki Spam Update - Sam Ruby suggests blocking changes that add 10 or more new links....
Excerpt from HotLinks - Level 1 at
Wiki Spam
[link] Shit, the wiki is of course plagued by spam, and I just posted that link to it. :-( sigh In the past 72 hours, over two hundred updates to the Atom wiki have been turned away...
Excerpt from Red.Cube at
nofollow will never stop spam because spammers don't care. They don't know which sites implement it and which don't, and they don't bother checking to find out. Using nofollow in a wiki is really worthless and actually does more harm than good, so you'd be better off implementing some better server-side filtering (perhaps a Bayesian filter combined with a few other techniques) so the spam doesn't even get published.
Posted by Lachlan Hunt at
We run a wiki and haven't really had too much of a problem with wiki spammers, apart from one persistent guy from China who occasionally comes in and adds dozens of links to about 10 pages every few weeks. Blocking by IP address wasn't enough, since they had access to quite a large block. We now block edits that contain Chinese characters in the anchor text, or that add more than 20 links to the previous revision. Since almost all the spam appears to be done manually rather than by a robot, we added a sleep(20) to the code before printing out the error message denying the edit. That should be frustrating enough for a persistent spammer, while hopefully not too annoying to a legitimate submitter who hits it once.
Posted by John McPherson at
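For illustration, the anchor-text check and the delayed rejection John describes might look something like the sketch below. It assumes links are submitted as HTML `<a>` elements and treats "Chinese characters" as the CJK Unified Ideographs block; the class and function names are hypothetical and are not taken from his wiki's code.

```python
import re
import time
from html.parser import HTMLParser

# CJK Unified Ideographs block; an assumption about what counts as "Chinese characters".
CJK_PATTERN = re.compile(r"[\u4e00-\u9fff]")


class AnchorTextCollector(HTMLParser):
    """Collects the text found inside <a>...</a> elements."""

    def __init__(self):
        super().__init__()
        self._depth = 0
        self.anchor_texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._depth += 1
            self.anchor_texts.append("")

    def handle_endtag(self, tag):
        if tag == "a" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        # Only text that appears while an <a> element is open counts as anchor text.
        if self._depth > 0:
            self.anchor_texts[-1] += data


def anchor_text_has_cjk(html):
    """True when any link's anchor text contains a CJK ideograph."""
    collector = AnchorTextCollector()
    collector.feed(html)
    collector.close()
    return any(CJK_PATTERN.search(text) for text in collector.anchor_texts)


def reject_slowly(message, delay_seconds=20, sleep=time.sleep):
    """Wait before returning the 403, mirroring the sleep(20) described above."""
    sleep(delay_seconds)
    return 403, message
```

The `sleep` parameter is only there so the delay can be replaced in tests. Note that a blocking sleep ties up a server worker for the full delay, which may matter on a busy site.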
Should you really employ nofollow in a wiki where there is no distinction between author and commenter?
Posted by Ross Mayfield at