In the past 72 hours, over two hundred updates to the Atom wiki have been turned away as spam.
There are a number of different types of spammers. Of little concern are the curious (is it true that anybody can update a page? yes). Nor are the defacers (let's update the pages to call everybody "gay". hehehe) much of a problem.
The overwhelming majority of spammers are the cropdusters:
sprinkling wide areas with links to gambling, porn, and
pharmaceutical sites. Due to the addition of
nofollow attributes on the links, these provide no
benefit to the perpetrator, but there is increasing evidence that
most of these spammers are not literate, at least not in the
One such spammer periodically comes in from a private page on
this site and one by one
edits a number of pages, apparently unable to read the English
message text that accompanies the
403 Forbidden status
code returned in response to each POST. To reduce
effort for both sides, I'm now blocking GETs from that site.
I now employ a number of blocking techniques: requiring login on a number of pages; blocking based on IP address, user agent, or referer; blacklisting words in the content of the update; and throttling the rate of updates.
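To make the layering concrete, here is a minimal Python sketch of how checks like these might be chained. It is not the wiki's actual code: every name, pattern, address, and limit below (`reject_reason`, `BLOCKED_IPS`, five updates per minute, and so on) is a hypothetical placeholder chosen for illustration.

```python
import re
import time
from collections import defaultdict, deque

# All values below are hypothetical placeholders, not the wiki's real settings.
LOGIN_REQUIRED_PAGES = {"FrontPage"}
BLOCKED_IPS = {"192.0.2.1"}
BLOCKED_AGENT_RE = re.compile(r"badbot", re.IGNORECASE)
BLOCKED_REFERER_RE = re.compile(r"spam\.example", re.IGNORECASE)
BLACKLIST_RE = re.compile(r"\b(casino|viagra)\b", re.IGNORECASE)
MAX_UPDATES = 5        # updates allowed per client...
WINDOW_SECONDS = 60.0  # ...within this sliding window

_recent = defaultdict(deque)  # client IP -> timestamps of accepted updates


def throttled(ip, now):
    """Sliding-window rate limit: True if this client has used up its quota."""
    stamps = _recent[ip]
    # Forget updates that have aged out of the window.
    while stamps and now - stamps[0] >= WINDOW_SECONDS:
        stamps.popleft()
    if len(stamps) >= MAX_UPDATES:
        return True
    stamps.append(now)
    return False


def reject_reason(page, ip, agent, referer, content, logged_in, now=None):
    """Return a short reason if the update should get a 403, else None."""
    now = time.time() if now is None else now
    if page in LOGIN_REQUIRED_PAGES and not logged_in:
        return "login required"
    if ip in BLOCKED_IPS:
        return "blocked address"
    if BLOCKED_AGENT_RE.search(agent or ""):
        return "blocked user agent"
    if BLOCKED_REFERER_RE.search(referer or ""):
        return "blocked referer"
    if BLACKLIST_RE.search(content):
        return "blacklisted word"
    # Checked last, so rejected updates do not count against the quota.
    if throttled(ip, now):
        return "too many updates"
    return None
```

The cheap checks run first and the throttle runs last, so only updates that pass everything else are counted toward a client's quota. That ordering is a design choice of this sketch, not something the techniques themselves require.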
But the most effective is a relatively recent addition that relies on the greed of the cropduster: any update that adds more than ten external links to a page is rejected. Only a handful of existing pages contain such a number of external links in total, so any attempt to add such a number of links all at once is very suspect.
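A rough Python sketch of what such a link-count check could look like follows. It is an illustration under stated assumptions rather than the code running on the wiki: the URL pattern, the function names, and the choice to treat every `http://` or `https://` URL as an external link are all hypothetical. Only the threshold of ten comes from the text above.

```python
import re
from collections import Counter

# Assumption: any http(s) URL counts as an external link.
LINK_RE = re.compile(r"https?://[^\s\"'<>\]]+", re.IGNORECASE)
MAX_NEW_LINKS = 10  # the threshold described in the text


def external_links(text):
    """Return every URL found in the page text, duplicates included."""
    return LINK_RE.findall(text)


def added_link_count(old_text, new_text):
    """Count links present in new_text that were not already in old_text."""
    # Counter subtraction keeps only positive counts, so links that were
    # removed or left unchanged do not offset links that were added.
    added = Counter(external_links(new_text)) - Counter(external_links(old_text))
    return sum(added.values())


def is_cropdusting(old_text, new_text, limit=MAX_NEW_LINKS):
    """True if the update adds more than `limit` external links."""
    return added_link_count(old_text, new_text) > limit
```

Comparing against the previous revision, rather than counting links in the new text alone, means the few existing pages that already carry many links can still be edited normally.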