As my “caught spam” counter rapidly approaches the 100,000 milestone I have noticed that there seems to be a discrepancy. Currently I get between 200 and 300 spam comments a day on this blog. This high figure really kicked in over Christmas last year, but I have been using the Akismet anti-spam plugin since the previous May, when the problem was significant but a lot smaller.

The way Akismet works is that, as each comment arrives, details of it are checked against a central database at Automattic and an opinion is given as to whether it is spam or not. If Akismet thinks a comment is spam, it is put in a separate bucket and doesn’t get displayed; if not, it goes for moderation as normal (I am a bit more paranoid than some and every time I consider taking off the moderation, I get a flurry of misses). The spam comments are kept for 15 days unless deleted manually and I have an opportunity to override the decision, though I haven’t seen a single false positive yet.

So every comment that arrives is given a unique number in the blog database; there doesn’t seem to be any way of bypassing that, as they are all kept for a period and need to be referenced. As I write, the latest spam message has been given number 62,407. This is 150 a day since I started, which seems reasonable. So how do Akismet reckon that they have stopped 99,315 on my behalf, almost 100 a day more?
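For anyone who wants to check the sums, here is a rough sketch in Python. The only inputs are the three figures quoted above; the number of days is not something I have measured separately, so it is an assumption worked backwards from the 150-a-day estimate, and the variable names are just illustrative.

```python
# Figures quoted in the post.
latest_comment_id = 62_407   # highest comment number in the blog database
akismet_caught = 99_315      # Akismet's "caught spam" counter
quoted_daily_rate = 150      # the "150 a day" estimate from the post

# Assumption: elapsed days are back-derived from the quoted rate,
# not taken from an actual calendar count.
days_elapsed = latest_comment_id / quoted_daily_rate

# What Akismet's counter implies per day over the same period.
akismet_daily_rate = akismet_caught / days_elapsed
excess_per_day = akismet_daily_rate - quoted_daily_rate

print(f"implied days elapsed: {days_elapsed:.0f}")
print(f"Akismet daily rate:   {akismet_daily_rate:.0f}")
print(f"excess per day:       {excess_per_day:.0f}")
```

On these figures the counter implies roughly 239 a day against 150 from the database numbering, a gap of a little under 90 a day.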

The FAQ says:

Some versions of the Akismet for WordPress plugin will hide duplicate comments, making it appear to be a different number caught than displayed.

but I can see no evidence of that and I am using the standard issued plugin.

It is not particularly important as there is no doubt that it does a good job, but it would suggest that their web site claim of nearly 2 billion spam blocked is also inflated.

