I am getting some spam comments on this blog that I cannot see the reason for. To be accurate, I am getting a flood of spam at the moment; 99.9% of it is being caught by Akismet, and most of that is standard link promotion and advertising, but a few are beyond logic.
They contain some innocuous comment such as “Hi, I have a similar topic on my blog” and then a URL, but that URL is www.google.com, and there is no other link back to anything that might be profitable to the spammer. I am perplexed.
I believe that these messages are sent solely to “poison” Bayesian filters. They cause the conditional probability tables for those common words to be adjusted so that the words are as likely to indicate spam as not, thereby causing the Bayesian filter to stop working.
Just my $0.02. 🙂
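To make the poisoning idea concrete, here is a minimal sketch of how a per-word spam probability might drift toward 0.5. It assumes a simple filter that counts how many spam and non-spam messages contain each word, with equal priors and add-one smoothing; the function name, the word counts, and the message totals are all invented for illustration and do not describe how Akismet or any particular filter works.

```python
from collections import Counter

def word_spam_probability(word, spam_counts, ham_counts, n_spam, n_ham):
    """Estimate P(spam | word), assuming equal priors and add-one smoothing."""
    p_word_given_spam = (spam_counts[word] + 1) / (n_spam + 2)
    p_word_given_ham = (ham_counts[word] + 1) / (n_ham + 2)
    return p_word_given_spam / (p_word_given_spam + p_word_given_ham)

# Hypothetical training data: how many messages of each kind contain the word.
spam_counts = Counter({"blog": 5})
ham_counts = Counter({"blog": 60})
n_spam, n_ham = 100, 100

before = word_spam_probability("blog", spam_counts, ham_counts, n_spam, n_ham)

# Poisoning: 100 innocuous comments containing "blog" get marked as spam.
spam_counts["blog"] += 100
n_spam += 100

after = word_spam_probability("blog", spam_counts, ham_counts, n_spam, n_ham)

print(f"P(spam | 'blog') before poisoning: {before:.2f}")
print(f"P(spam | 'blog') after poisoning:  {after:.2f}")
```

With these made-up counts, "blog" starts out as a strong sign of a legitimate message and ends up close to neutral, which is the effect described above.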
Hi DizzyD, yes, I can see your point, and it is certainly true for the current surge of image spam, where the real (spammy) content is a GIF and the cover content is random or scraped text. It may undermine confidence in the filter systems, and occasionally one or two will leak through, but I can’t see that it will have much effect on the system overall. Most people’s communication is about a small set of topics and with a limited number of people (the filters work on the headers as well), so only the occasional spam that happened to hit the same subset of words would be misclassified. This would be for email spam, of course; the case in the post here is comment spam, and the example had so little content that I can’t see it would do anything.
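The argument about the overall effect can be sketched with a toy naive Bayes score for whole messages. This is only an illustration under stated assumptions: equal priors, add-one smoothing, and entirely invented word counts, names ("alice", "meeting"), and test messages. Real filters differ in many details.

```python
import math
from collections import Counter

def spam_score(words, spam_counts, ham_counts, n_spam, n_ham):
    """Return P(spam | words) for a toy naive Bayes model (equal priors assumed)."""
    log_odds = 0.0
    for w in set(words):
        p_s = (spam_counts[w] + 1) / (n_spam + 2)  # add-one smoothing
        p_h = (ham_counts[w] + 1) / (n_ham + 2)
        log_odds += math.log(p_s / p_h)
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical counts of messages containing each word.
spam_counts = Counter({"similar": 2, "topic": 3, "blog": 5,
                       "pills": 80, "cheap": 70, "meeting": 1})
ham_counts = Counter({"similar": 40, "topic": 50, "blog": 60,
                      "pills": 1, "cheap": 3, "alice": 30, "meeting": 40})
n_spam, n_ham = 100, 100

# Poisoning: 100 innocuous comments, all marked as spam.
for w in ("similar", "topic", "blog"):
    spam_counts[w] += 100
n_spam += 100

args = (spam_counts, ham_counts, n_spam, n_ham)
real_spam = spam_score(["cheap", "pills", "blog"], *args)
personal_mail = spam_score(["blog", "topic", "alice", "meeting"], *args)
bare_phrase = spam_score(["similar", "topic", "blog"], *args)

print(f"real spam after poisoning:     {real_spam:.3f}")
print(f"personal mail after poisoning: {personal_mail:.3f}")
print(f"bare poisoned phrase:          {bare_phrase:.3f}")
```

With these invented numbers, the message carrying spammy words still scores as spam and the message carrying words specific to one person's correspondence still scores as legitimate. Only a message made up entirely of the poisoned words lands near the middle.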