Antibot module for Comment Spam, Alternative to Mollom End Of Life

Acquia has announced the end of life for Mollom, the comment spam filtering service.

Mollom was created by Dries Buytaert and Benjamin Schrauwen, and launched to a few beta testers (including myself) in 2007. Mollom was acquired by Acquia in 2012.

The service generally worked well, with the occasional spam comment getting through. The stated reason for stopping the service is that spammers have gotten more sophisticated, which perhaps means that Mollom would need to try harder to keep up with their ever-changing tactics. Much like computer viruses and malware, spam (email or comments) is an arms race.

The alternative recommended by Acquia is a combination of reCAPTCHA and Honeypot.

But there is a problem with this combination: reCAPTCHA, like all modules that depend on the CAPTCHA module, disables the page cache for any form that has CAPTCHA enabled.

This is due to this piece of code in captcha.module:

// Prevent caching of the page with CAPTCHA elements.
// This needs to be done even if the CAPTCHA will be ommitted later:
// other untrusted users should not get a cached page when
// the current untrusted user can skip the current CAPTCHA.
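
To see why a per-visitor challenge and a page cache do not mix, here is a minimal sketch in Python. It is an illustration only, not Drupal's or the CAPTCHA module's code; PageCache and render_comment_form are hypothetical names, and the random token stands in for whatever challenge the CAPTCHA would generate.

```python
import secrets


def render_comment_form():
    """Build the form HTML with a fresh challenge token for this request."""
    challenge = secrets.token_hex(8)
    return f"<form><input type='hidden' name='challenge' value='{challenge}'></form>"


class PageCache:
    """Toy anonymous page cache keyed by path (hypothetical, not Drupal's)."""

    def __init__(self):
        self._pages = {}

    def get(self, path, render, cacheable=True):
        # Serve the stored copy when the page is allowed to be cached.
        if cacheable and path in self._pages:
            return self._pages[path]
        html = render()
        if cacheable:
            self._pages[path] = html
        return html


cache = PageCache()

# With caching on, the second visitor receives the first visitor's challenge.
first = cache.get("/node/1", render_comment_form)
second = cache.get("/node/1", render_comment_form)
same_when_cached = first == second

# With caching off, every visitor gets a freshly generated challenge.
third = cache.get("/node/2", render_comment_form, cacheable=False)
fourth = cache.get("/node/2", render_comment_form, cacheable=False)
same_when_uncached = third == fourth
```

In this toy model the cached page hands every visitor the same challenge, which is the situation the comment in captcha.module is guarding against by turning the cache off.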

Another alternative that we have been using, and one that does not disable the page cache, is the antibot module.

To install the antibot module, you can use your git repository, or the following drush commands:

drush dis mollom
drush dl antibot
drush en antibot

Visit the configuration page for antibot if you want to add more forms that use the module, or disable it for other forms. The default settings work for comments, user registrations, and user logins.

Because of the above-mentioned arms race, expect spammers to come up with circumvention techniques at some point in the future. When that happens, other measures will be needed, be they in antibot or in other alternatives.




spam filter

Mollom also provides a spam filter, which antibot doesn't. We have spammers who are fine with registering by hand and then spamming. Mollom doesn't get all of them, but it does get the most vile ones. What remains is a few with links to watch sales sites or similar - maybe 2-3 a week.

No worse than before ...

For the sites I manage, we turned off registration several years ago because of bogus users who posted spam. These users could be Amazon Mechanical Turk workers, being paid cents for spam links. Mollom did catch some of that, but also let some through.

After switching to antibot, the volume of spam comments that get through remains the same, so I am no worse off than with Mollom. A couple of comments still get through every week, just like before, and have to be dealt with. The way I deal with this is to require approval for every comment, with an email notifying me that a comment is awaiting approval.

This may not work for every site, and extra measures are needed for the scenarios that you mention.

I realize that this is temporary, and spammers will up the ante at some point, and the arms race continues. Just a fact of life ...

SpamBot is a good option to

SpamBot is a good option to catch those spammers who register for an account by hand:

This cut down on our spam by a lot when we started using it. And you're contributing to a global database of spam accounts.

No user registration ...

On my sites, I still have anonymous comments enabled, but I disabled registrations a long time ago, because of spam registrations.

The Spam Bot module will not help with anon comments.

Moreover, many spam comments come from random IP addresses, perhaps using botnets of compromised PCs. If having a database were the be-all and end-all solution, Mollom would not have gone under. Alas, it is not to be ...

Security Through Obscurity

This feels less like a solution and more like "security through obscurity."

The module description says that the module simply requires the browser to have JavaScript turned on, and for the user to move the mouse or press Tab.

The SendKeys command can send a tab key press:

Application.SendKeys ("{TAB}")

So, all the spammer needs to do is obtain the window handle using a VB or VBA project (you can build these in Excel or even Word, or just download Visual Studio Community Edition), and then use SendKeys plus program logic to transmit commands to the browser.

This, in effect, converts the browser into a slave of an MS Office macro-enabled document or VB project, allowing the spamming to proceed without interruption.

For more sophisticated attempts, the user can build their own browser project in MS Access (it's an embeddable control when building forms). From there, the user can set up an event that waits until the "OnDocumentComplete" event, and then start the spam function a step later. I'm sure this approach is also possible with Visual Basic. In the "arms-race scenario" put forth in this post, this approach will eventually appear in a spammer's toolkit, if it hasn't already.
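
The argument above can be sketched in a few lines of Python. This is an assumption about the general technique (a key that only the page's script adds to the form after a mouse or Tab event), not antibot's actual code; SITE_SECRET, form_key, accept_submission and the antibot_key field name are all hypothetical.

```python
import hashlib
import hmac

SITE_SECRET = b"example-secret"  # hypothetical per-site secret


def form_key(form_id):
    """Key that the page's script would add to the form after an interaction event."""
    return hmac.new(SITE_SECRET, form_id.encode(), hashlib.sha256).hexdigest()


def accept_submission(form_id, posted):
    """Server-side check: accept only when the script-supplied key is present and valid."""
    supplied = posted.get("antibot_key", "")
    return hmac.compare_digest(supplied, form_key(form_id))


# A naive bot posts the raw form fields and never runs the page's script.
naive_bot = {"comment": "cheap watches"}

# Any browser that runs the script and receives a Tab or mouse event, whether
# driven by a person or by a macro, ends up submitting the key as well.
scripted_browser = {
    "comment": "cheap watches",
    "antibot_key": form_key("comment_form"),
}

print(accept_submission("comment_form", naive_bot))         # False
print(accept_submission("comment_form", scripted_browser))  # True
```

Under these assumptions the check filters out clients that never execute the script, but it cannot tell a person from a macro that drives a real browser, which is the weakness described above.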

I'd suggest that antibot is still a good idea, but it should work differently. Instead, it should just override the cache at any form page you suggest--requiring a fresh download each time that particular form page is viewed. It makes the user's experience easier.

Arguably, on a personal blog, antibot might be enough unless the blogger rises in popularity in their field or topic, and ends up leading the search engine rankings. But if you're running a storefront or site that takes form submissions from customers or prospective customers, or if you're concerned about security, then antibot may not be the right choice.

Inevitable in the 'Arms Race' context

I have already mentioned that this works for now, and I am not under any illusion that it will work forever.

This is because spam (be it email, or comment) is an arms race scenario, and what you use now will not work at some point in the future, and other counter measures will be needed.

For now, it does the job, and does not disable page cache (unlike more popular alternatives). You know the saying of not needing to outrun the bear, but outrun others who are fleeing the bear? We don't need the perfect solution, we just need a good enough one.

Cache the Form and not the CAPTCHA?

I guess I see your point. Although, in some cases, I'm worried we're outrunning the bear while a nuclear bomb is going off nearby--rendering the bear moot.

My guess is that the cache gets disabled in reCAPTCHA so that it can serve different CAPTCHA images each time a visitor requests the page.

Would it be possible to hold a form submission for a separate step including the reCAPTCHA on a separate page?

If it was possible to run a two-page form (I've seen surveys that only ask a single question per page), the CAPTCHA image would be the only thing that doesn't need to be cached.

If it gets placed alone on its own page, it would not interfere with caching the form.

Then you could cache your main form.
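
A rough sketch of that two-step idea in Python, under the assumption that the first page only collects the submission and a second, uncached page handles the challenge. The names pending, submit_form and complete_challenge are hypothetical, and challenge_passed stands in for whatever verification the CAPTCHA service performs; none of this is the CAPTCHA module's API.

```python
import secrets

pending = {}    # token -> held submission, waiting for the challenge step
published = []  # submissions that passed the challenge


def submit_form(fields):
    """Step 1: the cacheable form posts here; hold the submission and return a token."""
    token = secrets.token_urlsafe(16)
    pending[token] = fields
    return token


def complete_challenge(token, challenge_passed):
    """Step 2: the uncached challenge page posts here with the verification result."""
    fields = pending.pop(token, None)  # a token can be used only once
    if fields is None or not challenge_passed:
        return False
    published.append(fields)
    return True


token = submit_form({"comment": "Nice post"})
ok = complete_challenge(token, challenge_passed=True)
replay = complete_challenge(token, challenge_passed=True)
```

In this sketch only the second step depends on per-visitor state, so the page that carries the main form could stay in the cache.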

Report this to CAPTCHA's issue queue

The problem is that a site with CAPTCHA on the login/register pages only may get by fine. But a site with CAPTCHA on every comment page means a significant percentage of the page views will have cache disabled.

The reason caching is disabled for a page that displays a CAPTCHA is to prevent the challenge itself from being cached. A cached challenge would be the same for everyone, which defeats the very idea of a unique challenge for every visitor.

For example, check #602226.

It is even more critical when using external caching (e.g. Varnish), like most high traffic Drupal sites do. See #632742.

Finally, I am no longer using CAPTCHA, reCAPTCHA or any of their variants, so I can't comment on them further. They have made design decisions that prevent pages with a CAPTCHA from being cached. If you have suggestions on how to make that work, report them in their issue queue.