An experiment in fighting spam on public forms using “proof of work”

Spam is everywhere. If you have an email account, a mailbox, a website with comments, a cellphone, a social media account, or a public form, you get spam. We all know it: it is a plague.

Over the years, there have been many attempts to fight spam, with varying degrees of success: some with more side effects than others, some simple, some complex, some proprietary…

Online, one of the most successful approaches has been captchas. Just like spam, these annoying little challenges are everywhere. I’m not a fan of captchas, for many reasons, but mainly because the experience for humans is often painful: they waste our time and they make us work for free.

So, for public forms on my websites, I usually don’t use any sort of measure to fight spam. The experience for humans is straightforward, and later I deal with the spam myself.

It isn’t too much work, but these are just a few low-traffic websites. Obviously, this doesn’t “scale”.

So, in March, I decided to do a little experiment, to try to find a way of reducing the amount of spam I receive from these forms. I would rather not attach an external service, use a captcha or anything that could change the experience of the user.

Proof-of-work enters the room

It is public knowledge that the “proof of work” mechanism used in Bitcoin started as one of these attempts to fight email spam. So this is not a novel idea; it has been around for decades.

The question is, would some version of it work for my public forms? The answer is yes, and in fact, it didn’t take much to get some meaningful results.

So, without spending too much effort, I added a homegrown script to one of my forms that would do the following before the form is submitted to the server:

fields = (grab all form fields)
content = (concatenate contents of all fields)
difficulty = (get the desired number of leading zeros)
loop
  nonce = (get new nonce value)
  hash_input = (concatenate nonce with content)
  hash = (SHA256 digest of hash_input)
  if hash meets difficulty
    Add nonce to form
    break
  end if
end loop
submit form

On the server side, I just calculate the hash of the contents plus the nonce, and check if it matches the desired difficulty. If it doesn’t, I discard the message.
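As an illustration, both halves of this scheme could be sketched as below. This is a minimal sketch in Python, not the script running on the site: the helper names, the sorted field order, the decimal counter used as nonce, and measuring difficulty in leading zero hex digits are all assumptions made for the example.

```python
import hashlib
from itertools import count

DIFFICULTY = 3  # assumption: required number of leading zero hex digits


def join_fields(fields: dict) -> str:
    """Concatenate field values in a fixed (sorted by name) order.

    Client and server must agree on this order; sorting is an assumption.
    """
    return "".join(fields[name] for name in sorted(fields))


def digest(nonce: str, content: str) -> str:
    """SHA-256 hex digest of the nonce concatenated with the content."""
    return hashlib.sha256((nonce + content).encode("utf-8")).hexdigest()


def meets_difficulty(hex_digest: str, difficulty: int = DIFFICULTY) -> bool:
    """True when the digest starts with the required number of zeros."""
    return hex_digest.startswith("0" * difficulty)


def find_nonce(fields: dict, difficulty: int = DIFFICULTY) -> str:
    """Client side: try counter values until the digest qualifies."""
    content = join_fields(fields)
    for candidate in count():
        nonce = str(candidate)
        if meets_difficulty(digest(nonce, content), difficulty):
            return nonce


def verify(fields: dict, nonce: str, difficulty: int = DIFFICULTY) -> bool:
    """Server side: a single hash decides whether to keep the message."""
    return meets_difficulty(digest(nonce, join_fields(fields)), difficulty)
```

The asymmetry is the point: `find_nonce` has to hash many candidates, while `verify` hashes exactly once. A submission would be checked with something like `verify(fields, nonce)` and discarded when it returns `False`.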

The mechanism described above, obviously, has serious flaws and could be bypassed in multiple ways (please don’t rely on it). It is not even difficult to figure out. But for the sake of the experiment, this was the starting point.

I just tuned the difficulty parameter a bit, so that a human couldn’t tell the delay apart from a slow website, while it still had an impact on bots (nothing scientific).
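To get a feeling for what the difficulty parameter means in practice, a rough estimate helps. Assuming difficulty is counted in leading zero hex digits of the digest (an assumption, since the exact encoding is not specified here), each attempt succeeds with probability 1/16^n, so on average 16^n hashes are needed. The sketch below, in Python, prints that number next to a very rough time estimate; the throughput is measured on the local machine, and a browser will behave differently.

```python
import hashlib
import time


def expected_attempts(hex_zeros: int) -> int:
    """Average number of hashes needed for `hex_zeros` leading zero hex digits."""
    return 16 ** hex_zeros


def hashes_per_second(sample: int = 50_000) -> float:
    """Rough local SHA-256 throughput; not representative of a browser."""
    start = time.perf_counter()
    for i in range(sample):
        hashlib.sha256(str(i).encode("utf-8")).digest()
    return sample / (time.perf_counter() - start)


rate = hashes_per_second()
for zeros in range(1, 7):
    attempts = expected_attempts(zeros)
    print(f"{zeros} zeros: ~{attempts} hashes, ~{attempts / rate:.3f} s here")
```

Each extra zero multiplies the average work by 16, so the usable range between "unnoticeable" and "painfully slow" is only a few steps wide.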

The results

Six months went by, and the results are better than I initially expected, especially because I thought I would have to gradually improve the implementation as the spammers figured out how to bypass this little trick.

I also assumed that at least some of them would use headless browsers, to deal with all the JavaScript that websites include nowadays, and automatically bypass this protection.

In the end, I was dead wrong. This little form went from ~15 spam submissions every single day to practically 0: during the whole 6-month period, a total of 1 spam message went through.

But you might ask, has the traffic stopped? Did the bots stop submitting?

Chart of the number of form submissions during the last 30 days.

As the chart above shows, no, the spammers continued to submit their home-cooked spam. During the last 30 days, a total of 210 POST requests were sent, but the spam was just filtered out.

So, what did I learn from this experiment? I guess the main lesson was that these particular spammers are really low-effort creatures. You raise the bar a little, and they stop being effective.

The second lesson was that we can definitely fight spam without relying on third parties and without wasting our users’ and visitors’ time. A better implementation is undoubtedly required, but for low-traffic websites, it might be a good enough solution. We just need an off-the-shelf component that is easy to integrate (I guess some already exist, I just didn’t spend too much time exploring).

I’m also curious about other alternatives, such as requiring a micropayment to submit the form (like a fraction of a cent). Until now, this would require a third-party integration and be a pain in every way, but with Bitcoin’s Lightning Network becoming ubiquitous, this might become a viable alternative (there are similar projects out there that work great).
