SakeTami
Touhou-Project.com



Spam, Spam, Spam, egg and Spam

Hey guys! Hope you’re well. Full disclaimer: this isn’t what I intended to share for the next one of these posts, but because of a combination of a lack of time and a pervasive feeling of being suffocated by life’s crap, I haven’t been able to advance the plans for the site as much as I would have liked. So, instead of showing off something universally good and shiny, I’ll be talking about something that plagues websites and email servers everywhere: spam.

In case you’re not familiar with the ubiquitous menace that is spam, in a computing context, it is broadly defined as undesired messages or information. Think junk emails about enlarging your reproductive organs or posts promising to let you earn hundreds of dollars from home by just using one weird trick that doctors hate.  

THP may not be the nexus of activity and interest that it deserves to be but it’s also no stranger to unsolicited messages. If you’ve been on the site over the years you may have noticed threads getting bumped out of the blue. Only sometimes is it a clueless anon wishing that his favorite story would come back to life. More often than not, it’s spam: links to galleries full of nudes of questionable legality, sites where you can get 100% legitimate pills at a fraction of the cost and so forth. Those are usually (relatively) swiftly deleted by moderators (mostly me) and so you’re only left with the ‘bumped’ thread sticking out like a sore thumb.

The board software has a few crude countermeasures for this most basic sort of spam. I’ve mentioned before that I had to fix it, but there’s a timer that forces a cooldown between posts; an enterprising spammer won’t be able to simply flood the boards with a script and drown out all those delicious updates. At least not easily and not without someone (hopefully) noticing in time to ban his sad robotic behind.
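To make the cooldown idea concrete, here’s a minimal sketch in Python. It is not the board software’s actual code: the class name, the 30-second value and the idea of keying on a poster ID (such as an IP address) are all assumptions for illustration.

```python
import time

COOLDOWN_SECONDS = 30  # hypothetical value; the real timer length isn't stated


class PostCooldown:
    """Remembers when each poster last posted and enforces a wait between posts."""

    def __init__(self, cooldown=COOLDOWN_SECONDS, clock=time.monotonic):
        self.cooldown = cooldown
        self.clock = clock      # injectable so the timer can be tested
        self.last_post = {}     # poster_id -> time of last accepted post

    def try_post(self, poster_id):
        """Return True and record the post if the poster has waited long enough."""
        now = self.clock()
        last = self.last_post.get(poster_id)
        if last is not None and now - last < self.cooldown:
            return False        # still cooling down: reject the post
        self.last_post[poster_id] = now
        return True
```

A script hammering the post form would get one post through and then be rejected until the timer runs out, which is what keeps a flood from burying everything else.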

Likewise, for spam that’s intermittent or random, the board software accepts a list of words to filter. That is to say, if your post contains one of the magic words, you’ll automatically get banned for a short while. This deters a lot of the automated spam, as bots are usually not creative enough to change up their posts.
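A word filter with a short automatic ban could look roughly like the sketch below. The filtered phrases, the one-hour ban length and the function name are made up for the example; the real list and ban duration aren’t given here.

```python
import time

FILTERED_WORDS = ["cheap pills", "one weird trick"]  # made-up example entries
BAN_SECONDS = 3600                                   # hypothetical "short while"

bans = {}  # poster_id -> timestamp at which the ban expires


def check_post(poster_id, text, now=None):
    """Return 'ok' if the post may go through, 'banned' otherwise."""
    now = time.time() if now is None else now
    if bans.get(poster_id, 0) > now:
        return "banned"                      # an earlier ban is still active
    lowered = text.lower()
    if any(word in lowered for word in FILTERED_WORDS):
        bans[poster_id] = now + BAN_SECONDS  # trip the filter: ban for a while
        return "banned"
    return "ok"
```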

As an aside, a few years ago we got spam of the worst sort, posting illegal content every so often. The solution there was both to be aggressive with the filters and to silently ban certain file names. What do I mean by that? Well, I mean that certain file names would not trigger an automatic ban but instead end the connection to the site. To a bot that’s spamming it would appear it was successful in its mission and did not need to change up its tactics. Sometimes subtlety is key to avoid prolonged campaigns of harassment and spam.
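The ‘silent’ part is the interesting bit, so here’s a small sketch of the decision only. The patterns and the function name are invented for illustration, and the real check may not use wildcard matching at all.

```python
import fnmatch

# Made-up patterns; the actual blocked file names are (sensibly) not public.
BLOCKED_FILENAME_PATTERNS = ["promo_*.jpg", "gallery-??.png"]


def handle_upload(filename):
    """Return 'drop' to end the connection quietly, or 'accept' to continue.

    'drop' deliberately means no ban and no error page, so the sender gets
    no signal that a filter was involved.
    """
    name = filename.lower()
    for pattern in BLOCKED_FILENAME_PATTERNS:
        if fnmatch.fnmatchcase(name, pattern):
            return "drop"
    return "accept"
```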

Finally, there’s another tool that’s a relatively new addition: captcha. If you’ve filled out a form on a website in the last couple of years you’ve likely seen Google’s reCAPTCHA, which forces you to type out a certain message or click on a series of images. This is by far the most effective way we have of fighting automated spam these days, as scripts and bots still have problems resolving these human-readable glyphs.

I’ve mentioned in passing that I’d set up a captcha system on the site in case it was ever really needed. It’s a free, libre and open source implementation that doesn’t depend on any third-party servers or anything. This is primarily important for privacy-conscious people like me who don’t trust corporations like Alphabet with data. This system has been mostly dormant, only really ‘active’ on the otherwise locked /coriander/.

I’ll be honest: THP isn’t really the target of that much spam. It could be far worse. On average, every couple of months an old thread might get bumped, or a thread past the auto-sage limit might get dedicated spam every day for a week or two. It was a case of the latter last month that drove me over the edge. I didn’t want to deal with deleting links in Russian and promises of more powerful erections in threads that no one else saw. It’s just needless work.

So, the solution? Well, implementing the captcha system into old threads. As of a while ago, threads that haven’t been bumped in a few months require users to fill out an input box when posting. If you get the captcha wrong, you can’t post. Once the thread is bumped, however, the captcha requirement fades away. In other words, if a story comes back from the dead, writers and voters won’t be burdened with a captcha every time they post, just for the first post that brings the thread back from the dead.
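The rule boils down to a date comparison. Below is a simplified sketch, assuming a 90-day cutoff as a stand-in for ‘a few months’ and assuming every accepted post bumps the thread (sage and the auto-sage limit are ignored). The names are hypothetical.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=90)  # hypothetical stand-in for "a few months"


def needs_captcha(last_bump, now):
    """A thread needs a captcha if it hasn't been bumped recently."""
    return now - last_bump > STALE_AFTER


def try_post(thread, captcha_ok, now):
    """Accept or reject a post; thread is a dict with a 'last_bump' datetime."""
    if needs_captcha(thread["last_bump"], now) and not captcha_ok:
        return False              # old thread and the captcha was wrong or missing
    thread["last_bump"] = now     # the bump makes the thread 'fresh' again
    return True
```

Once the first post goes through, the thread’s bump time is recent again, so follow-up posts skip the captcha without any extra bookkeeping.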

To implement this I had to change up some of the board’s template and create a special file that generates a captcha on demand. When it’s detected that a thread is old (a value is passed from the web page), user-side scripting makes a request that then generates the output in the post box. It’s a simple concept but getting it just right did take some thinking. In the future I may change the actual captcha values into something thematic like a series of character names but since this should be a fairly uncommon inconvenience I was in no rush to get too carried away.  
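For a rough idea of what ‘generates a captcha on demand’ involves on the server side, here’s a bare-bones sketch. It is not the open source captcha implementation the site uses: the function names, the six-character answers and the in-memory store are assumptions, and the step that renders the answer as a distorted image is left out entirely.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits
pending = {}  # token -> expected answer (a real site would persist and expire these)


def new_captcha(length=6):
    """Create a challenge; return a token for the form and the answer to render."""
    answer = "".join(secrets.choice(ALPHABET) for _ in range(length))
    token = secrets.token_hex(16)
    pending[token] = answer
    return token, answer  # the answer would be drawn as an image, never sent as text


def verify_captcha(token, response):
    """Check a response once; the token is discarded whether or not it matches."""
    expected = pending.pop(token, None)
    return expected is not None and expected == response.strip().upper()
```

The page script would request a new challenge only when the thread is flagged as old, then submit the token alongside whatever the user typed.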

The spam I mentioned earlier? It had been ongoing for about two weeks, with me signing in more than once a day to clear it up in old threads. Now? It’s been completely eradicated and I haven’t had to deal with weird services or images featuring questionable content. Win-win-win. I think anon will forgive me for having to input a captcha to bump an old thread every once in a full moon.

I hope to be able to talk about something more exciting than spam and captcha sometime soon. I don’t have an ETA but I’m happy with some of the decisions I’ve made and the work-in-progress implementation of things. But, yeah, soon™.

I’ll likely write up something a little more heady and related for the higher tier patrons in a week or so in lieu of a fully-functional demo. Still, soon.

Until next time, take it easy!

