Touhou-Project.com

Match Three

Added 2020-07-05 15:03:34 +0000 UTC

Hey guys, hope you’re doing alright. Instead of setting it up, I’ll let this GIF do the talking for me:

Now you may be thinking, “Hey, autocomplete suggestions aren’t a topic worth talking about. I’ve seen them on plenty of sites.” Yeah, sure, they’re common enough. But just because something’s not rare doesn’t mean it’s not complex and worth thinking about.

I’ve wanted to have tags for stories and be able to use them to filter entries for a long time. I’ll definitely be writing up another, more comprehensive, post on the topic. For now, it’s more important to understand that such a system has been planned and implemented to a considerable extent. Because the groundwork was put into place, I was able to adapt some of the work towards this.

Quite obviously, if you’re going to have suggestions, you need somewhere to store them. If you want to keep the list dynamic as things are added, then you need some sort of database (as opposed to including them in a static page). So then there was an early design decision to make: whether to make queries directly to the database or to have some sort of abstraction in place. While the former has the advantage of always delivering the most accurate results, it also has the downside of needing a live connection to the database. That’s not a huge issue on a small site such as ours, but I’d just as soon minimize server-side lookups.

So that leaves the abstraction, which may as well be JSON-formatted text. Getting ahead of ourselves for a moment, the amount of data we’re handling, across all the files, will likely not approach 100 KB for years at the rate we’re going. The largest of those files by a lot, story titles, is currently about 30 KB, and that’s not too much data for even an oldish phone to pull up seamlessly when requested, let alone to parse and keep in memory on a single page.

I had already written up some code to output stored variables as JSON for other things (I may have mentioned it in the past when talking about the sitemap and thread watcher) so I had some idea of what I’d have to code. The devil’s in the details, though, and while I had the basics up in the time it took me to have a cup of tea, it took the better part of a day to get it properly done and bug-free. The JSON isn’t being generated constantly; instead, generation is triggered by adding stuff to the real database. So it needs to play nice with other code elsewhere and be integrated into the management page, among other things. I’ll spare the details, but adding even the slightest bit more complexity to systems that run to hundreds or thousands of lines of code requires some thought and much testing.
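As a rough illustration of that generate-on-write idea (not the site’s actual code, which isn’t shown here), the sketch below uses plain JavaScript with an in-memory array standing in for the SQL table. The names `tagTable`, `addTag`, `regenerateTagJson` and `readTagJson` are all made up for the example.

```javascript
// Hypothetical stand-in for the real database table of tags.
const tagTable = [];

// The "file" clients would download; here it is just a cached string.
let cachedJson = '[]';
let regenerations = 0;

// Rebuild the JSON from the current table contents.
function regenerateTagJson() {
  const names = tagTable.map((row) => row.name).sort();
  cachedJson = JSON.stringify(names);
  regenerations += 1;
}

// Writes go to the table first, then trigger a rebuild of the JSON.
function addTag(name) {
  tagTable.push({ name });
  regenerateTagJson();
}

// Reads never touch the table; they only return the cached text.
function readTagJson() {
  return cachedJson;
}
```

The point of the arrangement is that reads, which are far more frequent than writes, cost nothing on the database side.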

Nearly everything on THP requires a SQL database entry, from post messages to board names to our storylist. And because a lot of that is 10+ years old and was organized by Kusaba X devs among others following guidelines and standards of the time, there are certain workarounds and compromises in how things are stored. In order to keep the JSON simple and avoid special cases with names and characters, I had to first manipulate many database entries.

Not only did I collate and restructure tables to be fully Unicode/UTF-8 compliant, but I also replaced escaped special characters that no longer needed to be stored that way. This means that when the JSON is generated or added to, it will never require needlessly complex code to convert characters before they pop up in those suggestions above. You may not realize it but, although your browser might display a “ë”, the browser may think it’s “&euml;” or any of a half dozen character codes. Even if you were to input the letter with your keyboard, it may well not return a match unless you input the actual character code.
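To make the mismatch concrete, here is a small sketch of how one visible letter can hide several different strings, and how cleaning the data once up front makes plain comparison work. The `canonicalize` helper and its tiny entity table are assumptions for the example, not the cleanup that was actually run on the database; `String.prototype.normalize` is standard JavaScript.

```javascript
// Three ways the "same" visible letter can be stored.
const precomposed = '\u00EB'; // ë as a single code point
const decomposed = 'e\u0308'; // e followed by a combining diaeresis
const escaped = '&euml;';     // HTML entity left over in old data

// Hypothetical, deliberately tiny entity table covering just this letter.
const ENTITIES = new Map([
  ['&euml;', '\u00EB'],
  ['&#235;', '\u00EB'],
  ['&#xEB;', '\u00EB'],
]);

// Decode known entities, then normalize to the composed (NFC) form.
function canonicalize(text) {
  const decoded = text.replace(/&#?\w+;/g, (match) =>
    ENTITIES.has(match) ? ENTITIES.get(match) : match
  );
  return decoded.normalize('NFC');
}
```

Without a step like this, `precomposed === decomposed` is false even though both render identically, which is exactly the kind of silent non-match described above.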

After some additional time, spread over several days, dealing with corner cases and making sure that nothing was on fire, I was satisfied with the results. The files are generated from different sources and need a few special checks before being written, but the end result is three files with near-identical structures. The only deviation is with the authors file, which segregates tripcodes from names.

Described in plain English, the client-side JavaScript code does the following: it watches the three input fields so that when one is clicked for the first time, it sends a request to the server for the JSON file that matches the name of the field. When received, this data is parsed and then consulted whenever the user types a character into the field. Any matches show up in a list that’s added onto the page dynamically, and that list can be stepped through with the up and down arrows on the keyboard.
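The three pieces of that description (load once, match on each keystroke, move through the list with the arrow keys) can be sketched without any DOM code at all. This is not the site’s script; the function names are invented, and prefix matching with a cap of ten results is an assumption, since the exact matching rule isn’t stated here.

```javascript
// Wrap a loader so the JSON is requested only the first time it is needed.
// In a browser the loader would likely return a fetch() promise; the
// promise itself is what gets cached.
function lazyOnce(loader) {
  let cached;
  let loaded = false;
  return function load() {
    if (!loaded) {
      cached = loader();
      loaded = true;
    }
    return cached;
  };
}

// Case-insensitive prefix match, capped at `limit` entries.
function matchSuggestions(entries, typed, limit = 10) {
  const needle = typed.trim().toLowerCase();
  if (needle === '') return [];
  return entries
    .filter((entry) => entry.toLowerCase().startsWith(needle))
    .slice(0, limit);
}

// Move the highlighted row with the arrow keys, wrapping at either end.
// An index of -1 means nothing is highlighted yet.
function moveSelection(index, key, count) {
  if (count === 0) return -1;
  if (key === 'ArrowDown') return index >= count - 1 ? 0 : index + 1;
  if (key === 'ArrowUp') return index <= 0 ? count - 1 : index - 1;
  return index;
}
```

On a real page these would be wired to the field’s first click or focus, its `input` event and its `keydown` event respectively, with the matches rendered into a list element.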

All in all, it’s about a hundred lines of code and was not very difficult on a conceptual level. The tricky bits were related to treating each field with its own special cases. For example, to only return tripcodes when a user begins typing with a “!” and not to return additional author names if one is already there or if there’s a space somewhere in the field. That last bit is pretty important as the tag search field should return more tag suggestions as soon as you put a space so you can combine tags.
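Here is one way those per-field rules might look, as a sketch only. The data shapes (an authors object with separate `names` and `tripcodes` arrays, a flat array of tags) and every name and sample value below are assumptions made for illustration. The “one is already there” case isn’t modeled separately, and showing all remaining tags right after a space is one possible reading of the behavior.

```javascript
// Hypothetical shape for the authors file: names and tripcodes kept apart.
const sampleAuthors = {
  names: ['Example Author', 'Another Writer'],
  tripcodes: ['!ExampleTrip1', '!OtherTrip22'],
};
const sampleTags = ['adventure', 'comedy', 'drama'];

// Shared helper: case-insensitive prefix match.
function prefixMatches(entries, typed) {
  const needle = typed.toLowerCase();
  return entries.filter((entry) => entry.toLowerCase().startsWith(needle));
}

// Authors: a leading "!" switches to tripcodes, and a space anywhere in
// the field turns suggestions off.
function suggestAuthors(value, data) {
  if (value === '' || value.includes(' ')) return [];
  const pool = value.startsWith('!') ? data.tripcodes : data.names;
  return prefixMatches(pool, value);
}

// Tags: only the last space-separated token is matched, so a space
// starts a fresh round of suggestions. Tags already typed are skipped.
function suggestTags(value, allTags) {
  if (value === '') return [];
  const tokens = value.toLowerCase().split(' ');
  const current = tokens.pop();
  const chosen = new Set(tokens);
  return allTags.filter(
    (tag) => !chosen.has(tag.toLowerCase()) && tag.toLowerCase().startsWith(current)
  );
}
```

Keeping the shared matching in one helper and the quirks in small per-field functions is what lets fields be added or removed later without touching the rest.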

At the same time, all this should be done in a modular and somewhat generic fashion, since there might be a need down the line to add or remove search fields or adapt the code to other parts of the site. I also refused to use any external libraries to keep pages trim and avoid dependency hell. So, even with the other functions on the new storylist at the moment, my script is around 4 KB.

The work put in to make the DB, JSON and script sane and conform to best practices has another advantage beyond peace of mind for me and improved maintainability. There’s an obvious application in repurposing a lot of this on the management side of things. Modifying storylist entries, their tags or their children/parent stories is easier when you can call up relevant data just by typing in a field. Talking about that, however, will have to wait for another time.

There’s still much work to be done on the new storylist and plenty of other common things that you might not really think about as an end-user will require similar amounts of effort. Even as new features come in, I often have to go back and tweak things for better interoperability or to simply add boilerplate stuff like HTML or CSS to the new content. I’m not sure when the storylist will be “done” but it’s shaping up nicely and, time and resources permitting, I’ll be working on it regularly.

The next one of these will likely be about tags and likely be far shorter. It was my intention to post smaller bits of info faster but, even limiting details, some of these posts need more words to explain adequately. Still, look forward to the next one. Until then, take it easy!