SakeTami
Touhou-Project.com


Archived Secrets Part 1

Hey guys, I’ll be talking this time around about how the archiving system works on THP.  

The basic philosophy of KusabaX was that every board was an entity unto itself. This meant having potentially different CSS/themes, board-specific moderators and other stuff like that. Interoperability with other boards on the site wasn’t really emphasized. The built-in archival system is no exception. It’s off by default and every board can have its own directory where threads go after they die.  

Our philosophy as a site is a bit different. The boards are all interrelated by the fact that they’re all touhou-themed. We have fiction on the boards, and threads aren’t really the inane chatter you’d find on other imageboards; they’re mostly “important” stuff like stories. Saving and cataloging those threads is therefore important. Even when something is bumped off, we usually wish to preserve it.

Whenever a thread is bumped off the face of THP, it is stored in a subdirectory of the board. It’s not really ready to be an archived copy, though. Things like the postbox need to be removed and all the images made into just thumbnails (to save bandwidth). This was accomplished by a set of scripts run by Kapow in the directory. Suboptimal? Yeah, definitely. But as he already was archiving the earliest /jp/ threads he took it upon himself to be the custodian of that aspect of the site.  
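As a rough illustration of that cleanup step, here is a minimal Python sketch. The markup is an assumption: the `postform` id, the `/src/` and `/thumb/` paths and the function name are all hypothetical, and the actual scripts Kapow ran aren't shown in this post.

```python
import re

# Hypothetical markup: the real THP/KusabaX templates may look different.
# Matches the whole posting form so it can be dropped from the archived copy.
POSTBOX_RE = re.compile(r'<form id="postform".*?</form>', re.DOTALL)
# Matches a link to a full-size image that wraps a thumbnail <img> tag.
FULL_IMAGE_LINK_RE = re.compile(r'<a href="[^"]*/src/[^"]+">(\s*<img [^>]*>\s*)</a>')


def clean_archived_html(html: str) -> str:
    """Strip the postbox and keep only thumbnails, without full-size links."""
    html = POSTBOX_RE.sub("", html)
    # Replace each full-image link with just the thumbnail it wrapped.
    html = FULL_IMAGE_LINK_RE.sub(r"\1", html)
    return html


sample = (
    '<form id="postform" action="/board.php"><textarea></textarea></form>'
    '<a href="/th/src/123.png"><img src="/th/thumb/123s.png"></a>'
    '<p>story text</p>'
)
cleaned = clean_archived_html(sample)
print(cleaned)  # <img src="/th/thumb/123s.png"><p>story text</p>
```

Regular expressions are brittle on real-world HTML, so a proper parser would be the safer choice for anything beyond a one-off cleanup.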

As an aside, part of the reason that threads are then moved from the archive directory back to the main directory is that there was a wish to make all reflinks and post quotes work correctly. If you click on >>11 it should take you to that post. If the thread was in the archive directory, it’d give you an error. Makes sense in a way but it does create a major issue: manipulating only the archived threads later on is hard when they’re in the same directory as active threads. And there’s some un-niceness with regards to where thumbnails are stored, which is a mild headache the few times you do have to mess around with them.
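To make the reflink problem concrete, here is a small Python sketch. It assumes reflinks are written as absolute paths into the main thread directory; the `/th/res/` and `/th/arch/res/` paths, the link markup and the function name are hypothetical and only there for illustration.

```python
import re

# Assumed reflink shape: an absolute path to a thread page plus a post anchor.
REFLINK_RE = re.compile(r'href="(/[^"#]+\.html)#(\d+)"')


def broken_reflinks(html: str, served_paths: set[str]) -> list[str]:
    """Return reflink targets in html that point at pages the site doesn't serve."""
    return [
        f"{path}#{post}"
        for path, post in REFLINK_RE.findall(html)
        if path not in served_paths
    ]


page = '<a href="/th/res/10.html#11">&gt;&gt;11</a>'

# Thread only exists in the archive directory: the link target is missing.
in_archive = broken_reflinks(page, {"/th/arch/res/10.html"})
# Thread moved back to the main directory: the link resolves again.
in_main = broken_reflinks(page, {"/th/res/10.html"})

print(in_archive)  # ['/th/res/10.html#11']
print(in_main)     # []
```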

After a thread was cleaned up, it was then fed into the database behind the storylist. Again, more scripts filled in the basic data. It wasn’t as efficient as having a front end or more automatic and intelligent systems. It basically required a lot of human interaction to make sure that each thread really was active or on hiatus, or was marked as a spinoff or whatever else. The backend built for holding that information was thorough but unwieldy and for the most part I never really interacted with it.

The common theme here is that everything required a lot of man hours to keep running smoothly, and keeping up with things proved challenging. As people moved on and interest in the site waned, it took longer and longer for the storylist to be updated.

I realized a long time ago that I’d have to approach the problem with another solution, one that’s easier to manage, and I still have plans to do just that. As the scale of the overhaul is big and I’ve had other priorities, most of it remains just theoretical. In the past few weeks, though, I’ve dedicated some of my time to at least putting on a few band-aids here and there.

As of today and the latest batches of updates I’ve made live on THP, it’s no longer necessary to process each thread with a script or manually. Threads that get bumped off should, going forward, automatically get rebuilt in such a way that they’re ready to be moved from the archived dir without any fuss. For the time being, I’m still letting them go into their archive directory before moving them manually, mostly to check that things are working alright. If it works smoothly in the next couple of weeks I’ll just merge it outright with the main directory.  

This took some wizardry so as to create an exemption to the normal thread-generation process and then create a template against which the data would be built up. I had to touch some of the messiest parts of the code on THP to do that. How messy can it be? Well, there are actual comments in the files stating that a partial or full rewrite was intended to be done at a future date. As the project died, that never happened. In the course of my programming, I also fixed a 12-year-old bug that wasn’t letting thumbnails be copied over under certain circumstances. I also removed some unnecessary file generation and tweaked some of the javascript so that it works well on future archived threads.

All in all, a few days’ work.  

The actual hard part was dealing with the threads that were already in the archive directory but hadn’t been processed. The old scripts I had didn’t cut it as there were various idiosyncrasies in threads because of the code evolution over the last year. It took me quite some time to figure out that some threads were needlessly duplicated and, with more high-level wizardry, I created a few scripts that scanned directories and sorted out which threads I could safely delete. We’re talking about hundreds of threads over several boards so checking one by one would have taken a stupid amount of time!
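For a sense of what a duplicate scan like that can look like, here is a minimal Python sketch. It is not the script described above: it assumes a duplicate is a byte-identical file that also exists in the main directory, and the directory layout, file names and function names are hypothetical. Real threads may differ in small ways, in which case a looser comparison would be needed.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Hash a file's bytes so identical copies compare equal."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def safe_to_delete(archive_dir: Path, main_dir: Path) -> list[Path]:
    """List archived threads that have a byte-identical copy in main_dir."""
    main_digests = {file_digest(p) for p in main_dir.glob("*.html")}
    return sorted(
        p for p in archive_dir.glob("*.html") if file_digest(p) in main_digests
    )


# Tiny demo on throwaway directories (hypothetical layout).
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp, "arch")
    main = Path(tmp, "res")
    archive.mkdir()
    main.mkdir()
    (main / "10.html").write_text("thread ten")
    (archive / "10.html").write_text("thread ten")      # duplicate
    (archive / "12.html").write_text("thread twelve")   # only in the archive
    duplicate_names = [p.name for p in safe_to_delete(archive, main)]

print(duplicate_names)  # ['10.html']
```

The function only reports candidates; deleting them is left as a separate, manual step so the list can be reviewed first.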

Once I had the threads reduced to a more manageable number, I had to process them into the form I wanted. It took me a whole day of non-stop work to figure out enough common denominators to do a “good enough” job of processing these various threads. There are a few small errors in some of them but, hey, it’s the kind of thing that you can’t really notice if you’re just reading the thread for the story. I consider it a big success, and with that done I was able to get to the most tedious part of this whole process: actually adding threads to the storylist.

I’ve rambled on enough for today so I’ll leave the details about that until next time. I’ve got another feature planned relating to the story archives that I’ll hopefully have time to work on over the weekend or next week and I hope to also be able to talk about that.

I know I said that I would be looking to add some new sexy features for users and I haven’t forgotten about that. I’ve just de-prioritized it in order to make things a little less crazy first. The end result of all of this is that the storylist has been updated and the work I’m doing may well make its future upkeep less of a drag. Laying down foundations or, as the case may be, putting a coaster under a wobbly chair can have a bigger impact down the line.

Oh and there have also been a few bug fixes and quality of life changes that made it in. The “improved general stability” equivalent of my patch notes!

Until next time, take it easy!

Comments

bless

Benjamin Oist

