The future of Furarchiver
Added 2025-02-23 15:58:58 +0000 UTC
Don't panic, Furarchiver stays, even though it is still a money sink.
As most of you are probably aware, Furarchiver doesn't actually archive anything. The archiver itself is run by someone anonymous. Furarchiver is merely an accessibility tool for that service. I've always made this fact very clear on the "About" page.
This service does occasionally have problems archiving stuff, but an outage has never lasted this long. New content hasn't been added for months now.
You may ask yourself why I use a 3rd party service at all. The reasons are simple:
1. I'm lazy and don't want to do a task that someone already did
2. I get access to material that predates Furarchiver
3. I can pretend to be just a data forwarding agent
The first point is in my interest, the second in yours, and the third is for both of us. As a data forwarding agent, I don't have to comply with removal requests for content any more than Cloudflare would, because I physically can't remove data from a backend I don't own. Because of that, I never coded a blacklist feature into Furarchiver.
Going forwards, I will probably have to ditch the archiver service and write my own, and only use the archive for old content. This does open up new possibilities though. The archive only keeps files and (rather poorly) the descriptions. That's all I get from it. The account name and upload date I can extract from the file name. By writing my own, I can incorporate additional data into the system, including:
- The submission id
- Tags
- Full description (without the truncated links it has now)
- Comments (I will likely not do this because that would mean revisiting submissions all the time)
I also hope to see a reduction in broken files this way.
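To make the list above a bit more concrete, here is a minimal sketch of what one record in the new system might look like. The class name, the field names, and the idea of storing everything in one record are assumptions made up for illustration; nothing here is the actual Furarchiver schema. Comments are left out because I will likely not collect them.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Submission:
    """Hypothetical metadata record; every name here is an assumption."""

    submission_id: int                # the incremental ID from the view URL
    account_name: str                 # currently extracted from the file name
    uploaded_at: datetime             # currently extracted from the file name
    description: str = ""             # full text, links not truncated
    tags: list[str] = field(default_factory=list)
    file_name: Optional[str] = None   # None if no file has been stored yet

    def has_file(self) -> bool:
        """True when a file is already stored, so only metadata is needed."""
        return self.file_name is not None


# Example with made-up values:
example = Submission(
    submission_id=1,
    account_name="someartist",
    uploaded_at=datetime(2005, 12, 4, tzinfo=timezone.utc),
    tags=["example"],
)
```

Keeping `file_name` optional means a record can exist for a submission whose file is still missing or broken.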
There are currently about 60 million submissions on FA. Furarchiver already has 50 million of them, just from users downloading artists. When the new archiving service is written, I will likely run two instances: one that grabs the latest content with a delay of a few minutes, and another temporary one that starts at submission 1 (yes, you can visit it just by changing the ID in the URL: https://www.furaffinity.net/view/1/ because the IDs are incremental). It will take ages to catch up. Even if I were to download one submission every second, it would take about 2 years to reach today's count. Luckily, I don't have to download files that I already have, I only need the metadata, so the traffic caused on FA will be limited mostly to the view page. 60 million is the upper bound, since some submissions have been deleted by now.
I may also make the temporary archiver go backwards towards submission 1 instead of counting up from it.
I would prefer to just buy a database dump of the submissions from FA, but I doubt they would do that.
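The "2 years" figure is simple arithmetic, sketched below. The function names are made up for illustration, and the request rate is only the one-per-second example from the text, not a decided crawl speed.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def crawl_days(total_submissions: int, requests_per_second: float) -> float:
    """Days needed to visit every submission once at a constant rate."""
    return total_submissions / requests_per_second / SECONDS_PER_DAY


def view_url(submission_id: int) -> str:
    """View page URL for an ID, following the URL pattern quoted in the text."""
    return f"https://www.furaffinity.net/view/{submission_id}/"


days = crawl_days(60_000_000, 1.0)
print(f"{days:.0f} days, or about {days / 365:.1f} years")
print(view_url(1))
```

At one request per second this comes out to roughly 694 days, so a bit under 2 years. Doubling the rate halves the time, but also doubles the load on FA.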
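Either direction is the same loop with a different range. A minimal sketch, assuming a hypothetical set `done` of IDs whose metadata is already stored (how that set would really be tracked is not decided):

```python
from typing import Iterator, Set


def backfill_ids(newest_id: int, done: Set[int], descending: bool = True) -> Iterator[int]:
    """Yield submission IDs that still need a metadata visit.

    newest_id:  highest ID known when the backfill starts (assumption)
    done:       IDs that already have metadata and can be skipped
    descending: True walks from newest_id down to 1, False walks 1 upwards
    """
    ids = range(newest_id, 0, -1) if descending else range(1, newest_id + 1)
    for submission_id in ids:
        if submission_id not in done:
            yield submission_id


# Tiny example: IDs 2 and 4 are already done.
print(list(backfill_ids(5, {2, 4})))                    # newest first
print(list(backfill_ids(5, {2, 4}, descending=False)))  # oldest first
```

Going backwards would fill in the newer submissions first, while counting up from 1 fills in the oldest ones first.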
In any case, where does this leave us?
1. As of a few months ago, no new content is being archived by the backend for an unknown reason. It may have to do with the redesign, which could be causing it to fail to parse the page contents.
2. Waiting for the backend to get fixed becomes a less viable option as time passes.
3. Furarchiver is obviously not going away. Considering it shoves terabytes worth of traffic around every day, it is still in high demand. In case you're wondering, you guys are currently paying for between 2/3 and 3/4 of the server costs, and I cover the difference. Considering how inactive I am on here, it's surprising that this way of financing the service almost works out.
4. A new backend has to be created, which will take a few weeks considering I have a full-time job and have never done this sort of thing before.
5. It will take ages to crawl through the existing submissions.
6. Once Furarchiver is up to date and the new content archiver is running, the old archiving backend can be retired.
An added bonus of this is that this entire "scanning for artists" thing goes away because the new backend can push content to Furarchiver rather than it having to pull the content from somewhere.
A side effect is that the Patreon benefits will go away. These benefits are exclusively for the archiving process (archiving when otherwise not allowed, and prioritizing your request), but these things will no longer be necessary in the future. I don't know what I will do instead or how to consolidate the various tiers. The only thing left would be to gate the zip download or limit the number of submissions that can be viewed, but I don't want to do that.
Comments
Ok then.
Noah Bangs
2025-03-13 06:32:39 +0000 UTC
You won't notice a change. The new archiver will be a background service that continuously pulls files from FA as they're uploaded and adds them to the local cache.
Draconics Inc
2025-03-08 20:17:55 +0000 UTC
So, where will we find the new archiver when you write it?
Noah Bangs
2025-03-08 19:26:36 +0000 UTC
Same here, the service deserves to exist, the perks were never the point for me.
Latency
2025-02-23 19:01:55 +0000 UTC
I'll continue supporting it out of principle. As it stands I don't even use the perk anymore. Being able to queue while the queue was full was handy, but that's not really the reason I signed up. I used the service a whole lot and felt that it was worth throwing a few bucks into in exchange for what I got out of it. I'm sure many others feel the same way.
Firestar
2025-02-23 17:21:19 +0000 UTC