bcachefs

Replication update

Added 2017-05-15 08:12:50 +0000 UTC

As I think I mentioned a while ago, the last big item left for replication was IO error handling - that is, handling IO errors without just going read-only when we've got another replica to read from (for reads), or when only some of the replicas of a replicated write failed (for writes).

The really tricky one was btree node write error handling, since on a btree node write error we have to note somewhere that we can no longer read from the replica that failed, and we also have to note in the superblock that the set of drives we need in order to mount has changed (because we're now degraded).
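To make the two bookkeeping steps concrete, here's a rough sketch in Python. This is not the actual bcachefs code: `Superblock`, `NodePointers`, `handle_node_write` and the `degraded_devices` field are all hypothetical names made up for illustration, and a real implementation also has to persist both updates safely.

```python
from dataclasses import dataclass, field


@dataclass
class Superblock:
    # Hypothetical field: devices we can no longer rely on for a full mount.
    degraded_devices: set = field(default_factory=set)


@dataclass
class NodePointers:
    # Hypothetical: the devices that hold a valid copy of one btree node.
    replicas: set


def handle_node_write(ptrs, failed, sb):
    """Record the outcome of a replicated btree node write.

    `failed` is the set of devices whose write returned an error.
    Returns True if at least one replica was written successfully.
    """
    failed_here = ptrs.replicas & failed
    surviving = ptrs.replicas - failed
    if not surviving:
        # No good copy left; the caller has to treat this as fatal.
        return False
    # Step 1: stop reading this node from the replicas that failed.
    ptrs.replicas = surviving
    # Step 2: note in the superblock that we are now degraded.
    sb.degraded_devices |= failed_here
    return True
```

The point of the sketch is just that a partial failure leaves the node usable but changes two separate pieces of metadata, while a failure of every replica can't be papered over.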

Well, this one is done, and passing torture tests. It's now possible to set up a multi-device filesystem with replication, fail all IO to one of the devices, and keep using the filesystem without it going RO. Big milestone :)

I'm still finishing off the IO error handling for normal data reads and writes (improving the read retry code to handle retrying from a different replica, making the write error code smarter) - but this part is a lot less tricky than the btree write error path.
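The idea behind retrying a read from a different replica can be sketched like this. Again, this is a simplified illustration and not the bcachefs read path: `read_with_retry` and the `read_fn` callback are hypothetical, and IO errors are modelled as `OSError`.

```python
def read_with_retry(replicas, read_fn):
    """Try each replica in order; return data from the first that succeeds.

    `replicas` is an iterable of device ids and `read_fn(dev)` is a
    caller-supplied function that returns the data or raises OSError.
    """
    last_error = None
    for dev in replicas:
        try:
            return read_fn(dev)
        except OSError as err:
            # Remember the error and fall through to the next replica.
            last_error = err
    # Only report an error upward once every replica has failed.
    raise OSError("all replicas failed") from last_error
```

With two replicas where the first device errors out, the caller just gets the data from the second one and never sees the failure.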

Replication is getting really close to actually being useful! We had another big milestone the other day too: one user accidentally reformatted one of the devices in his multi-device, replicated filesystem, and didn't lose any data. That was with the filesystem unmounted though - once I finish the remaining error handling, you'll finally be able to yank a device from a multi-device filesystem while it's in use, without any interruption in service and without userspace seeing any errors.

Baby steps...