Thanks @vshabanov. Someone mentioned above that you may be looking to move servers? Is that the current plan, or are you hoping today was a one-off event with the current servers?
Agreed. The recent outages have not been the norm. @vshabanov has provided a great service over the last 10 years.
I moved to new servers this February. The old ones were 6 years old and were occasionally freezing (1-3 times a year). Unfortunately, the new ones are not completely new (about 2-3 years old), and they have had various issues over the last few months (one halted a few times, another had memory-related issues).
What happened today was actually the same issue I had on the previous servers: a freeze inside zfs-fuse (a ZFS filesystem implementation for Linux). The server is alive, but all disk I/O is halted (easily fixed with a reboot). It looks like a software problem (in general, zfs-fuse does not seem to be maintained much nowadays), so I will probably switch to a better-maintained filesystem.
Today's issue was especially visible because it happened on a frontend load balancer, making bazqux.com inaccessible (beta.bazqux.com was still available). Previous ones usually led to feed updates being stopped.
I've just checked Hetzner's website, and they now have a few new server types and no longer offer the models I switched to. So there's a good chance that these ones will be really new (not 2-3 years old).
So the current plan is to move to new servers again and switch to a different filesystem. I hope it will help to mitigate the issues that have been appearing regularly over the last few months (and the ZFS issues of the last few years).
Hey @vshabanov, I just learned about #hugops, a way of sending empathy and appreciation to the real people that run software. Just wanted to send this to you and appreciate your ongoing efforts to run a service I use and love. I understand completely why the downtime happened at the worst possible time and know that it was likely stressful to be in that position. And it sounds like you have a good path toward fixing the issue so that it won't impact you or us. Good luck on making that happen!
Just wanted to share that the downtime didn't impact me very negatively. I use it for personal use, following sites and newsletters mostly for entertainment.
Couldn't agree more.
I'm ashamed that people react this way to @vshabanov.
It's not his day job. He runs a great, cheap service. So what if his service is interrupted once in a while? Don't whine about it, or go find yourself another service.
BazQux now runs on new servers. And these are really new, with new drives. I also switched from the buggy zfs-fuse to standard ext4 (there are probably better filesystems, but ext4 is a safe default from a stability point of view). As a nice bonus, the new servers have faster CPUs (almost 1.5 times faster), so the reader should be faster now.
What kind of bugs did you experience with zfs?
It was zfs-fuse, not a proper ZFS (which isn't available in Debian). It worked well until 2021 (maybe until some Debian upgrade). But then it started freezing every few months on random servers. Everything else works fine, but any access to the ZFS volume freezes (for example, it could still be possible to `cd some_dir_on_zfs`, but `less some_file_on_zfs` freezes somewhere in the kernel).
Initially, I thought that it was some kind of hardware failure. But I've got the same issue on newer hardware and the latest Debian. So I think it's a software bug, and I decided to move to the default ext4 filesystem, as it should be the most stable from a software point of view.
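
For anyone wanting to catch this kind of freeze early (server alive, but file I/O on one volume never returning), here is a minimal sketch of a watchdog probe. It is not what BazQux actually runs; the function name `probe_io`, the probe file name `.io_probe`, and the 5-second timeout are all assumptions made for illustration. The idea is to do a small write plus `fsync` in a background thread and treat a missed deadline as a hang.

```python
import os
import threading


def probe_io(directory: str, timeout_s: float = 5.0) -> str:
    """Return "ok", "error", or "hung" for a small write+fsync in `directory`.

    Hypothetical helper: the name, probe file, and timeout are illustrative.
    """
    # Start pessimistic: if the worker never finishes, the status stays "hung".
    result = {"status": "hung"}

    def worker() -> None:
        probe_path = os.path.join(directory, ".io_probe")
        try:
            with open(probe_path, "wb") as f:
                f.write(b"x")
                f.flush()
                # fsync forces the write down to the filesystem, so a page
                # cache hit cannot hide a stalled volume.
                os.fsync(f.fileno())
            os.remove(probe_path)
            result["status"] = "ok"
        except OSError:
            # Missing directory, permission problem, read-only volume, etc.
            result["status"] = "error"

    # Daemon thread: a stuck probe should not keep the watchdog from moving on.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)  # wait at most timeout_s seconds
    return result["status"]


if __name__ == "__main__":
    # Example: probe the current directory.
    print(probe_io("."))
```

A caveat on the design: a thread blocked inside the kernel cannot be cancelled from Python, so after a "hung" result the stuck thread stays behind and the process itself may not exit cleanly. The probe only tells you that something is wrong so that an alert or reboot can be triggered from elsewhere.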