ZFS freezes on writes made via NFS, CIFS and Vim #15684
Unanswered
brayrobert201
asked this question in
Q&A
Replies: 1 comment
-
Workaround update: Disabling sync on the affected filesystems works. That isn't really a fix (although I don't care about losing a few seconds/minutes of transactions), but it's a pointer in the right direction. It looks like it's failing to flush the cache to disk. If I run sync manually while it's in that state, it freezes. No reason at all to believe it's a physical issue.
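Since the workaround hinges on the `sync` dataset property, it may help to keep track of which datasets currently have it disabled. Below is a minimal sketch, assuming the `zfs` CLI is on the PATH and that `zfs get -H -o name,value sync` prints one tab-separated name/value pair per line. The function names are made up for illustration and are not part of any ZFS tooling.

```python
from __future__ import annotations

import subprocess


def parse_sync_values(output: str) -> dict[str, str]:
    """Parse tab-separated `name<TAB>value` lines into a dict."""
    values = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        name, value = line.split("\t", 1)
        values[name] = value.strip()
    return values


def get_sync_values(root: str) -> dict[str, str]:
    """Query the `sync` property for `root` and its children.

    Hypothetical helper: it only shells out to `zfs get`.
    """
    result = subprocess.run(
        ["zfs", "get", "-H", "-r", "-o", "name,value", "sync", root],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_sync_values(result.stdout)


def disabled_datasets(values: dict[str, str]) -> list[str]:
    """Return the datasets whose sync property is `disabled`."""
    return sorted(name for name, value in values.items() if value == "disabled")
```

Something like `disabled_datasets(get_sync_values("tank"))` would list them, where `tank` is a placeholder pool name. Setting the property back with `zfs set sync=standard <dataset>` restores the default behaviour once the underlying problem is found.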
-
Hi All
I'm honestly not sure what data to attach, so advice would be lovely.
Basically, I've got an eight-drive raidz2 pool on spinning rust that was created on TrueNAS and moved to Ubuntu Server 22.04.
Running on an old IBM fileserver with 48GB of RAM. Model is Xyratex HS-1235T with 12 SAS/SATA bays. Eight are used.
48 usable TB with about 6TB free.
The ZFS version in use is 2.1.9
The kernel in use is one of the 5.15 ones
Drives are healthy as far as I can tell. I did have an issue with a sudden power cut needing me to do a week long re-scan and import a while back. It may be related to the cause, but no idea what, exactly.
There's no separate ZIL/SLOG device. The zpool only contains storage devices.
They were created using a beta version of TrueNAS SCALE, possibly around 22.02 Angelfish (when it was first considered to be not a terrible idea). I guess the ZFS version was in the 2.0.X range. It was first set up about 2-2.5 years ago.
They're used as media storage drives with Syncthing. Typical RW speed is about 100-200MB/s. That's fine.
The issue is it freezes without giving any actual error. Just all of a sudden the filesystem freezes until I reboot.
I can do all the large reads/writes I like.
However, if I try to share a volume via NFS or CIFS, I can read all I like, but the moment I do a single write, everything freezes until the server is hard power-cycled.
This includes simply using the touch command.
I see something similar when using vim. If I use vim to edit a file, as soon as I attempt to write, the filesystem freezes.
I'm quite able to echo data into a file, so it's something about the swap file or whatever else vim is doing when it writes.
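One plausible difference between the two cases (a guess, not a diagnosis): a shell redirect just writes and closes the file, while Vim by default calls fsync after writing (its 'fsync' option). A rough way to check whether fsync is the trigger is to do the write and the fsync in a child process and give up waiting after a timeout. This is only a sketch; `fsync_completes` and the `.fsync_probe` file name are invented for illustration, and running it against an affected dataset may well trigger the freeze itself.

```python
import os
import subprocess
import sys
import tempfile

# Code run in the child process: write a few bytes, then force them to disk.
_CHILD = """
import os, sys
fd = os.open(sys.argv[1], os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
os.write(fd, b"probe")
os.fsync(fd)
os.close(fd)
"""


def fsync_completes(directory: str, timeout: float = 10.0) -> bool:
    """Return True if a write followed by fsync finishes within `timeout`."""
    path = os.path.join(directory, ".fsync_probe")
    child = subprocess.Popen([sys.executable, "-c", _CHILD, path])
    try:
        child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # A process stuck in uninterruptible I/O usually cannot be killed,
        # so report the hang and leave the child alone.
        return False
    if child.returncode != 0:
        return False
    os.remove(path)
    return True


if __name__ == "__main__":
    # Demo against a scratch directory; pass a dataset mountpoint instead
    # to probe the filesystem in question.
    with tempfile.TemporaryDirectory() as scratch:
        print(fsync_completes(scratch))
```

Doing the work in a child means the calling shell stays usable even if the fsync never returns.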
As long as I don't add a single extra share, Syncthing is fine. If I add another share, I'm back in powercycle land.
If I mount the filesystem remotely using SSHFS, everything is just fine. Not a single problem.
It still has some detritus filesystems dating back to the initial TrueNAS SCALE setup that I can't seem to remove, like the k3s ones. Attempting to remove them seems to produce a freeze, but I haven't fiddled too much with that.
No snapshots being taken.
The host will happily continue running for months, as long as I don't do something silly like attempt to fiddle and use Vi or whatever.
This has been driving me mad. I haven't seen any evidence of anything vaguely similar on any other ZFS system I'm running. I haven't come across anyone who has anything similar. The triggers are specific enough that I think it must be a bug somewhere along the way with something, but the closest thing I've had to an actual error is dmesg giving me a message saying that a ZFS thread has hung.
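For anyone wanting more than the dmesg message: hung-thread warnings generally correspond to tasks sitting in uninterruptible sleep (state `D`), and on Linux those can be listed from `/proc`. A minimal sketch, assuming the usual `pid (comm) state ...` layout of `/proc/<pid>/stat`; the helper names are made up, and it only looks at top-level `/proc/<pid>` entries, so individual threads of a multi-threaded userspace process are not covered.

```python
from __future__ import annotations

import os


def parse_stat(line: str) -> tuple[int, str, str]:
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren].strip())
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root: str = "/proc") -> list[tuple[int, str]]:
    """List (pid, comm) for every task currently in state D."""
    tasks = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited mid-scan or the line was unreadable
        if state == "D":
            tasks.append((pid, comm))
    return sorted(tasks)


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```

Running this from a second SSH session right after triggering the freeze should show which tasks are stuck, which is useful detail to include in a bug report.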
Any advice or assistance is highly appreciated. Recreating the pool and getting the data transferred back (from another city, as I don't have anything that'll store this much data nearby) is on about the same level of desirability as having another child.
It's possible I should have reported this as a bug, but I really have no idea if it actually is a bug with this, or simply some alignment of the planets of my setup that causes an issue. If it's a bug, it's such a rare one it feels like it would never be worth anyone's time to resolve.