Buffer I/O error on dev zd0, logical block ..., async page read #14299
I have been running OpenZFS on a physical machine with Debian 11 for slightly more than a year without any issues. Recently I started a qemu virtual machine via libvirt, and decided to use a couple of zvol volumes for this VM's block devices. As soon as I started using the zvol, I noticed the following messages reported by the kernel:
These appear regularly in short bursts, presumably when the VM is trying to access the respective blocks. When this happens, I am seeing CKSUM errors incremented in
The pool that this volume is part of is built on a mirror composed of two hard drive partitions. When these buffer I/O errors happen, I am not seeing any I/O errors from the underlying devices reported by the kernel, and the SMART counters for both hard drives show no errors. I tried setting
I tried upgrading ZFS and the kernel to the latest versions available in the Debian backports repository, but I keep seeing these errors even on the newer versions.
Today I decided to stop using zvols and tried to migrate data from these volumes to qemu image files, but
More details about my environment (zpool history, SMART output, list of blocks this is happening for) are in this log.
At this point this seems like a bug in OpenZFS, because I am not really seeing any I/O errors lower in the stack, but I am creating this as a Q&A discussion (rather than a GitHub issue) just in case I am missing something obvious. Any thoughts?
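For anyone who wants to build a similar list of affected blocks, here is a minimal Python sketch that tallies these messages per device and logical block. It assumes the kernel lines follow the format quoted in the title of this discussion. The function name `tally_errors` and the sample block numbers are hypothetical and are not taken from the actual log.

```python
import re
from collections import Counter

# Pattern modelled on the message format quoted in the discussion title.
PATTERN = re.compile(
    r"Buffer I/O error on dev (?P<dev>[^,\s]+), "
    r"logical block (?P<block>\d+), async page read"
)


def tally_errors(log_text):
    """Count buffer I/O errors per (device, logical block) in kernel log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            counts[(match.group("dev"), int(match.group("block")))] += 1
    return counts


# Hypothetical sample input; the block numbers are invented for illustration.
sample = """\
Buffer I/O error on dev zd0, logical block 100, async page read
Buffer I/O error on dev zd0, logical block 101, async page read
Buffer I/O error on dev zd0, logical block 100, async page read
some unrelated kernel message
"""

if __name__ == "__main__":
    for (dev, block), count in sorted(tally_errors(sample).items()):
        print(f"{dev} block {block}: {count}")
```

In practice the input text would come from saved kernel log output rather than the inline sample. Comparing the resulting block list across bursts could show whether the same blocks fail repeatedly.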
Replies: 1 comment 7 replies
What does `zpool events -v` have logged when these checksum errors show up?
Are you running `zpool clear` at any point? I ask because it should definitely be showing those bubbling up past the individual devices.
If you migrate one of the zvols to not have encryption enabled, does the problem still keep happening? (To be clear, it'd still be a bug if that were true, just asking to narrow down what's wrong.)
It might be interesting to see whether 2.1.7 resolves this for you, though I can't think of any fixes that would be germane.
I should remark that 2.1.6 doesn't claim to work on 6.0, so it's not impossible that could be a problem for you, though in practice I don't see any explicit 6.0 support fixes in 2.1.7 other than listing it as supported...