Inactive hot spare checking/failing #12248
AeonJJohnson
started this conversation in
Ideas
Replies: 0 comments
Hey friends, before I go off into the weeds and try to create something, I wanted to check with the group to see whether I'm missing something that already exists.
I'm seeing field (hardware) problems where inactive hot spare drives die of boredom. It's the vendor's issue to sort out, but it has exposed a need to check or test inactive hot spares to make sure they are in good condition and ready to go in the event of a data drive failure. I can't write to the drive (dd, badblocks) because of the labeling on it, so I have to find a test method within the ZFS construct.
Just checking the labels with zdb -l seems like a weak test. Even better would be something that exercises the inactive hot spare a bit.
Running a dd read of the inactive hot spare would work, but running that from a cron job or similar wouldn't be interruptible by ZFS operations, and I don't like operating on drives outside of the ZFS construct.
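For reference, here is a minimal sketch of what that kind of read-only pass might look like if it were scripted rather than run as a bare dd. This is not a ZFS mechanism and not something that already exists in ZFS: the function name `read_exercise` and its parameters are made up for illustration. It only reads, never writes, and it can be capped and throttled so it stays out of the way, but it still runs outside of ZFS and will not make ZFS mark the spare as failed.

```python
import os
import time


def read_exercise(path, chunk_size=1 << 20, max_bytes=None, pause_s=0.0):
    """Sequentially read `path` read-only and report how far we got.

    Returns (bytes_read, error), where error is None on success or the
    OSError raised by the failing read. All names here are hypothetical.
    """
    total = 0
    # O_RDONLY: the on-disk labels are never touched.
    fd = os.open(path, os.O_RDONLY)
    try:
        while max_bytes is None or total < max_bytes:
            # Never ask for more than the remaining budget.
            want = chunk_size if max_bytes is None else min(chunk_size, max_bytes - total)
            try:
                data = os.read(fd, want)
            except OSError as exc:
                # An I/O error from the device surfaces here.
                return total, exc
            if not data:
                break  # end of device or file
            total += len(data)
            if pause_s:
                # Optional throttle so the pass does not saturate the bus.
                time.sleep(pause_s)
    finally:
        os.close(fd)
    # Caveat: plain reads may be served from the OS page cache; forcing them
    # to the media would need direct I/O with aligned buffers (not shown).
    return total, None
```

Pointing it at the spare's device node with a modest `max_bytes` would give the interface and media some activity on each run. Any reaction to a returned error (alerting, pulling the spare) would still have to be handled separately, which is exactly the part I'd rather have ZFS do.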
Does anyone know of anything that already exists before I go and make something up? Something like a scrub function for hot spares?
I want to use something other than SMART tests, as I think the interface needs activity as much as the media does.
Plus, it should be a test that, if a failure occurs, triggers ZFS to fail the hot spare and show the failed status in zpool status.