Root on ZFS: warning for GRUB incompatibility with bpool snapshots #464
Also added a warning against encrypted send/recv. Seems to be quite serious, see for example this spreadsheet by maintainers of ZFSBootMenu. |
Root on ZFS: warning against encrypted send/recv, due to crashes
Signed-off-by: Yǔchēn Guō 郭宇琛 <[email protected]>
@rincebrain You may not care about Root on ZFS at all, but I would still like to ask what we, the Root on ZFS users, should do, re: your comment at this link. Recently, due to your comment and this GRUB issue, I have come to regard each and every ZFS feature as suspect: something that should not be enabled unless there is a real need and no alternative. Does this viewpoint make any sense? I'm currently writing guides to replace ZFS native encryption with LUKS. LUKS seems to be more stable and better maintained. |
In other words: is defaulting to |
Depends how much you worry about your data being intact.

The maintainers do their best, but fundamentally, every new change has some risk of having bugs introduced - the more invasive, the more likely, for some definition of each.

If you have backups, or aren't keeping any data that's that bad to lose, then sure, run the bleeding edge and shrug. The more you want to be careful about not losing your data, the longer I would suggest you test things in an environment where the data isn't that dangerous to lose before upgrading or enabling a new setup.

I usually recommend that people set up with the defaults in general and then only vary from it as needed for their environment, including only enabling features you deliberately need versus using

All that to say, I don't think I'd suggest

You could try to synthesize manually a feature list that you thought were stable enough and keep it updated, but that seems fraught.

(Oh, and that spreadsheet is mine.)
|
Rich Ercolani ***@***.***> writes:
> Depends how much you worry about your data being intact.

As the whole value proposition of ZFS is about data integrity, and many
go through a lot of trouble to get (Root on) ZFS working precisely
for this reason, I would put data integrity above all else.
Especially so, considering many people start using ZFS with these
tutorials. Maybe the developers do not care about non-enterprise
users, but it is still important to build trust among home/personal
users.

> The maintainers do their best, but fundamentally, every new change has
> some risk of having bugs introduced

Yes, I do believe the developers have the best intentions, but, as
current software engineering practices do not encourage "correctness
proofs" in the mathematical sense, bugs are unavoidable.
Proofs such as:
https://github.com/seL4/l4v

> I usually recommend that people set up with the defaults in general
> and then only vary from it as needed for their environment, including
> only enabling features you deliberately need

Good. Will add this note to the tutorials.

> versus using zpool upgrade to turn everything on that has been added
> since you made the pool (or, with compatibility=, up to that limit).

compatibility= is great in two ways: it suppresses the "zpool upgrade"
message, and it prevents the upgrade command from doing real damage.
For these two reasons, I think it should be set everywhere.

> For bootable setups, you would usually use a bpool with something like
> compatibility=grub2 since they don't really support newer features

But as it turned out, even compatibility=grub2 wasn't sufficient to keep
GRUB happy, as seen in #463.

> I don't think advising people to pick a specific feature set (or
> "none") as a baseline is going to help matters in any specific way.
> You could try to synthesize manually a feature list that you thought
> were stable enough and keep it updated, but that seems fraught.

I have neither the authority nor the knowledge to maintain such
a list. If compatibility=legacy plus LUKS will make the majority of
casual users happy, so be it.

> (Oh, and that spreadsheet is mine.)

Apologies for the misattribution. Will fix.
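
To make the compatibility= discussion above concrete, here is a minimal Python sketch that compares the feature flags a pool reports against an allow-list. It assumes the input is the tab-separated output of `zpool get -H -o property,value all <pool>` and that the allow-list uses the one-feature-per-line layout of a compatibility file; the function names and the sample data are hypothetical. As #463 suggests, passing such a check does not by itself guarantee that GRUB can read the pool.

```python
from __future__ import annotations


def parse_feature_states(zpool_get_output: str) -> dict[str, str]:
    """Map feature name -> state (disabled/enabled/active).

    Expects tab-separated "property<TAB>value" lines, as printed by
    `zpool get -H -o property,value all <pool>`.
    """
    states = {}
    for line in zpool_get_output.splitlines():
        fields = line.split("\t")
        if len(fields) != 2:
            continue  # skip anything that is not a property/value pair
        prop, value = fields
        if prop.startswith("feature@"):
            states[prop[len("feature@"):]] = value
    return states


def parse_allow_list(text: str) -> set[str]:
    """Read feature names, one per line; '#' starts a comment (assumed layout)."""
    allowed = set()
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            allowed.add(name)
    return allowed


def features_outside(states: dict[str, str], allowed: set[str]) -> list[str]:
    """Features that are enabled or active but missing from the allow-list."""
    return sorted(
        name
        for name, state in states.items()
        if state in ("enabled", "active") and name not in allowed
    )


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real pool.
    sample_output = (
        "feature@async_destroy\tenabled\n"
        "feature@zstd_compress\tactive\n"
        "feature@bookmarks\tdisabled\n"
    )
    sample_allow_list = "# hypothetical allow-list\nasync_destroy\nbookmarks\n"
    print(features_outside(parse_feature_states(sample_output),
                           parse_allow_list(sample_allow_list)))
```

Only enabled or active features are reported, since a disabled feature cannot affect what a bootloader has to read.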
|
I'm familiar with formal verification, thank you.

I would be surprised if a PR that recommended turning off every feature flag was accepted, if I'm honest. "Don't use any features" isn't really a workable approach, and since most of the testing is going to be on pools with those features enabled, there's a nontrivial risk incurred as well the further you deviate from what's been tested well - so the "everything off" option isn't a global maximum of minimized risk for data loss, it's a tradeoff of risk profiles, like everything else.

I personally don't think anything but native encryption has been fraught enough that I would discourage its use by default, except

If you want to recommend

Everything is going to be a sliding scale of risks in various dimensions. Sometimes people report bugs against a weird edge case that came up in RHEL because the particular permutation of cherrypicked kernel features hadn't been tested, particularly on older RHEL releases where the delta from mainline Linux can grow quite large. Sometimes people discover that a particular edge case like specific conditions in memory or IO pressure like openzfs/openzfs#15439 didn't come up in their testing but breaks quite easily for some other people's workloads.

If you disable very user-visible features like zstd support or encryption or BRT by default, that's going to cause an increase in complaints about that not working and not knowing why, as well as the risk of strange edge cases where one thing is on and not another.

I would personally suggest that running the default set of feature flags enabled, and just waiting a bit to update each time to see if anything in the common cases that wasn't found in testing somehow gets reported, is probably a better tradeoff than recommending people pick a less well tested path to run through.
|
Thanks for the detailed and in-depth reply.
I'm now convinced that the tutorials should stick to the defaults -- so
as to not diverge too much from well-tested setups -- and put big, fat
warnings against troublesome features in the text, currently
"encryption".
For the boot pool, where compatibility with GRUB is paramount and features
are secondary, I **think** legacy would work. But that's just my guess.
Currently, with a sample size of one (me), compatibility=grub2 with
a root dataset snapshot will break GRUB, whereas legacy wouldn't.
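
As a rough illustration of the two boot pool variants compared here, the following Python sketch builds the argument list for `zpool create` with a given compatibility value. The helper name, pool name, and device paths are hypothetical, and the options a real guide would add (ashift, mountpoints, and so on) are deliberately left out.

```python
def bpool_create_argv(pool, vdevs, compatibility="grub2"):
    """Build a `zpool create` argument list that pins the compatibility property.

    `compatibility` is passed through unchanged, e.g. "grub2" or "legacy".
    Nothing is executed here; the caller decides whether to run the command.
    """
    if not vdevs:
        raise ValueError("at least one vdev is required")
    return ["zpool", "create", "-o", f"compatibility={compatibility}", pool, *vdevs]


if __name__ == "__main__":
    # Hypothetical two-disk mirror; the device paths are placeholders.
    disks = ["mirror", "/dev/disk/by-id/diskA-part2", "/dev/disk/by-id/diskB-part2"]
    print(" ".join(bpool_create_argv("bpool", disks, compatibility="legacy")))
```

Returning a list instead of a shell string keeps the sketch safe to hand to `subprocess.run` without quoting concerns.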
Reading your reply, I wonder if there is a low-traffic mailing list
where advisories on ZFS are sent. This would certainly be beneficial.
|
Oh, I also heard elsewhere that zvol in general is also quite buggy and
no one in the enterprise uses them. Datasets are the way to go. Is
this true?
|
zvols are a kind of dataset - it'd be nice if we had a term for just "filesystem+volume" because the term also covers snapshots and bookmarks, but here we are. If grub breaks with a snapshot on the root, that should probably be a straightforward fix, since I don't think it used to, though I don't know the codebase. As far as I know, zvols are actively used by a number of entities of varying sizes - I don't do this for a day job at this point, and personally, I don't have much use for zvols, but they've worked pretty well when I've used them, and the PRs adding things like blk-mq support and improving the behavior interactions with the "quota" of volsize seem to corroborate people using them actively. |
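
The terminology point above (filesystems, volumes, snapshots, and bookmarks are all "datasets") can be illustrated with a small Python sketch that groups dataset names by type. It assumes the input is the tab-separated output of `zfs list -H -o name,type -t filesystem,volume,snapshot,bookmark`; the function name and sample names are hypothetical.

```python
from collections import defaultdict


def group_by_type(zfs_list_output):
    """Group dataset names by their `type` column.

    Expects tab-separated "name<TAB>type" lines, as printed by
    `zfs list -H -o name,type -t filesystem,volume,snapshot,bookmark`.
    """
    groups = defaultdict(list)
    for line in zfs_list_output.splitlines():
        fields = line.split("\t")
        if len(fields) != 2:
            continue  # ignore malformed or empty lines
        name, dataset_type = fields
        groups[dataset_type].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real system.
    sample = (
        "rpool/root\tfilesystem\n"
        "rpool/vm-disk\tvolume\n"
        "rpool/root@install\tsnapshot\n"
        "rpool/root#install\tbookmark\n"
    )
    for dataset_type, names in group_by_type(sample).items():
        print(dataset_type, names)
```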
|
Again, I bear no personal animosity toward ZBM.

> but I would advocate against trying to shoehorn a ZFS installation
> into GRUB compatibility

ZBM or GRUB, these are just alternatives, each with its own set of advantages
and disadvantages. You call a separate boot pool the art of
"shoehorning", I call a custom-built kernel, distributed over the
internet, a violation of GPL/CDDL.
Ultimately, I'm a lazy and tired person who does not have the means to
maintain a bootloader, which is a critical piece of software. But you
have, and that's awesome!

> you're better off just leaving /boot off of ZFS entirely. Make an ext4
> filesystem or put /boot on your EFI system partition

Initrd integrity is also important. I hope you did guarantee the
integrity of ZBM in your guides, across several different disks.
One main reason I wrote these guides in the first place was to provide a
reliable and easy multi-disk root on ZFS guide where bootloader failures
are also correctly handled. Remember, without multi-disk, many of ZFS's
integrity protections amount to nothing.

> Make an ext4 filesystem or put /boot on your EFI system partition.

Therefore that's not a feasible option.
|
|
So would you -- and all other people using & watching this repo --
accept my offer of transferring the maintenance of all my guides,
except NixOS, to your team, optionally including the switch from GRUB to
ZBM?
I realize that this is not my choice to make, but I still would like to
hear your opinion on this matter.
I don't have any motivation or vested interest in defending my
arguments; I would rather give up maintaining them. Non-functional
distros are dead-ends for me. For distros besides NixOS, I can't do
much beyond monitoring the automated tests, honest. (Although I
consider those tests highly reliable.)
Now, some tidbits about NixOS. Background: I use NixOS on all my
computing devices except smartphones and I am actively contributing to
nixpkgs.
As NixOS has its own ideas about atomic package management, boot
environments are pretty useless over here.
Even harmful, I would say. Stateless root is much the better
alternative.
GRUB integration with NixOS has been quite good, and the perl script bug
with GRUB-ARM64 was recently fixed in nixos-unstable. So now Root on
ZFS can also be used with UEFI/ARM or u-boot/ARM, among others. I have
received feedback on such setups elsewhere.
|
I have no interest in maintaining extra guides on OpenZFS sites. We maintain guides at docs.zfsbootmenu.org that describe what we consider best practices for ZFS on root, with the purpose of facilitating deployment of ZFSBootMenu. We maintain editorial control of those guides. If people want to amend the community-contributed OpenZFS guides to refer to our instructions, that is between those proposing the change and those with the authority to approve it. We don't care either way. |
I will add some references to ZBM then, and see whether the PR gets
approved. I do not want to give the impression that I'm intentionally
hiding anything.
|
I'll close this and reopen as separate PRs. |
@gmelikov