[RFC] feature: use reflinks for extent sharing between initramfs source and archive data #1141
Comments
cpio already has:
Couldn't this be used to kind of implement a padding option for cpio itself?
The cpio newc format doesn't allow for arbitrary padding, so I had to inject it via the new pad files in the initramfs image. IMO this logic isn't suitable for GNU cpio. Another option would be to drop GNU cpio and provide a Dracut-specific cpio archive generator which provides padding (and performs copy_file_range, etc.).
Here are my latest numbers using the attached benchmark script...
Dracut initramfs creation and boot times are markedly reduced.
This feature was implemented and merged via #1531. See the pull request for more up-to-date benchmark results.
Proposal
I think we could speed up initramfs generation for some common (Btrfs / XFS)
setups by having dracut make heavier use of reflinks / COW clones during
initramfs generation. I'd guess >95% of an uncompressed+unstripped initramfs
image is duplicate data, which really shouldn't need to be shuffled
around when on the same COW clone capable FS.
Dracut already uses cp --reflink=auto when shuffling most things
into the /var/tmp staging area, so it should "just" be a matter of
making the cpio archive generation process clone-range aware
and dropping compression altogether.
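As a rough illustration of what "clone-range aware" could mean for an archive writer, here is a minimal Python sketch. It is not the GNU cpio patchset: the helper name copy_range and the set of errno values treated as "fall back" are assumptions made for this example. It tries os.copy_file_range (Linux, Python 3.8+) and drops to plain read/write if the call is unavailable or refused. Whether the kernel actually shares extents depends on the filesystem and on alignment.

```python
import errno
import os

# Assumption for this sketch: these errnos mean "copy_file_range is not
# usable here", as opposed to a genuine I/O failure.
_FALLBACK_ERRNOS = {errno.ENOSYS, errno.EXDEV, errno.EINVAL, errno.EOPNOTSUPP}


def copy_range(src_fd, dst_fd, length):
    """Copy `length` bytes from src_fd to dst_fd at their current offsets.

    Hypothetical helper: prefers copy_file_range so that a COW-capable
    filesystem may share extents, otherwise falls back to read/write.
    """
    remaining = length
    use_cfr = hasattr(os, "copy_file_range")
    while remaining > 0:
        if use_cfr:
            try:
                n = os.copy_file_range(src_fd, dst_fd, remaining)
            except OSError as e:
                if e.errno not in _FALLBACK_ERRNOS:
                    raise
                use_cfr = False  # retry this chunk with read/write
                continue
        else:
            buf = os.read(src_fd, min(remaining, 1 << 20))
            n = len(buf)
            written = 0
            while written < n:
                written += os.write(dst_fd, buf[written:])
        if n == 0:
            raise EOFError("source ended before `length` bytes were copied")
        remaining -= n
    return length
```

An archive writer would emit each entry's header with a normal write and then call copy_range for the file data, so only the data portion is a candidate for extent sharing.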
This should allow for:
The following caveats would apply for dracut to successfully use reflinks (otherwise it would fall back to read/write):
Work-in-progress implementation
Luis and I made some changes to GNU cpio to perform extent sharing between source and archive via the copy_file_range syscall. I've pushed this patchset to https://github.com/ddiss/cpio/tree/copy_file_range_2_13
Both XFS and Btrfs require proper alignment to ensure that copy_file_range actually results in extent sharing. To do this I worked on a Dracut padcpio binary which inserts dummy pad files into the initramfs cpio archive. The new binary, as well as Dracut logic to call cpio with the new parameters, can be found at https://github.com/ddiss/dracut/tree/cpio_cfr_align .
Needless to say both repos are WIP, so may result in data corruption or other disasters. At this stage I'm interested in some feedback on the approach. I've done some initial benchmarks atop btrfs, with positive results in terms of both runtime and space efficiency. I'll try to post some actual numbers in the coming days.
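To make the alignment idea concrete, here is a small Python sketch of how the size of a pad file could be computed. It is not the padcpio code: the function names, the pad file name "pad" and the 4096-byte block size are assumptions for illustration. It relies only on the newc layout, where each entry has a 110-byte header followed by the NUL-terminated name, padded to a 4-byte boundary, and then the data, also padded to 4 bytes.

```python
NEWC_HDR_LEN = 110  # fixed-size ASCII header of a newc entry


def align_up(n, a):
    """Round n up to the next multiple of a."""
    return (n + a - 1) // a * a


def entry_header_len(name):
    """Bytes occupied by header + NUL-terminated name, padded to 4 bytes."""
    return align_up(NEWC_HDR_LEN + len(name.encode()) + 1, 4)


def pad_data_len(archive_off, next_name, pad_name="pad", block=4096):
    """Return the data length for a pad entry inserted at archive_off so
    that the data of the following entry starts on a `block` boundary.

    Returns None when the next entry's data is already aligned.
    archive_off is assumed to be 4-byte aligned, and block a multiple of 4
    (both hypothetical preconditions of this sketch).
    """
    next_hdr = entry_header_len(next_name)
    if (archive_off + next_hdr) % block == 0:
        return None
    pad_hdr = entry_header_len(pad_name)
    # Pad data fills whatever is left up to the next block boundary.
    return -(archive_off + pad_hdr + next_hdr) % block
```

Because every term is a multiple of 4, the returned length needs no further newc padding, so the data of the next entry lands exactly on the block boundary.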