WIP: fix xfs UUID regeneration #21

Draft: wants to merge 1 commit into base: master

osnyx (Member) commented Feb 11, 2025

WIP: The tests have not actually been run; they were only adjusted by guesswork.

Modifying the XFS UUID of the VM image's root partition failed because the partition was reported as already mounted. It was not actually mounted; we are hitting an edge case in the interplay of xfs_admin, findmnt, filesystem labels, and udev. See PL-133416 for details:

  • In fc-nixos we configure hosts to locate filesystems at boot time using filesystem labels. In particular, for the mount point / we use the device with the label "root", and hence configure a source device of /dev/disk/by-label/root. This refers to the root filesystem device using the symlink created by udev when the partition is mapped at boot time.
  • When we later attach an nbd image in order to perform operations before booting the guest, the kernel automatically maps the partition table inside the image. It discovers that the first partition has an XFS filesystem, which also has the label "root". When udev processes the device being mapped, it overwrites the existing symlink in /dev/disk/by-label to point to the device node for the partition on the nbd image.
  • xfs_admin is a shell script, which wraps lower level tools like xfs_db with a more convenient interface.
    • The version in 21.05 simply processes the script arguments and then blindly executes xfs_db.
    • However, the version in 24.11 has gained the ability to modify some filesystem attributes at runtime using a different command, xfs_io. In order to determine which command to use, the script therefore invokes findmnt from util-linux to resolve the named block device to a mount point, if one exists.
    • findmnt resolves the mount points using procfs and sysfs. If any mount points use symlinks under /dev/disk, those symlinks are also evaluated, to detect cases where the queried partition was mounted through one of them.
  • When reading the filesystem UUID for the nbd partition, findmnt misidentifies the nbd partition as the root partition (due to /dev/disk/by-label/root being rewritten), and xfs_admin then runs xfs_io, which prints the UUID of the currently mounted root filesystem instead.
  • When attempting to change the filesystem UUID for the nbd partition, the same misidentification causes xfs_admin to assume that the partition is already mounted as the root filesystem, so it refuses to change the UUID.
  • From reading through xfs_admin, if this collision did not occur and it managed to correctly identify that the nbd partition is not mounted, then it should run the "uuid" command from xfs_db to change the UUID of the XFS filesystem on the nbd partition.
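The symlink collision described in the list above can be checked with a small helper. This is a minimal sketch, not part of the PR: the function name `label_points_to` and the `BY_LABEL_DIR` override are hypothetical and exist only so the check can be exercised without real block devices.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): succeed if the by-label symlink
# for LABEL currently resolves to DEVICE.
# BY_LABEL_DIR is an assumed override for testing; it defaults to the
# directory that udev populates.
label_points_to() {
    label=$1
    device=$2
    dir=${BY_LABEL_DIR:-/dev/disk/by-label}
    # Resolve both sides to canonical paths before comparing.
    [ "$(readlink -f "$dir/$label")" = "$(readlink -f "$device")" ]
}
```

For example, after attaching the image, `label_points_to root /dev/nbd0p1` would be expected to succeed if udev has repointed the symlink; the device path `/dev/nbd0p1` is an assumption and depends on which nbd device is used.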

TL;DR: replace xfs_admin -U insert-uuid-here /path/to/device with xfs_db -x -c 'uuid insert-uuid-here' /path/to/device.
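As a rough illustration of the replacement, the direct `xfs_db` call could be wrapped as follows. This is a sketch under assumptions: the function name `set_xfs_uuid` and the `XFS_DB` override are hypothetical, and the actual change in this PR may be structured differently.

```shell
#!/bin/sh
# Hypothetical wrapper (not from the PR): set the UUID of an unmounted XFS
# filesystem by calling xfs_db directly, bypassing the findmnt lookup that
# xfs_admin performs.
# XFS_DB is an assumed override so the command line can be inspected in a
# dry run; it defaults to the real xfs_db binary.
set_xfs_uuid() {
    uuid=$1
    device=$2
    # -x enables expert mode, which xfs_db requires for write commands.
    "${XFS_DB:-xfs_db}" -x -c "uuid $uuid" "$device"
}
```

Calling `set_xfs_uuid insert-uuid-here /path/to/device` runs the same command as the TL;DR above.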

PL-133416


Co-authored: Molly Miller <[email protected]>
@osnyx osnyx force-pushed the PL-133416-xfs-uuid-regen branch from b414dea to f4cc733 Compare February 11, 2025 23:26