feat: support VanillaOS images #206

Draft
wants to merge 14 commits into
base: main

xynydev
Member

@xynydev xynydev commented Jul 25, 2024

I have successfully used a CLI installed with these modifications to build an image and publish it on ghcr.io/xynydev/vanillin. I have switched to using the image in my VanillaOS VM and am currently faced with this error:
image

However, it was possible to ignore the error and try to log in, which froze the system, though the systemd unit for default-flatpaks ran...

Unsolved:

  • How to dynamically switch between ostree and vanilla templates
  • How to deal with Vanilla not providing org.opencontainers.image.version and not even really having versions in the same way as Fedora
    • As an aside, the version variable probably shouldn't be an int, since it might be the case that not all base images have simple numerical versions
  • Where to set the correct image name in abroot.json, in the signing module or the CLI template
    • Further discussion: does it make sense to have a signing module that everyone always uses, or should we "upstream" that into CLI
    • For now, I solved this with a quick script
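
The point about the version not being an int could be sketched roughly as follows. This is only an illustration, not the CLI's actual type: `ImageVersion` and `from_label` are hypothetical names, and the sketch assumes the value comes from the optional `org.opencontainers.image.version` label.

```rust
use std::fmt;

/// Hypothetical replacement for an integer-only OS version.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ImageVersion {
    /// A plain numeric version, such as a Fedora release number.
    Numeric(u64),
    /// Any other value, kept verbatim (for example a codename).
    Opaque(String),
    /// The base image did not set the label at all.
    Unknown,
}

impl ImageVersion {
    /// Build a version from the optional value of the
    /// `org.opencontainers.image.version` label.
    pub fn from_label(label: Option<&str>) -> Self {
        match label.map(str::trim) {
            // A missing or blank label is treated as "no version".
            None | Some("") => ImageVersion::Unknown,
            Some(raw) => match raw.parse::<u64>() {
                Ok(n) => ImageVersion::Numeric(n),
                Err(_) => ImageVersion::Opaque(raw.to_string()),
            },
        }
    }
}

impl fmt::Display for ImageVersion {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ImageVersion::Numeric(n) => write!(f, "{n}"),
            ImageVersion::Opaque(s) => write!(f, "{s}"),
            ImageVersion::Unknown => write!(f, "unknown"),
        }
    }
}
```

With a type like this, a missing label no longer has to be an error or a silent `0`; callers can decide per base image how to treat `Unknown`.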

Links:

@fiftydinar
Contributor

fiftydinar commented Jul 25, 2024

VanillaOS v1 users have to reinstall to get into VanillaOS v2, since it got refactored, as far as I'm aware.

Is there any guarantee that this scenario won't happen again when migrating from v2 to v3?

If it does happen, how would we handle those scenarios?

@xynydev
Member Author

xynydev commented Jul 25, 2024

VanillaOS v1 users have to reinstall to get into VanillaOS v2, since it got refactored, as far as I'm aware.

Yup, I think so too. v1 is built on Ubuntu and ABRoot v1, while v2 (Orchid) is on Debian and ABRoot v2.

Is there any guarantee that this scenario won't happen again when migrating from v2 to v3?

No. Actually, I think that there might be a guarantee that this scenario will happen when a hypothetical v3 comes, because the thing about major version upgrades is that they make breaking changes. However, as long as Vanilla stays on a Debian / ABRoot base that is compatible with Orchid, there should be no problems.

If it does happen, how would we handle those scenarios?

Depends on how VanillaOS handles them. But if the build processes are incompatible, I guess we could support both, at least for a while. There's not much else that would be our business to do.

template/src/lib.rs (outdated)
"Unable to get the OS version from the labels"
)
})?;
let os_version = inspection.get_version().unwrap_or(0);
Member

Let's leave this function alone and instead call a different function for whatever the vanilla template needs. We want to try to keep the Fedora templates and the Vanilla templates as separate as possible.

Member Author

I agree. I just did it like this because I couldn't figure out any easier way to fix this error. We should have some way to detect and declare what the base image is, and call different functions based on that, and remove the hard requirement for os_version in the tagging system, etc.
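
One way to read the idea above (declare the base image type, call different functions based on it, and drop the hard `os_version` requirement) is sketched below. All names here (`BaseImageType`, `TagInfo`, `tag_info`) are hypothetical and are not taken from the BlueBuild codebase; only the error string comes from the hunk under review.

```rust
/// Hypothetical set of base image families the CLI could know about.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BaseImageType {
    Ostree,
    Vanilla,
}

/// What a tagging step might need; the version is optional on purpose.
#[derive(Debug, PartialEq, Eq)]
pub struct TagInfo {
    pub os_version: Option<String>,
}

/// Existing behavior stays untouched: a missing version label is an error.
fn ostree_tag_info(version_label: Option<&str>) -> Result<TagInfo, String> {
    let version = version_label
        .ok_or_else(|| "Unable to get the OS version from the labels".to_string())?;
    Ok(TagInfo { os_version: Some(version.to_string()) })
}

/// Separate path for Vanilla: the label is used if present, never required.
fn vanilla_tag_info(version_label: Option<&str>) -> Result<TagInfo, String> {
    Ok(TagInfo { os_version: version_label.map(str::to_string) })
}

/// Dispatch on the declared base image type instead of patching one function.
pub fn tag_info(base: BaseImageType, version_label: Option<&str>) -> Result<TagInfo, String> {
    match base {
        BaseImageType::Ostree => ostree_tag_info(version_label),
        BaseImageType::Vanilla => vanilla_tag_info(version_label),
    }
}
```

Keeping one function per base image type matches the review request to leave the Fedora path alone, at the cost of a small amount of duplication.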

{% macro stage_modules_run(modules_ext, os_version) %}


{% macro ostree_modules_run(modules_ext, os_version) %}
Member

The original module macros here should remain unchanged. I would suggest making a new module macro for the vanilla OS.

Member Author

I split this into generic / ostree macros because only rpm-ostree module calls require the things added by the ostree macro. VanillaOS, stages, etc. do not, and probably never will, require anything like ostree container commit. If some sort of integration is needed, like the one we have with the akmods module, it would be trivial to split generic into generic and vanilla and switch the vanilla template to use the new macro.

I'm envisioning that we'd also ship a generic base image type, which would not add any OS-specific things to the Containerfile, and could thus be usable for basically any operating system that supports OCI images as a distribution mechanism (without extra work from us, but with extra work from the custom image maintainer).

@xynydev
Member Author

xynydev commented Jul 25, 2024

I tried to solve the FsGuard / integrity check error by generating the FsGuard filelist for /, but that errored out. Generating it for /usr worked but didn't solve the integrity check error. It might be about where our modules write stuff, but it's something we have to figure out. There's probably something I'm missing...

@xynydev xynydev self-assigned this Jul 27, 2024
@mirkobrombin

mirkobrombin commented Jul 28, 2024

VanillaOS v1 users have to reinstall to get into VanillaOS v2, since it got refactored, as far as I'm aware.

Is there any guarantee that this scenario won't happen again when migrating from v2 to v3?

If it does happen, how would we handle those scenarios?

That's not gonna happen.

@xynydev
Member Author

xynydev commented Jul 30, 2024

I just removed all the custom changes from my BlueBuild-generated custom image, built, pushed, and updated, and still got hit with the FsGuard issue. Just to make sure, I also built the Vib custom image template locally, pushed to the same image with a different tag, and rebased to that, and as expected there was no FsGuard pop-up.

The BlueBuild-generated Containerfile has basically the exact same lines for FsGuard as the Vib-generated one. The issue is probably with some of the "boilerplate" in the image, so the next step, I guess, is to slowly remove that until it starts working...

@xynydev
Member Author

xynydev commented Jul 30, 2024

Those previous two chore commits did not help this issue at all. Maybe a GitHub attestation is required for an image to go through FsGuard properly? I can't find the part in the vib-build.yml where the attestation is done, but I doubt it would work with local test builds anyway, so I'd have to move to CI builds for testing.

@mirkobrombin

mirkobrombin commented Jul 31, 2024

I just removed all the custom changes from my BlueBuild-generated custom image, built, pushed, and updated, and still got hit with the FsGuard issue. Just to make sure, I also built the Vib custom image template locally, pushed to the same image with a different tag, and rebased to that, and as expected there was no FsGuard pop-up.

The BlueBuild-generated Containerfile has basically the exact same lines for FsGuard as the Vib-generated one. The issue is probably with some of the "boilerplate" in the image, so the next step, I guess, is to slowly remove that until it starts working...

FsGuard only checks the binaries listed in its hash table. If any of the file hashes change, FsGuard will complain. FsGuard generation should always happen as the final step of the build process; nothing else should happen after it, since that may introduce changes.

@xynydev
Member Author

xynydev commented Aug 1, 2024

The current Containerfile generated by this PR runs FsGuard second to last, and after that RUN rm -fr /tmp/* /var/tmp/* /sources/*, just like the one generated by Vib.
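
The ordering rule discussed here could be checked mechanically before a Containerfile is emitted. The sketch below is hypothetical (`Step` and `fsguard_is_effectively_last` are made-up names) and assumes, as in the Vib-generated file, that a trailing cleanup of temporary directories is the only thing allowed after the FsGuard step.

```rust
/// Hypothetical, simplified view of the generated build steps.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Step {
    /// Any module or other build instruction that may change files.
    Module,
    /// The step that generates the FsGuard file list.
    FsGuard,
    /// Removal of temporary directories such as /tmp and /sources.
    Cleanup,
}

/// Returns true when an FsGuard step exists and only cleanup steps follow
/// the last one. Returns false when there is no FsGuard step at all.
pub fn fsguard_is_effectively_last(steps: &[Step]) -> bool {
    match steps.iter().rposition(|s| *s == Step::FsGuard) {
        Some(idx) => steps[idx + 1..].iter().all(|s| *s == Step::Cleanup),
        None => false,
    }
}
```

A check like this only guards the ordering; it says nothing about whether the FsGuard step itself was set up correctly.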


I've been doing some local changes to the Containerfile to test out different things:

  • Normal Containerfile generated by the current state of this PR: FsGuard error
  • Only running FsGuard and an rm cleanup step: FsGuard error
  • Only running FsGuard without a cleanup step: FsGuard error
  • Only the initial FROM statement: no FsGuard error

This indicates an error with the FsGuard line in the Containerfile.


I took a closer look at the vib-fsguard plugin.go. It seems that I did not realize that some parts of the build don't happen in the Containerfile, since I wasn't that familiar with the inner workings of Vib.

I had realized I had to download genfilelist.py to /sources/fsguard/ to make this work, but not that I would also have to download the FsGuard binary release as a .tar.gz and unpack it to /sources/fsguard/, since not doing that did not give a build-time error. The vib-fsguard module does both of those things before building the Containerfile, and I had just copied the steps from the Containerfile without realizing that I would have to build the /sources/ directory too.

And that was it! It works now. Now onto the other (more Rust-y) challenges.

...or so I thought. Ha, silly me. I still got the FsGuard error. Time to keep plugging away at it until something makes sense.

@xynydev
Member Author

xynydev commented Aug 7, 2024

Ok. Last time I was in a rush, so I made an oopsie. I used the arm64 URL for FsGuard instead of the amd64 URL. I was correct last time, but the verification still failed thanks to that detail.

i.e., it actually works now! 🎉
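
A mix-up like the arm64 / amd64 one could be avoided by deriving the release architecture from the build target instead of pasting a URL by hand. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the asset file name pattern is invented for illustration and may not match the real FsGuard release naming.

```rust
/// Map a Rust-style CPU architecture name (as found in
/// `std::env::consts::ARCH`) to the Debian-style name used in the thread.
pub fn release_arch(rust_arch: &str) -> Option<&'static str> {
    match rust_arch {
        "x86_64" => Some("amd64"),
        "aarch64" => Some("arm64"),
        _ => None,
    }
}

/// Build a HYPOTHETICAL asset file name for the given architecture.
/// The `fsguard-<arch>.tar.gz` pattern is an assumption for illustration.
pub fn fsguard_asset_name(rust_arch: &str) -> Result<String, String> {
    let arch = release_arch(rust_arch)
        .ok_or_else(|| format!("no known FsGuard build for architecture `{rust_arch}`"))?;
    Ok(format!("fsguard-{arch}.tar.gz"))
}
```

Failing loudly on an unknown architecture is deliberate here, since the thread notes that the wrong binary did not produce a build-time error.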

xynydev and others added 4 commits August 7, 2024 11:30
…mplate struct

(currently switches to using the vanilla template struct only, need to add ability for user to specify struct)
… key

this key name is temporary and shall be changed with recipe v2
@xynydev
Member Author

xynydev commented Aug 7, 2024

First successful CI build: https://github.com/xynydev/vanillin/actions/runs/10282268597

Currently, the recipe must set base-image-type: vanilla to select the base image type.
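
How the value of that key might be turned into a typed setting is sketched below. Only the `vanilla` value appears in this thread; the `ostree` value, the default when the key is absent, and all type and function names are assumptions made for illustration.

```rust
use std::str::FromStr;

/// Hypothetical typed form of the temporary `base-image-type` recipe key.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum BaseImageType {
    /// Assumed default, so existing recipes without the key keep working.
    #[default]
    Ostree,
    Vanilla,
}

impl FromStr for BaseImageType {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Accept surrounding whitespace and any letter case.
        match s.trim().to_ascii_lowercase().as_str() {
            "ostree" => Ok(Self::Ostree),
            "vanilla" => Ok(Self::Vanilla),
            other => Err(format!("unknown base-image-type `{other}`")),
        }
    }
}

/// Resolve the optional recipe value, falling back to the default when the
/// key is not set and rejecting values that are not recognized.
pub fn resolve_base_image_type(value: Option<&str>) -> Result<BaseImageType, String> {
    match value {
        Some(raw) => raw.parse(),
        None => Ok(BaseImageType::default()),
    }
}
```

Rejecting unknown values, rather than silently falling back, would surface typos in the recipe early.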

@xynydev
Member Author

xynydev commented Aug 7, 2024

Other issues:

  • When not building with squash: true, upgrades are broken for images built by BlueBuild
    • abroot just says "No upgrade available."
    • Running abroot upgrade --verbose gives this as the error: MANIFEST_UNKNOWN message:OCI index found, but Accept header does not support OCI indexes
  • abroot upgrade seems to constantly freeze at a certain generated layer, but I'm not sure why. It could be trying to apt-get install gnome-calculator.
    • The layer is 518MB, which is bigger than most layers, but the second biggest layer of 443MB is fetched pretty fast with no problem.
    • gnome-calculator was not the problem, after removing it the 518MB layer persisted and the fetch froze again.
    • This issue was fixed by creating a new VM with a 128GB disk instead of a 64GB disk and using the stable ISO instead of testing. The problem could've been either of those.
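
For context on the MANIFEST_UNKNOWN message: a registry client lists the manifest media types it understands in the HTTP Accept header, and an unsquashed multi-platform push can be stored as an OCI index. The sketch below only shows what such a header value could look like. The media type strings are the standard OCI and Docker ones; the function is hypothetical, and whether the ABRoot client really omits the index types is an assumption drawn from the error text alone.

```rust
/// Standard manifest media types from the OCI image spec and Docker registry API.
pub const OCI_INDEX: &str = "application/vnd.oci.image.index.v1+json";
pub const OCI_MANIFEST: &str = "application/vnd.oci.image.manifest.v1+json";
pub const DOCKER_MANIFEST_LIST: &str = "application/vnd.docker.distribution.manifest.list.v2+json";
pub const DOCKER_MANIFEST: &str = "application/vnd.docker.distribution.manifest.v2+json";

/// Build an Accept header value for a manifest request.
/// With `include_indexes` set to false, only single-image manifests are
/// advertised, which is the situation the error message describes.
pub fn manifest_accept_header(include_indexes: bool) -> String {
    let mut types = vec![OCI_MANIFEST, DOCKER_MANIFEST];
    if include_indexes {
        types.push(OCI_INDEX);
        types.push(DOCKER_MANIFEST_LIST);
    }
    types.join(", ")
}
```

If the client side cannot be changed, the alternative is to publish a single-image manifest, which would be consistent with squashed builds working.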

@xynydev
Member Author

xynydev commented Aug 7, 2024

Near-term TODO for me:

  • Test out if unsquashed builds work (again, with the new VM)
    • I was able to reproduce this error in the new VM, which is odd, because the vib template definitely builds using Docker and does not squash...
    • If not, auto-enable squashing if base-image-type: vanilla
      • This also requires setting the build driver to podman, and the logic for that (which also exists partly in the github action) seems to be pretty convoluted, so this is maybe not something that should be done now, especially as the error does not make sense and might just be something intermittent.
  • Find a few VanillaOS users interested in testing out BlueBuild, expand new apt-get module, test different modules and apply initial fixes (if required)
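
The "auto-enable squashing" item could look roughly like this if it were implemented. It is a sketch only: `BuildDriver`, `BuildOptions`, and `apply_vanilla_defaults` are hypothetical names, and the real driver selection logic in the CLI and the GitHub action is described above as considerably more involved.

```rust
/// Hypothetical build driver choice.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BuildDriver {
    Docker,
    Podman,
}

/// Hypothetical subset of the build options relevant to squashing.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BuildOptions {
    pub driver: BuildDriver,
    pub squash: bool,
}

/// For Vanilla base images, force squashing and the podman driver that
/// squashing requires; leave every other build exactly as requested.
pub fn apply_vanilla_defaults(is_vanilla: bool, requested: BuildOptions) -> BuildOptions {
    if !is_vanilla {
        return requested;
    }
    BuildOptions {
        driver: BuildDriver::Podman,
        squash: true,
    }
}
```

Overriding the user's requested driver silently may be surprising, so a real implementation would probably want to log the change or make it opt-out.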

4 participants