
Discussion: Implementation of a Docker based build system #1190

Closed

aesrentai opened this issue Jul 18, 2022 · 9 comments


aesrentai commented Jul 18, 2022

The current build system is, to put it lightly, complex and slow: the first time Heads is built, an entire toolchain is compiled from source. The goal of that toolchain is to ensure reproducible builds, but reproducibility has apparently been broken for over two years (#734) with no sign of it being fixed. A simple solution, which would make builds much easier and faster for end users and make reproducible builds far simpler to achieve, would be to use Docker to build Heads.

Pros:

  • Faster: no need to compile our own toolchain and host environment
  • Reproducible: by deploying a build system similar to what Signal has done for their APK (pinning specific package versions against a snapshot.debian.org archive; see the Dockerfile sketch after this list), we can ensure the build environment is identical for all users
  • Less complexity: we can eliminate much of the top-level Makefile and many modules
  • Simplicity: no more dependency hell (even using the dependency list from the CI, I had issues building Heads)
  • Cross-platform: I hate Windows and macOS, but being able to build Heads everywhere would be quite nice
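
A minimal sketch of the snapshot-pinning idea mentioned above (the base-image tag, snapshot timestamp, and package list are placeholders, not a tested configuration):

```dockerfile
# Hypothetical: freeze apt on a snapshot.debian.org archive so every build
# resolves to the exact same package versions.
FROM debian:bullseye-slim

# Point apt at a frozen snapshot (timestamp is a placeholder) and disable
# the validity check, since snapshot metadata is intentionally stale.
RUN echo 'deb [check-valid-until=no] https://snapshot.debian.org/archive/debian/20220718T000000Z bullseye main' \
      > /etc/apt/sources.list \
 && apt-get update \
 && apt-get install -y --no-install-recommends build-essential git \
 && rm -rf /var/lib/apt/lists/*
```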

Cons:

  • Space: hosting an entire Docker image, even a minimal one, incurs extra overhead, though I doubt it would be significant compared to what we already use
  • The continued dockerization of everything, which is arguably good or bad depending on who you ask

I'm not particularly interested in hacking this together unless there is a real possibility it would eventually be merged upstream, which is why this is an issue and not a pull request. If there is interest, I could very quickly put together a working Dockerfile for testing. If we choose to use Docker, I believe it would be best to pick one of the following:

  • Entirely remove the existing build system and support reproducible builds only through Docker
  • Keep allowing builds with the host toolchain directly, but only provide support for building with Docker
Collaborator

tlaurion commented Jul 18, 2022

@aesrentai Thanks for bringing this up, again! @osresearch you might want to jump into this thread.

Some NLnet funding is coming to the Heads project to resolve this issue: the Heads build system is intended to change from the Makefile to https://github.com/osresearch/linux-builder, which is desired to happen on top of NixOS to minimize the number of hacks required to fix reproducibility issues (too many issues are open, but the most important one for historical analysis and the evaluated approaches is #927; more recently, debate happened under the linux-builder project itself, osresearch/linux-builder#1).

Nix permits pinning the package list to a fixed commit ID, so that the "latest" available package list is frozen, preventing the toolstack from changing over time unless that chosen package-list commit ID changes. This is documented under "Pinning packages with URLs inside a Nix expression" (and sketched below). If for whatever reason we need to upgrade the exposed binaries to build anything requiring a newer toolstack, we would simply bump that package list to a later commit, maintaining reproducibility of the builds for everything from that Heads commit forward.
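
For illustration, here is roughly what such pinning looks like in a shell.nix; the commit ID, hash, and tool list are placeholders, not the actual Heads toolstack:

```nix
let
  # Pin nixpkgs to one commit ID: the package set can only change when
  # this URL (and its hash) is deliberately bumped.
  pkgs = import (builtins.fetchTarball {
    url = "https://github.com/NixOS/nixpkgs/archive/<commit-id>.tar.gz";
    sha256 = "<tarball-hash>";
  }) {};
in
pkgs.mkShell {
  # Expose only the tools the build needs.
  buildInputs = with pkgs; [ gnumake gcc git ];
}
```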

The desired outcome is for anyone Linux-based to be able to deploy that Nix layer on top of their OS if they want, with a produced Docker image used under CI and promoted for use by end users. Agreed that giving instructions/scripts to automate building through Docker should be the supported way (what is used in CI should be the supported way), and reproducing the Docker image itself would also be possible (at the content level).

The ideal would be to have CI build that docker image in a separate project, and for Heads to use that docker image to build boards and produce ROMs, cached by CI and eventually part of releases.

As to whether we can get rid of musl-cross/musl-cross-make being locally built, the current insight is that we would still need to cross-compile everything here so that Heads can still build for POWER and other architectures in the future. So yes, the Docker image might also specialize in the future; otherwise it becomes big. This is why it might be necessary to have multiple scripts under Heads to prepare different Nix buildstacks for different architectures, with CircleCI needing musl-cross/musl-cross-make already built/deployed to build for those architectures. But maybe not: having those musl-cross instances merged into Nix might be the better way to go if they do not already exist there.

The idea here would be to have everything poked by the modules' configure scripts deployed by Nix, and to simply jump into that nix-shell to do the builds.

Note that linux-builder can parallelize the builds as well, but it is not yet on par with the current Heads build system.

The reason we would love Nix, as opposed to your proposed Debian-pinning approach, is that even if there is paths bleeding into the builds, those paths would be consistent across builds, since everything in Nix has consistent paths.

Thoughts/Advice?

Author

aesrentai commented Jul 18, 2022

@tlaurion First, thank you for the quick response, and it's good that the upstream maintainers recognize that there are many possible improvements to the build system.

Assuming Nix refers to NixOS: I know absolutely nothing about it other than that I frequently see it on HN/Twitter (frequent positive opinions, but no one I know actually uses it day to day), so I can't evaluate the pros and cons of Nix vs Debian. Debian is my favorite distro, but I'm not partial at all to the OS we stick inside the Docker container anyway. I have no problem with Nix as a base distro with packages pinned by commit ID.

That said, I am slightly confused about how this whole scheme works. Essentially,

The desired outcome is for anyone Linux-based to be able to deploy that Nix layer on top of their OS if they want

which means that we can run some command and enter a build environment based on that Nix layer, which would be identical to the Nix Docker image we use in testing:

a produced Docker image used under CI and promoted for use by end users

or end users can use the Docker image directly. This may be where my unfamiliarity with Nix comes into play, but this really feels like extra effort to support a whole new compatibility layer, where we could instead be lazy, just ship a Nix-based Docker container, and not worry about any Nix layer. Unless you're referring to Docker when you say "deploy that Nix layer", in which case that entire paragraph can be shortened to "ship a Docker container with Nix in it".

This is why it might be necessary to have multiple scripts under Heads to prepare different Nix buildstacks for different architectures, with CircleCI needing musl-cross/musl-cross-make already built/deployed to build for those architectures.

I would definitely prefer "big" over "different", and I don't see why we can't just ship musl-cross-make with the Docker image. That said, support for multiple architectures is important, and I agree it's something we need to maintain (in particular, I fully intend to install Heads on an ARM64 or RISC-V laptop one day).

Additionally, the use of more custom tooling, linux-builder, seems non-ideal to me. It's just another piece of software we need to maintain, which adds maintenance burden and is one of the reasons I don't like the current build system: it's so complicated that I really don't want to touch it other than to delete as much of it as I can. That said, the only alternative I can think of off the top of my head is the Qubes builder, which does a lot of what we want but is also very large for the task at hand (we don't need a full ISO, just a kernel, an initrd, and a small number of packages).

The reason we would love Nix, as opposed to your proposed Debian-pinning approach, is that even if there is paths bleeding into the builds, those paths would be consistent across builds, since everything in Nix has consistent paths.

"paths bleeding"? What do you mean by this? Additionally, I don't see how debian snapshots would be affected by this since the docker image would, itself, be deterministic, based on a fixed base image and the same packages every time.

I'm going to jerry-rig a Dockerfile that uses the current build system wholesale (i.e., it doesn't touch the Makefile or any other part of the build system, so we'll be building our own toolchain inside a Docker image, for twice the inefficiency), based on Debian; a sketch of the idea is below. Even if we go with Nix, hopefully this pushes things forward, since the build system woes appear to be long-standing with little action toward fixing them, and if I get a working, reproducible, CI-compatible Docker-based build I'm going to push to have it merged even if it is not ideal. That's another possible advantage of Docker: we initially ship Debian, but later we can transition to Nix by just changing the Dockerfile.
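
Something like the following; the board name and dependency list are illustrative (trimmed from what the CI installs), not a tested configuration:

```dockerfile
# Hypothetical wrapper: the unmodified Heads build system, run inside a
# Debian image. The toolchain is still compiled from source inside the
# container, exactly as it is on a bare host today.
FROM debian:11
RUN apt-get update && apt-get install -y --no-install-recommends \
      build-essential git m4 bison flex zlib1g-dev libelf-dev bc \
 && rm -rf /var/lib/apt/lists/*
WORKDIR /heads
COPY . .
# e.g. docker build --build-arg BOARD=qemu-coreboot .
ARG BOARD=qemu-coreboot
RUN make BOARD=${BOARD}
```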

As an aside, is the only reason we keep musl-cross-make around that we need gcc to link against musl? If so, this entire scheme feels grossly overcomplicated, since prebuilt static musl cross-compilation binaries already exist and I see no reason we shouldn't use those instead (sketch below).
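
For reference, statically linked musl cross toolchains built with musl-cross-make are published at https://musl.cc; a rough sketch of consuming one (the tarball name is illustrative):

```sh
# Hypothetical: fetch a prebuilt, statically linked musl cross toolchain
# instead of building musl-cross-make from source.
curl -LO https://musl.cc/x86_64-linux-musl-cross.tgz
tar xf x86_64-linux-musl-cross.tgz
export PATH="$PWD/x86_64-linux-musl-cross/bin:$PATH"
x86_64-linux-musl-gcc --version    # a gcc that targets (and links) musl
```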

Author

aesrentai commented Jul 18, 2022

Also, I really see two separate issues here:

  1. Building the toolchain (musl-cross and other required bits)
  2. Building and assembling the final binary.

Docker takes care of 1, since we can just apt-get install (or otherwise grab binaries from the internet), whereas linux-builder should take care of 2.

It may also be worth taking the opportunity to unify all the working directories (build, install, and packages) into a single working directory to make the filesystem saner, and perhaps even unify all the configs, since having configs outside the config folder is not ideal, to say the least.

Also, while we should support any target architecture, when it comes to the host architecture it seems like a safe bet that builders will be running x86_64.

Collaborator

I will come back to you later on, but basically you are redoing what osresearch did (without pinning packages), which CircleCI was using before 9ab033a.

Which was deployed over https://hub.docker.com/r/osresearch/heads-ubuntu/

Author

aesrentai commented Jul 19, 2022

I saw those Docker images originally and figured there was no more interest in them, since the last activity was 4 years ago and, more importantly, I didn't see them anywhere in the official build guides. It only took me ~30 minutes to spin up that Docker demo anyway, in #1191.

As an aside, you mentioned that we are going to migrate to a new build system based on linux-builder, but has anyone explored the possibility of using Meson? linux-builder is essentially the current build system ported to Python, which is a significant improvement over GNU Make, but it is still a bunch of scripts we've hacked together rather than a robust and extensible build system. I'm probably going to see how long it takes me to rewrite the entire build system in Meson even if there is no upstream interest, just because it seems like the perfect fit for this project (a small sketch follows the list). Among other things, Meson offers:

  • support for reproducible builds
  • simple yet powerful cross compilation support
  • ability to integrate external modules with different build systems
  • build files that are relatively easy to read, especially compared to Make
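
As a taste of the cross-compilation support, a Meson cross file is just a small ini file handed to `meson setup`; here is a hypothetical one pointing at a musl cross toolchain (binary names are placeholders):

```ini
# cross/x86_64-musl.ini -- hypothetical Meson cross file
[binaries]
c     = 'x86_64-linux-musl-gcc'
ar    = 'x86_64-linux-musl-ar'
strip = 'x86_64-linux-musl-strip'

[host_machine]
system     = 'linux'
cpu_family = 'x86_64'
cpu        = 'x86_64'
endian     = 'little'
```

It would be used as `meson setup build --cross-file cross/x86_64-musl.ini`.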

Collaborator

The discussion continued here: osresearch/linux-builder#1 (comment)

Collaborator

tlaurion commented Jul 25, 2022

The reason this work is fundable is that toolsets exist so we don't have to reinvent the wheel, and so that maintainership is easier in the future.

  • Here is NixOS documentation on how to build reproducible Docker images (the why and how can be deduced from there); a sketch of the approach follows this list.
  • Here is how you can use a NixOS derivation to build coreboot, skipping coreboot's own toolchain (not recommended unless we reproduce the toolchain for all coreboot versions: Heads currently builds with 4.11, 4.13 and 4.15, where 4.17 is desired for the current boards based on 4.13 and 4.15, but the Talos II, KGPE-D16 and Librem server will continue to be based on specific coreboot branches).
  • The desired (hoped-for) path is to pin the package list so that we do not bind the Nix tools to specific versions (we only specify which tools need to be installed), replicating what we currently do with Debian (in CI and on the OS) to install what is needed to build Heads, but this time covering more completely all the module binaries poked at the modules' configure step prior to building.
  • One current problem with the Heads build system is that the blobs scripts must be launched manually prior to builds for those builds to succeed (nothing in the build depends on the blobs scripts having run). As of now, those steps are hidden inside CircleCI steps.
    • The ideal here is that all the packages required by the blobs scripts would already be in the Docker image, and of course that the board configuration could call the blobs download/extract scripts directly (since we cannot host the blobs those scripts produce). As of now, those blobs are downloaded by CI and made available so each board's coreboot compilation succeeds.
  • As for the coreboot-version-specific buildstacks, as said before, I am not sure it would be the Docker image's job to contain them, since those would be really big. I think having them cached by CircleCI (as currently; cache layers are kept 30 days) is not a big problem, and end users would download a newer coreboot version and build its buildstack when the coreboot module's version is bumped. Few users build for more than a board or two.
  • The outcome of this, if the build path is fixed as well, is that CircleCI and local builds should produce matching ROM checksums.
  • Not sure how to proceed with releases at that point, but the expected (hoped-for) outcome is that the ROMs will be reproducible, and the ROMs for a given Heads commit ID could be output at the end of a CircleCI build. Not sure how the interactions between CircleCI and GitHub could be coordinated at that level, but a Heads commit ID should produce the same ROM hashes, which could be reported back to GitHub for each commit.
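
Referring to the first bullet, a minimal sketch of what nixpkgs' dockerTools gives us, assuming the pinned package set from the earlier shell.nix sketch (image name and tool list are illustrative):

```nix
{ pkgs ? import <nixpkgs> {} }:   # in practice: the pinned tarball, not <nixpkgs>

# dockerTools builds the image layers from the Nix store alone, with a
# fixed creation date, so the same inputs yield the same image.
pkgs.dockerTools.buildImage {
  name = "heads-builder";
  tag  = "pinned";
  copyToRoot = pkgs.buildEnv {        # `contents` in older nixpkgs
    name  = "heads-build-tools";
    paths = with pkgs; [ bashInteractive coreutils gnumake gcc git ];
  };
}
```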

@aesrentai @osresearch does this make sense?

The reason linux-builder is desired is parallelization of builds on the local machine and easier-to-understand code. (The Makefile has proven hard to change with predictable results; one recent example is the ROM hashes under hashes.txt containing the full path of each ROM rather than its local path. We lost ROM hashes after pinning a local gawk version, and make and gawk were then removed since they were no longer needed. The build system is hard to maintain as it stands, and pinning Debian packages doesn't seem any easier to maintain.)

Collaborator

tlaurion commented Aug 16, 2022

Assuming Nix refers to NixOS

@aesrentai:
Nix here refers to NixOS' package management system, not the OS as a whole. The Nix package manager can be applied as a layer on top of any Linux distribution, with what is to be exposed to a nix-shell defined prior to entering that environment, pinning the versions to be exposed.

Basically, developers can deploy the Nix package management layer on top of any Linux host (or even a Qubes template), create Docker images the same way, or use NixOS directly, prior to jumping into a nix-shell that forces the use of the deployed package versions and paths. Those versions and paths will be the same for everyone deploying that Nix package layer, so if they bleed into the builds, they bleed the same paths, binaries and versions, consistently; a usage sketch follows.
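
Concretely, once a pinned shell.nix like the one sketched earlier is in the repo, the workflow could be as simple as (board name illustrative):

```sh
# Same command on any distro with Nix, inside a Docker image, or on NixOS;
# --pure drops the host environment so only the pinned tools are visible.
nix-shell --pure --run 'make BOARD=qemu-coreboot'
```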

As of today, I repeat, building from CircleCI bleeds the build path (/project) into some of the modules through upstream rpath issues in those modules. Heads currently tries (and has failed to maintain along the way) proper patches to make sure that neither rpath, build path, nor anything else covered by https://reproducible-builds.org/ leaks into the builds, as well as relying on stable tarball sources (and on that I agree with you: Debian archives should be used, and coreboot should also use those tarballs instead of projects' webpages, which have failed historically for different reasons, including certs not renewed on time, removal of archives, etc.).

The "How" section of https://reproducible-builds.org/ describes exactly what we want to resolve without adding another maintainership problem:

First, the build system needs to be made entirely deterministic: transforming a given source must always create the same result. For example, the current date and time must not be recorded and output always has to be written in the same order.

Second, the set of tools used to perform the build and more generally the build environment should either be recorded or pre-defined.

Third, users should be given a way to recreate a close enough build environment, perform the build process, and validate that the output matches the original build.

This is where we have historically failed (and we are not alone). Even if the environment is the same, a different build path will cause issues unless the modules' sources are patched. Ideally we abstract away from that and contribute upstream; ideally we would not have to patch tarballs unless we customize for our special use cases (depending on mbedtls, because SPI space is scarce, is one good reason). A sketch of common mitigations follows.
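
For a concrete idea of the upstream-friendly knobs involved (GCC >= 8 and the SOURCE_DATE_EPOCH convention; whether each module's build honors them varies):

```sh
# Normalize the two most common bleed sources: embedded timestamps and the
# absolute build path (e.g. CircleCI's /project) recorded in binaries.
export SOURCE_DATE_EPOCH=1
make CFLAGS='-ffile-prefix-map=/project=.'
```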

So Docker is a good avenue, yes. But having something that fixes once and for all what is used and exposed on/from that environment to the builds is the way to go, and the Nix package system addresses exactly that problem space when used as a build layer.

@tlaurion tlaurion added this to the reproduciblebuilds milestone Oct 5, 2022
Collaborator

tlaurion commented May 25, 2024

We switched to nix develop-created Docker images in #1269.
