nitropad-nv41 v0.2.0-2147-g1e583e0 rom kernel panic #1668

Closed
13 tasks done
aluciani opened this issue May 10, 2024 · 8 comments

Comments

@aluciani

aluciani commented May 10, 2024

Please identify some basic details to help process the report

A. Provide Hardware Details

1. What board are you using (see list of boards here)?
nitropad-nv41

2. Does your computer have a dGPU or is it iGPU-only?

  • iGPU-only

3. Who installed Heads on this computer?

  • Self-installed

4. What PGP key is being used?

  • Nitrokey 3 mini

5. Are you using the PGP key to provide HOTP verification?

  • Yes

B. Identify how the board was flashed

1. Is this problem related to updating heads or flashing it for the first time?

  • Updating heads

2. If the problem is related to an update, how did you attempt to apply the update?

  • Using the Heads GUI

3. How was Heads initially flashed

  • External flashing

4. Was the board flashed with a maximized or non-maximized/legacy rom?

  • Maximized

5. If Heads was externally flashed, was IFD unlocked?

  • Don't know

C. Identify the rom related to this bug report

1. Did you download or build the rom at issue in this bug report?

  • I built it

Please provide the release number or otherwise identify the rom downloaded

3. If you built your rom, which repository:branch did you use?

  • Heads:Master heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom

4. What version of coreboot did you use in building?

  • 4.8.1 (current default in heads:master)

5. In building the rom where did you get the blobs?

  • Extracted from the online bios using the automated tools provided in Heads

Please describe the problem

I wanted to update Heads on my nitropad-nv41. I booted my Debian 11 system, ran the command make BOARD=nitropad-nv41, then put the result on a USB key to update Heads on the NitroPad. I updated using the GUI (retain settings).
I rebooted and now I have a problem:

Kernel panic:No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
Kernel offset: 0x2e000000 from 0xffffffff81000000 (relocation range: 0xffffffff00000000-0xffffffffbfffffff)
---[ end Kernel panic not syncing. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.]---

The bootsplash is displayed just before and then I get this

Describe the bug
kernel panic from heads at startup just after the bootsplash

To Reproduce
Steps to reproduce the behavior:

  1. start up a Debian 11 system
  2. build the image with make BOARD=nitropad-nv41
  3. flash the rom into the board
  4. reboot the board
  5. See error

Expected behavior
Heads should boot and reach the GUI.

Screenshots
If applicable, add screenshots to help explain your problem.

Additional context
Add any other context about the problem here.
I can add a picture of the error on the screen for more context. I wanted to send the ROM binary, but I don't think I can.

EDIT : I was using the .zip file containing the rom and the hash

@aluciani
Author

aluciani commented May 11, 2024

Update: not reproducible on my side. I just built the same ROM on the same computer and flashed it externally, and the nitropad-nv41 booted my Debian again.

Here is the only proof I will have of this bug
photo_2024-05-11_11-22-12

@tlaurion
Collaborator

Repro.

Look at build log last lines at https://app.circleci.com/pipelines/github/linuxboot/heads/767/workflows/0b1c3842-40cd-444c-b64c-e5fb2d5a2114/jobs/16361/parallel-runs/0/steps/0-102:

2024-05-10 20:53:32+00:00 INSTALL   build/x86/coreboot-nitrokey/nitropad-nv41/coreboot.rom => build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom
if cmp --quiet "/heads/build/x86/coreboot-nitrokey/nitropad-nv41/coreboot.rom" "/heads/build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom" ; then echo "`date --rfc-3339=seconds` UNCHANGED build/x86/coreboot-nitrokey/nitropad-nv41/coreboot.rom" ; fi ; cp -a --remove-destination "/heads/build/x86/coreboot-nitrokey/nitropad-nv41/coreboot.rom" "/heads/build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom" ; 
29bd879c005bc5968b8f2d67a2f14f07d63ff5397624e0199440d1bbf0e50373  build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom
33554432:build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom
rm -rf "/heads/build/x86/nitropad-nv41/update_pkg"
mkdir -p "/heads/build/x86/nitropad-nv41/update_pkg"
cp "/heads/build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom" "/heads/build/x86/nitropad-nv41/update_pkg/"
cd "/heads/build/x86/nitropad-nv41/update_pkg" && sha256sum "heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom" >sha256sum.txt
cd "/heads/build/x86/nitropad-nv41/update_pkg" && zip -9 "/heads/build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.zip" "heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom" sha256sum.txt
  adding: heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom (deflated 62%)
  adding: sha256sum.txt (deflated 14%)
29bd879c005bc5968b8f2d67a2f14f07d63ff5397624e0199440d1bbf0e50373  /heads/build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom
33554432:/heads/build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom

Rebuilt locally:

2024-05-11 13:14:43+00:00 INSTALL   build/x86/coreboot-nitrokey/nitropad-nv41/coreboot.rom => build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom
29bd879c005bc5968b8f2d67a2f14f07d63ff5397624e0199440d1bbf0e50373  build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom
33554432:build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom
rm -rf "/home/user/heads/build/x86/nitropad-nv41/update_pkg"
mkdir -p "/home/user/heads/build/x86/nitropad-nv41/update_pkg"
cp "/home/user/heads/build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom" "/home/user/heads/build/x86/nitropad-nv41/update_pkg/"
cd "/home/user/heads/build/x86/nitropad-nv41/update_pkg" && sha256sum "heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom" >sha256sum.txt
cd "/home/user/heads/build/x86/nitropad-nv41/update_pkg" && zip -9 "/home/user/heads/build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.zip" "heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom" sha256sum.txt
  adding: heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom (deflated 62%)
  adding: sha256sum.txt (deflated 14%)
29bd879c005bc5968b8f2d67a2f14f07d63ff5397624e0199440d1bbf0e50373  /home/user/heads/build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom
33554432:/home/user/heads/build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom

@123ahaha : Maybe you didn't follow change of buildsystem to nix that happened yesterday through that exact commit?

@tlaurion
Collaborator

tlaurion commented May 11, 2024

@123ahaha

EDIT : I was using the .zip file containing the rom and the hash

The build system embeds only the calculated ROM hash, as sha256sum.txt, alongside the ROM in the zip file, so the user no longer has to validate it manually. This is why zip files have existed for internal firmware upgrades since #1526 (November 17th 2023).

Its sole purpose is to guarantee that the ROM in the zip file, which is now used as the firmware upgrade package, matches the ROM hash computed at build time. That hash doesn't imply the build was reproducible.
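
A minimal sketch of the manual check that the zip package replaces, assuming `sha256sum` and `unzip` are available. The function name `verify_update_dir` and the `/tmp/update_pkg` path are hypothetical and not part of Heads; the function checks an already-unpacked package against the `sha256sum.txt` shipped next to the ROM.

```shell
#!/bin/sh
# Hypothetical helper, not part of Heads: verify an unpacked update package.
# Returns 0 only if every file listed in sha256sum.txt matches its hash.
verify_update_dir() {
    pkg_dir="$1"
    [ -f "$pkg_dir/sha256sum.txt" ] || return 2
    # sha256sum -c reads "<hash>  <file>" lines and re-hashes each listed file.
    (cd "$pkg_dir" && sha256sum -c sha256sum.txt) >/dev/null 2>&1
}

# Usage (the destination path is an example):
#   unzip -d /tmp/update_pkg heads-nitropad-nv41-v0.2.0-2147-g1e583e0.zip
#   verify_update_dir /tmp/update_pkg && echo "ROM matches build-time hash"
```

A passing check only says the ROM in the package is the one that was built; it says nothing about whether that build matches CI.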

What changed yesterday by merging #1661 is that CircleCI will produce the same ROMs as those built locally, provided the local build is clean, which CircleCI's always is.

I'm sorry, I cannot reproduce your issue. The panic suggests that, for some reason, your initrd.cpio.xz file got corrupted: the init script that couldn't be found is part of heads.cpio (scripts, security policies), which is packed inside it. Why or how? We cannot know unless you still have the ROM image lying around.

Let's get practical, with the goal of understanding what is packed inside the ROM.
Check https://output.circle-artifacts.com/output/job/6916d09c-5784-4292-bb44-d559923d9d17/artifacts/0/build/x86/nitropad-nv41/hashes.txt and bear with me for a minute.

If you take a look at the hashes.txt produced alongside your ROM, you will see the hash of each file as of the last step of building it: each cpio archive (recorded when the archive is created), then the initramfs (initrd.cpio.xz) and the kernel (bzImage), which are stitched reproducibly into the ROM as the coreboot payload by the final coreboot build phase.
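
As a rough sketch of how two hashes.txt files (one local, one from CI) could be compared to find which component differs: the helper name `compare_hashes` is hypothetical, and it assumes only that hashes.txt is line-oriented text, not any particular layout.

```shell
#!/bin/sh
# Hypothetical helper: show lines that differ between two hashes.txt files.
# Both files are sorted first so that line order does not matter.
compare_hashes() {
    tmp_a=$(mktemp) || return 2
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 2; }
    sort "$1" > "$tmp_a"
    sort "$2" > "$tmp_b"
    status=0
    # diff exits 0 when the inputs are identical and 1 when they differ.
    diff "$tmp_a" "$tmp_b" || status=$?
    rm -f "$tmp_a" "$tmp_b"
    return "$status"
}

# Usage (file names are examples):
#   compare_hashes local-hashes.txt ci-hashes.txt && echo "all components match"
```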

So if the ROM hash is reproducible, all of its parts are reproducible too:

  • heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom, containing:
    • bzImage (compiled in kernel)
    • initrd.cpio.xz, containing:
      • heads.cpio (scripts, security policies)
      • tools.cpio (libraries, binaries)
      • modules.cpio (kernel modules to be loaded on demand inside of heads)

So again, if the ROM has the same hash locally as on CircleCI, then we have reproducible builds.
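
A minimal sketch of that comparison, using the hash and size printed in the CircleCI log quoted earlier in this thread. The helper name `check_rom` is hypothetical and not part of Heads.

```shell
#!/bin/sh
# Hypothetical helper: compare a ROM's SHA-256 and byte size with expected values.
check_rom() {
    rom="$1"; want_sha="$2"; want_size="$3"
    [ -f "$rom" ] || return 2
    got_sha=$(sha256sum "$rom" | cut -d' ' -f1)
    got_size=$(wc -c < "$rom" | tr -d ' ')
    [ "$got_sha" = "$want_sha" ] && [ "$got_size" = "$want_size" ]
}

# Usage, with the values from the CI log for commit 1e583e0:
#   check_rom build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom \
#       29bd879c005bc5968b8f2d67a2f14f07d63ff5397624e0199440d1bbf0e50373 33554432
```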


Do you happen to still have the old ROM image from your local build, the one with the corrupted initrd.cpio.xz, on your USB thumb drive?

A reminder that build instructions at https://github.com/tlaurion/heads/blob/ecbfdbc57b23ef0b884b394e1ad97491b8d2f8b6/README.md#build-docker-from-nix-develop-layer-locally are to be followed.

@JonathonHall-Purism if this kind of issue happens more than one other time in the future, I think I will move the local docker build creation steps to the devel section of heads-wiki and force users to download the docker image only, so that nothing can come in between what is done by CI and what users can do.

@123ahaha can you shed some light on what happened or, better, send me your old rom you flashed from USB Thumb drive?

If no repro, it didn't happen, unfortunately. The goal now is to understand how you were able to build, from 1e583e0, a ROM that differed somehow without it being marked "dirty".

@aluciani
Author

Maybe you didn't follow change of buildsystem to nix that happened yesterday through that exact commit

I did, but I thought we could still use the old method, "make BOARD=nitropad-nv41", I was on a debian-11 system so didn't need the nix build system ...

send me your old rom you flashed from USB Thumb drive?

Yeah ... I'm sorry, I'm still not used to production environments and bug reports, so I thought about it last night, but this morning I just rm -rf the repo git heads and I started from 0, and to start again on something clean I cleaned my USB key ... so no I no longer have the corrupted rom ...

If no repro, it didn't happen, unfortunately

I think it's going to end up like this, I can't supply the rom and the only things I can help with are the steps I took to get to flash using the heads GUI tool.
The only proof is the photo of the screen taken. (which doesn't prove much if you want to go all the way, and doesn't help much).

I'm going to describe once again the steps I took to reach the kernel panic:

  1. boot a debian 11 (clean, specific for build head, bare OS not a Qubes vm)
  2. go to the git repo, do a git pull to get the new commits
  3. make BOARD=nitropad-nv41
  4. plug in my USB key, cp /home/user/heads/build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom /media/user/4GO
  5. boot nitropad-nv41, go to update utility
  6. update with freshly copied .zip file
  7. reboot
    -> nitropad reboot, displays bootsplash and kernel panic.
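
The copy in step 4 can be sketched with a check that the file on the USB key is byte-identical to the build output. This is only an illustration: the helper name `copy_and_verify` is hypothetical, and nothing in this thread establishes that the copy was the cause.

```shell
#!/bin/sh
# Hypothetical helper: copy a file to a mounted USB key and confirm the copy.
copy_and_verify() {
    src="$1"; dest_dir="$2"
    dest="$dest_dir/$(basename "$src")"
    cp "$src" "$dest" || return 1
    sync                     # flush cached writes to the device
    cmp -s "$src" "$dest"    # exit 0 only if both files are byte-identical
}

# Usage, with the paths from step 4:
#   copy_and_verify \
#       /home/user/heads/build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom \
#       /media/user/4GO
```

Because the comparison may be served from the page cache, unplugging and re-plugging the key before running `cmp` again gives a stronger check.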

@tlaurion
Collaborator

Maybe you didn't follow change of buildsystem to nix that happened yesterday through that exact commit

I did, but I thought we could still use the old method, "make BOARD=nitropad-nv41", I was on a debian-11 system so didn't need the nix build system ...

send me your old rom you flashed from USB Thumb drive?

Yeah ... I'm sorry, I'm still not used to production environments and bug reports, so I thought about it last night, but this morning I just rm -rf the repo git heads and I started from 0, and to start again on something clean I cleaned my USB key ... so no I no longer have the corrupted rom ...

If no repro, it didn't happen, unfortunately

I think it's going to end up like this, I can't supply the rom and the only things I can help with are the steps I took to get to flash using the heads GUI tool. The only proof is the photo of the screen taken. (which doesn't prove much if you want to go all the way, and doesn't help much).

I'm going to describe once again the steps I took to reach the kernel panic:

  1. boot a debian 11 (clean, specific for build head, bare OS not a Qubes vm)
  2. go to the git repo, do a git pull to get the new commits
  3. make BOARD=nitropad-nv41
  4. plug in my USB key, cp /home/user/heads/build/x86/nitropad-nv41/heads-nitropad-nv41-v0.2.0-2147-g1e583e0.rom /media/user/4GO
  5. boot nitropad-nv41, go to update utility
  6. update with freshly copied .zip file
  7. reboot
    -> nitropad reboot, displays bootsplash and kernel panic.

Same thing here, I can only go with logic, but the point I don't get is how you got a "dirty" ROM that was not tagged dirty when building per the old instructions.

I will try to repro using my debian-12 machine, building a qemu ROM the way things were done prior to #1661 being merged, and see whether there needs to be a big fat warning in case the initrd gets corrupted.

@123ahaha thanks for the report, I just don't quite know what to do with it yet.

@tlaurion
Collaborator

tlaurion commented May 11, 2024

Nope. Couldn't reproduce

I guess that the second time, when you got a clean reproducible ROM, you rebuilt from a clean checkout, whereas the first time you did not.


The only thing to report is that the targets/qemu.mk target for qemu boards was modified because the docker image builds as root inside its container, so I had to use sudo to launch the make run part.

The previous invocation, which worked prior to #1661:

cd ~/heads && make BOARD=qemu-coreboot-fbwhiptail-tpm2 PUBKEY_ASC=~/pubkey.asc inject_gpg 
make BOARD=qemu-coreboot-fbwhiptail-tpm2  USB_TOKEN=Nitrokey3NFC PUBKEY_ASC=~/pubkey.asc ROOT_DISK_IMG=~/qemu-disks/debian-9.cow2 run

But now produces:

----------------------------------------------------------------------
!!!!!! BUILD SYSTEM INFO !!!!!!
System CPUS: 12
System Available Memory: 7538 GB
System Load Average: 2.43
----------------------------------------------------------------------
Used **CPUS**: 12
Used **LOADAVG**: 18
Used **AVAILABLE_MEM_GB**: 7538 GB
----------------------------------------------------------------------
**MAKE_JOBS**: -j12 --load-average=18 

Variables available for override (use 'make VAR_NAME=value'):
**CPUS** (default: number of processors, e.g., 'make CPUS=4')
**LOADAVG** (default: 1.5 times CPUS, e.g., 'make LOADAVG=54')
**AVAILABLE_MEM_GB** (default: memory available on the system in GB, e.g., 'make AVAILABLE_MEM_GB=4')
**MEM_PER_JOB_GB** (default: 1GB per job, e.g., 'make MEM_PER_JOB_GB=2')
----------------------------------------------------------------------
!!!!!! Build starts !!!!!!
mkdir -p "/home/user/heads/build/x86/qemu-coreboot-fbwhiptail-tpm2/vtpm"
swtpm_setup --create-config-files root skip-if-exist
File /home/user/.config/swtpm-localca.conf already exists. Refusing to overwrite.
make: *** [targets/qemu.mk:31: /home/user/heads/build/x86/qemu-coreboot-fbwhiptail-tpm2/vtpm/.manufacture] Error 1

Which would now need to be (UNSUPPORTED ANYWAY POST #1661):

cd ~/heads && make BOARD=qemu-coreboot-fbwhiptail-tpm2 PUBKEY_ASC=~/pubkey.asc inject_gpg 
sudo make BOARD=qemu-coreboot-fbwhiptail-tpm2  USB_TOKEN=Nitrokey3NFC PUBKEY_ASC=~/pubkey.asc ROOT_DISK_IMG=~/qemu-disks/debian-9.cow2 run

@tlaurion
Collaborator

@123ahaha https://osresearch.net/general-building/#generic should address the confusion linked to the Nix buildstack change and prevent other users from encountering initrd corruption and ROM non-reproducibility.

Feel free to tag @tlaurion if you disagree with this issue closure.

@JonathonHall-Purism
Collaborator

This could be plausibly explained by a bad flash, IMO. I've had exactly this happen for a bad flash - the initrd follows the kernel, so it is possible for a flash to be interrupted at a point where the kernel loads but the initrd is corrupt, producing this result. Hardware flash would then solve it, which is consistent with the observed behavior.

There was a rebuild in between (not sure whether it was a clean + build or just 'make' again), but that too does not point to a problem in the build environment.

If the rebuild actually did change the ROM (no way to know now), a more likely explanation IMO is a deficiency in our Makefile that didn't rebuild something that was needed. I'm pretty sure there are still some examples of this (e.g. if you commit, it doesn't necessarily rebuild everything that depends on the version, or if you check out a new commit and module installed files have changed, old files can be left in install/x86/, etc.).

In the future, when a flash leaves the board nonbooting, it'd be really helpful to dump the ROM contents before hardware flashing, if possible, so they can be used for postmortem analysis 🙂
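
A minimal sketch of such a dump, assuming an external programmer supported by flashrom. The helper name `dump_rom_twice`, the `FLASHROM` override, and the `ch341a_spi` programmer in the usage line are illustrative assumptions, not something this thread specifies. It reads the chip twice and keeps the dump only if both reads agree.

```shell
#!/bin/sh
# Hypothetical helper: read the flash chip twice and keep the dump only if
# both reads are identical. FLASHROM may be overridden; default is "flashrom".
dump_rom_twice() {
    programmer="$1"; out="$2"
    "${FLASHROM:-flashrom}" -p "$programmer" -r "$out.read1" || return 1
    "${FLASHROM:-flashrom}" -p "$programmer" -r "$out.read2" || return 1
    if cmp -s "$out.read1" "$out.read2"; then
        mv "$out.read1" "$out" && rm -f "$out.read2"
        sha256sum "$out"     # record the hash of the preserved dump
    else
        echo "reads differ; check the clip or wiring and retry" >&2
        return 1
    fi
}

# Usage (programmer name is an assumption; use the one matching your hardware):
#   dump_rom_twice ch341a_spi nonbooting-dump.rom
```

The saved dump can then be compared with the ROM that was meant to be flashed, for example with `cmp`, before the chip is rewritten.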

@JonathonHall-Purism if this kind of issue happens more than one other time in the future, I think I will move the local docker build creation steps to the devel section of heads-wiki and force users to download the docker image only, so that nothing can come in between what is done by CI and what users can do.

Probably best for a larger discussion if you are really thinking about making such a change - but IMO it is worthwhile to keep the docker image build steps where they are.

  1. Like I said above I don't really think the evidence here points to a problem in the build environment 🤔
  2. The goal of reproducible builds is to distribute trust, rather than centralizing trust in whoever produced that particular build. If the build can only be reproduced by downloading a nonreproducible binary artifact produced by a single party, it defeats the purpose of reproducible builds.
  3. Reproducing by rebuilding the build environment with Nix actually works today 🤩 I generally think Nix is precise enough that it will always produce a functionally-equivalent build environment from the information in the repo.
