-
Notifications
You must be signed in to change notification settings - Fork 1.4k
COS-3013: overlay node image before bootstrapping if necessary #8742
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
#!/bin/bash | ||
set -euo pipefail | ||
|
||
UNIT_DIR="${1:-/tmp}" | ||
|
||
if ! rpm -q openshift-clients &>/dev/null; then | ||
ln -sf "/etc/systemd/system/node-image-overlay.target" \ | ||
"${UNIT_DIR}/default.target" | ||
fi |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# This is a separate unit because in the assisted-installer flow, we only want | ||
# `node-image-overlay.service`, not the isolating back to `multi-user.target`. | ||
|
||
[Unit] | ||
Description=Node Image Finish | ||
Requires=node-image-overlay.service | ||
After=node-image-overlay.service | ||
|
||
[Service] | ||
Type=oneshot | ||
# and now, back to our regularly scheduled programming... | ||
ExecStart=/usr/bin/echo "Node image overlay complete; switching back to multi-user.target" | ||
ExecStart=/usr/bin/systemctl --no-block isolate multi-user.target |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
[Unit] | ||
Description=Node Image Overlay | ||
Requires=node-image-pull.service | ||
After=node-image-pull.service | ||
|
||
[Service] | ||
Type=oneshot | ||
ExecStart=/usr/local/bin/node-image-overlay.sh | ||
RemainAfterExit=yes |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
[Unit] | ||
Description=Node Image Overlay Target | ||
Requires=basic.target | ||
|
||
# for easier debugging | ||
Requires=sshd.service getty.target systemd-user-sessions.service | ||
|
||
Requires=node-image-overlay.service | ||
Requires=node-image-finish.service |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
[Unit] | ||
Description=Node Image Pull | ||
Requires=network.target NetworkManager.service | ||
After=network.target | ||
|
||
[Service] | ||
Type=oneshot | ||
# we need to call ostree container (i.e. rpm-ostree), which has install_exec_t, | ||
# but by default, we'll run as unconfined_service_t, which is not allowed that | ||
# transition. Relabel the script itself. | ||
ExecStartPre=chcon --reference=/usr/bin/ostree /usr/local/bin/node-image-pull.sh | ||
ExecStart=/usr/local/bin/node-image-pull.sh | ||
MountFlags=slave | ||
RemainAfterExit=yes |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
#!/bin/bash | ||
set -euo pipefail | ||
|
||
ostree_checkout=/ostree/repo/tmp/node-image | ||
if [ ! -d "${ostree_checkout}" ]; then | ||
ostree_checkout=/var/ostree-container/checkout | ||
fi | ||
|
||
echo "Overlaying node image content" | ||
|
||
# keep /usr/lib/modules from the booted deployment for kernel modules | ||
mount -o bind,ro "/usr/lib/modules" "${ostree_checkout}/usr/lib/modules" | ||
mount -o rbind,ro "${ostree_checkout}/usr" /usr | ||
rsync -a "${ostree_checkout}/usr/etc/" /etc | ||
|
||
# reload the new policy | ||
echo "Reloading SELinux policy" | ||
semodule -R | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We could, but I'd rather like to wait and see what kind of issues we hit before pre-emptively doing that. This may inform other things in this approach. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
#!/bin/bash | ||
set -euo pipefail | ||
|
||
# shellcheck source=release-image.sh.template | ||
. /usr/local/bin/release-image.sh | ||
|
||
# yuck... this is a good argument for renaming the node image to just `node` in both OCP and OKD | ||
coreos_img=rhel-coreos | ||
{{ if .IsOKD }} | ||
coreos_img=stream-coreos | ||
{{ end }} | ||
until COREOS_IMAGE=$(image_for ${coreos_img}); do | ||
echo 'Failed to query release image; retrying...' | ||
sleep 10 | ||
done | ||
|
||
# need to use rpm-ostree here since `bootc status` doesn't work in the live ISO currently | ||
# https://github.com/containers/bootc/issues/1043 | ||
booted_version=$(rpm-ostree status --json | jq -r .deployments[0].version) | ||
|
||
echo "Currently on CoreOS version $booted_version" | ||
echo "Target node image is $COREOS_IMAGE" | ||
|
||
# try to do this in the system repo so we get hardlinks and the checkout is | ||
# read-only, but fallback to using /var if we're in the live environment since | ||
# that's truly read-only | ||
ostree_repo=/ostree/repo | ||
ostree_checkout="${ostree_repo}/tmp/node-image" | ||
hardlink='-H' | ||
if grep -q coreos.liveiso= /proc/cmdline; then | ||
jlebon marked this conversation as resolved.
Show resolved
Hide resolved
|
||
ostree_repo=/var/ostree-container/repo | ||
ostree_checkout=/var/ostree-container/checkout | ||
mkdir -p "${ostree_repo}" | ||
echo "In live ISO; creating temporary repo to pull node image" | ||
ostree init --mode=bare --repo="${ostree_repo}" | ||
# if there are layers, import all the content in the system repo for | ||
# layer-level deduping | ||
if [ -d /ostree/repo/refs/heads/ostree/container ]; then | ||
echo "Importing base content from system repo for deduplication" | ||
ostree pull-local --repo="${ostree_repo}" /ostree/repo | ||
fi | ||
# but we won't be able to force hardlinks cross-device | ||
jlebon marked this conversation as resolved.
Show resolved
Hide resolved
|
||
hardlink='' | ||
else | ||
# (remember, we're MountFlags=slave) | ||
mount -o rw,remount /sysroot | ||
fi | ||
|
||
# Use ostree stack to pull the container here. This gives us efficient | ||
# downloading with layers we already have, and also handles SELinux. | ||
echo "Pulling ${COREOS_IMAGE}" | ||
while ! ostree container image pull --authfile "/root/.docker/config.json" \ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My assumptions / understanding just to confirm:
Question: even though we only want the "node" and "coreos" layers, is it correct to assume that this command will pull the entire image including its rhel base image, unless that rhel base image happens to be the same as the one we're running at the moment? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yes, exactly!
Nomenclature here gets messy, but basically I've been using:
The really confusing thing is that the node image in the payload is still called Here, we're running on a RHEL CoreOS system (e.g. we still need Ignition for example). And the
Think of the "layers" described in the enhancement as more conceptual than literal. In practice, the base RHCOS OCI images are chunked into many layers, grouping together files based on heuristics to try to maximize likelihood of sharing content. The node layer I guess is both conceptual and literal, since it all happens in one Though I will note it'll be more inefficient than that at the start because the node image will still be built the old way while the bootimages have moved to the new model. And of course, one can't not mention zstd:chunked here which should make it much easier to get efficiency than when playing with layers where we need to make sure everything is lined up just right (and notably that layer digests haven't been recalculated somewhere along the way in the pipelines). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
If you want an example of this BTW, you can look at the layers for any of the images in https://quay.io/repository/fedora/fedora-coreos?tab=tags. |
||
"${ostree_repo}" ostree-unverified-image:docker://"${COREOS_IMAGE}"; do | ||
echo 'Failed to fetch release image; retrying...' | ||
sleep 10 | ||
done | ||
|
||
# ideally, `ostree container image pull` would support `--write-ref` or a | ||
# command to escape a pullspec, but for now it's pretty easy to tell which ref | ||
# it is since it's the only docker one | ||
ref=$(ostree refs --repo "${ostree_repo}" | grep ^ostree/container/image/docker) | ||
if [ $(echo "$ref" | wc -l) != 1 ]; then | ||
echo "Expected single docker ref, found:" | ||
echo "$ref" | ||
exit 1 | ||
fi | ||
ostree refs --repo "${ostree_repo}" "$ref" --create coreos/node-image | ||
|
||
# massive hack to make ostree admin config-diff work in live ISO where /etc | ||
# is actually on a separate mount and not the deployment root proper... should | ||
# enhance libostree for this (remember, we're MountFlags=slave) | ||
jlebon marked this conversation as resolved.
Show resolved
Hide resolved
|
||
if grep -q coreos.liveiso= /proc/cmdline; then | ||
mount -o bind,ro /etc /ostree/deploy/*/deploy/*/etc | ||
fi | ||
|
||
# get all state files in /etc; this is a cheap way to get "3-way /etc merge" semantics | ||
etc_keep=$(ostree admin config-diff | cut -f5 -d' ' | sed -e 's,^,/usr/etc/,') | ||
|
||
# check out the commit | ||
echo "Checking out node image content" | ||
ostree checkout --repo "${ostree_repo}" ${hardlink} coreos/node-image "${ostree_checkout}" --skip-list=<(cat <<< "$etc_keep") | ||
|
||
# in the assisted-installer case, nuke the temporary repo to save RAM | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How much RAM do we need to budget for this? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What is special about the assisted installer here? Does the assisted installer treat the bootstrap node separately? Perhaps this is mixed up with agent installer? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I assume it's because assisted-installer runs the "bootstrap in place" process from a live ISO, so the filesystem is in RAM. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Pretty much the size of the node image itself. I said ~1G when we discussed, but looking now, that's the compressed size. Uncompressed it's ~2.2G. I would add a healthy percent margin on top. Worth mentioning I did test a SNO install with this in a 16G VM which is the minimum requirement. I'm not sure though how much headroom was left. I was actually pleasantly surprised when working on this to learn that bootstrapping needs less than 16G of container images. :) Do you need more precise information?
Right, it's because we're booted from the ISO. We can't import into the system OSTree repo since it's read-only as part of the squashfs, so we have to create a separate OSTree repo and import there. That separate repo is in |
||
if grep -q coreos.liveiso= /proc/cmdline; then | ||
echo "Deleting temporary repo" | ||
rm -rf "${ostree_repo}" | ||
fi |
This file was deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eranco74 @carbonin this is the proxy test for whether the current filesystem needs all of the "node layer" stuff to be layered on. Any concerns for the assisted-installer live ISO? We don't install this, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we don't install anything additional on the rootfs