Add a doc for container provisioning and updates #540

Open
wants to merge 1 commit into
base: main
from

Conversation

cgwalters
Member

The layering model is an entirely new way to do systems management. Let's document the current state.

Member

@dustymabe dustymabe left a comment

Some comments. I'm still not sure we should doc this (we were making progress on coreos/fedora-coreos-tracker#1263 but then f38 and rhel9 fires took us away) yet.

If we do I think we should call out more clearly the need to implement your own build system and the loss of managed autoupdates (update graph). The autoupdates part is kind of implied because of the documented update unit/timer, but we should make it clearer.

modules/ROOT/pages/deriving-container.adoc (6 review threads, all resolved; 5 on outdated code)
@cgwalters cgwalters force-pushed the doc-firstboot-rebase-container branch from f54e15e to 1626789 Compare May 2, 2023 12:20
@cgwalters
Member Author

I'm still not sure we should doc this (we were making progress on coreos/fedora-coreos-tracker#1263

I commented on the issue. Indeed, documenting this is tantamount to support. But I think the state of things will be pretty clear to users from the docs.

If we do I think we should call out more clearly the need to implement your own build system and the loss of managed autoupdates (update graph). The autoupdates part is kind of implied because of the documented update unit/timer, but we should make it clearer.

I think automatic updates are still present; they just require slightly more work. But I believe that anyone who is doing anything nontrivial with an FCOS-like system will already be invested in systems management infrastructure, and specifically in using containers.

cgwalters added a commit to cgwalters/rpm-ostree that referenced this pull request May 2, 2023
This replaces the need for `StandardOutput=null` in the automatic
upgrade unit, and can also be used for custom upgrade units, such
as that proposed in
coreos/fedora-coreos-docs#540
Member

@jlebon jlebon left a comment

Documenting it as is today seems fine but given the high friction and shortcomings, let's label it as experimental?

Member

Shouldn't this be linked from the nav tree? Not sure where it'd belong... Going along with the experimental labeling comment, maybe easiest is to have a new "Experimental features" parent with this as the first child?

Member Author

What would you consider experimental about this?

Member Author

Or, to flip it around - in your view, what would be the criteria to "graduate"?

Member

It just doesn't seem to me like we've worked through all the details. Container layering impacts multiple parts of FCOS we're deeply opinionated about. We started on this in e.g. coreos/fedora-coreos-tracker#1219 and coreos/fedora-coreos-tracker#1263 but then... fires happened.

At a minimum, I think we should resolve coreos/fedora-coreos-tracker#1263. But also, I think we would need a larger rework of what tests we run and how the docs are structured so it's properly integrated in our provisioning and configuration story. (But ideally, we also introduce a better UX for this stuff.)

How about something like this near the top:

NOTE: Container layering is a new approach to provisioning and configuring Fedora CoreOS. The underlying features are considered stable, but its integration into Fedora CoreOS is subject to change and nodes may in the future require reprovisioning. Note that in this mode, automatic updates are not directly managed by the Fedora CoreOS team and rely on user-managed services.

?

Member Author

It just doesn't seem to me like we've worked through all the details.

Yeah but, speaking bluntly I feel like that's not going to happen unless I keep pushing it, hence this PR.

At a minimum, I think we should resolve coreos/fedora-coreos-tracker#1263.

OK, I have my opinion written there.

but also, I think we would need a larger rework of what tests we run

That's true, filed coreos/fedora-coreos-tracker#1484

But ideally, we also introduce a better UX for this stuff.

That's partly in coreos/butane#428

and nodes may in the future require reprovisioning.

Mmmm. I guess that's a pivotal decision here. I am not foreseeing any changes which would require reprovisioning. Are you?

Note that in this mode, automatic updates are not directly managed by the Fedora CoreOS team and rely on user-managed services.

This part is covered below there too, but sure we can emphasize it. I tossed up this page in https://hackmd.io/QM1V-FujTmalikgi5JQHPw to avoid round trips.
(yeah it's not actually markdown sadly, but I'm just using it as a realtime editor)

Contributor

It just doesn't seem to me like we've worked through all the details.

Yeah but, speaking bluntly I feel like that's not going to happen unless I keep pushing it, hence this PR.

You're welcome to help work through the details, too. 🙂

Member Author

OK, I added a note to the top here which covers what is IMO the top remaining issue we considered "blocking", the one around barriers. It's just up front about that outstanding bug, but I'm hopeful we'll have more infrastructure there by the time we see the need for another barrier.

cgwalters added a commit to cgwalters/rpm-ostree that referenced this pull request May 2, 2023
This replaces the need for `StandardOutput=null` in the automatic
upgrade unit, and can also be used for custom upgrade units, such
as that proposed in
coreos/fedora-coreos-docs#540
cgwalters added a commit to coreos/rpm-ostree that referenced this pull request May 2, 2023
This replaces the need for `StandardOutput=null` in the automatic
upgrade unit, and can also be used for custom upgrade units, such
as that proposed in
coreos/fedora-coreos-docs#540
@cgwalters
Member Author

and nodes may in the future require reprovisioning.

Breaking this one out to the toplevel because I disagree with it the most. Everything done on the (rpm-)ostree side for this has been with an eye to compatibility and making things a seamless transition. Many people are using it for various use cases, including on their personal desktops, and it shipped in OpenShift 4.12.

Saying that reprovisioning may be required is in fundamental conflict with that. I personally am signed up and committed to debug any problems from this that may arise in the future. I don't understand why others on the team don't feel the same way.

cgwalters added a commit to cgwalters/rpm-ostree that referenced this pull request May 3, 2023
This is aiming to help replace the hacky systemd unit
created in coreos/fedora-coreos-docs#540

And to also be useful in other contexts (i.e. not coreos/ignition).
@cgwalters
Member Author

but its integration into Fedora CoreOS is subject to change

Also of course, merging this is effectively saying that's not the case; the butane configuration here is proposed to work into the foreseeable future.

The auto-update unit is pretty lame, but definitely functional. I've got some PRs up to try to clean this up and support it more directly in coreos/rpm-ostree#4392, but I wouldn't consider that a blocker, because ultimately the users pursuing this path are already in a position where they likely want to "own" more of the OS update logic, so these sample units are just a starting point. In effect, layering is kind of saying that things like the zincati remote locking API are instead something that can just be owned by 3rd party agent logic; we provide lower level tools. This also relates to coreos/zincati#904

So the integration will certainly hopefully improve, from my PoV, but I would not say "change" as in "possibly breaking change".

cgwalters added a commit to cgwalters/coreos-layering-examples that referenced this pull request May 4, 2023
@cgwalters cgwalters force-pushed the doc-firstboot-rebase-container branch from 1626789 to c769533 Compare May 4, 2023 12:26
@cgwalters
Member Author

OK, I've updated this with an "Understanding Ignition versus container content" section, and moved the autoupdate code into coreos/layering-examples#58

@cgwalters cgwalters force-pushed the doc-firstboot-rebase-container branch 2 times, most recently from 848fe4f to bba5303 Compare May 4, 2023 17:50
@cgwalters
Member Author

OK, fleshed out the "When to use Ignition" section even more!

cgwalters added a commit to cgwalters/rpm-ostree that referenced this pull request May 5, 2023
This is aiming to help replace the hacky systemd unit
created in coreos/fedora-coreos-docs#540

And to also be useful in other contexts (i.e. not coreos/ignition).
cgwalters added a commit to cgwalters/rpm-ostree that referenced this pull request May 5, 2023
This is aiming to help replace the hacky systemd unit
created in coreos/fedora-coreos-docs#540

And to also be useful in other contexts (i.e. not coreos/ignition).

Closes: coreos#2843
@cgwalters
Member Author

I reworked coreos/fedora-coreos-tracker#1363 to be a tracker that includes this PR

lukewarmtemp pushed a commit to lukewarmtemp/rpm-ostree that referenced this pull request Jun 7, 2023
This replaces the need for `StandardOutput=null` in the automatic
upgrade unit, and can also be used for custom upgrade units, such
as that proposed in
coreos/fedora-coreos-docs#540
lukewarmtemp pushed a commit to lukewarmtemp/rpm-ostree that referenced this pull request Jun 7, 2023
This is aiming to help replace the hacky systemd unit
created in coreos/fedora-coreos-docs#540

And to also be useful in other contexts (i.e. not coreos/ignition).

Closes: coreos#2843
The layering model is an entirely new way to do systems
management.  Let's document the current state.
@cgwalters cgwalters force-pushed the doc-firstboot-rebase-container branch from bba5303 to 2a55dc3 Compare September 6, 2023 19:23
@travier
Member

travier commented Sep 7, 2023

From my perspective, this is blocked on coreos/fedora-coreos-tracker#1367

@cgwalters
Member Author

cgwalters commented Sep 7, 2023

From my perspective, this is blocked on coreos/fedora-coreos-tracker#1367

"Blocked" is IMO a strong term. Why is it blocked on that versus it just being a nice-to-have?

That's a pretty easy thing for someone who is not us to do on their own infrastructure. In fact I'd say supporting "mirror the base image to a custom registry and control it using tooling you know" is really a lot of the point of this.

@travier
Member

travier commented Sep 7, 2023

For me this is really blocking any real usage of ostree native containers for FCOS.

Users need tags for FCOS releases to be able to test / run previous releases, use GitOps to test their updates, use fixed versions in their CI, etc.

We have the same issue for Silverblue & friends.

@navaati

navaati commented Nov 3, 2023

Hi. @cgwalters in coreos/rpm-ostree#4392 you said:

This is aiming to help replace the hacky systemd unit created in #540

I have no idea how it relates. Now that the aforementioned PR is merged, can you update this one to make use of it? I may very well be missing something :).

Also, I see that coreos/butane#428 is closed. Does that mean that the idea of easily using that feature from Ignition is abandoned? What would the end-game UX look like?

Thanks, y’all on the FCOS team, for carrying the dream of immutable infrastructure!

After=network-online.target
[Service]
# This ordering is important
After=ignition-firstboot-complete.service

I think this is now coreos-ignition-firstboot-complete.service

image

From

[core@forgejo-actionsrunner ~]$ cat /etc/os-release 
NAME="Fedora Linux"
VERSION="39.20240210.3.0 (CoreOS)"
ID=fedora
VERSION_ID=39

7 participants