-
Notifications
You must be signed in to change notification settings - Fork 359
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Flatpak preinstallation as part of a DNF install #6056
base: main
Are you sure you want to change the base?
Conversation
f2f1cb2
to
0ec6303
Compare
Funky. This seems like the biggest thing that would mean this wouldn't work for bootc installs as is.
This also relates to containers/bootc#833 (part of bootc's LBIs). And this overall effort also kind of relates to #5197 - specifically where I was thinking that it makes sense to restructure things in the general case so that we do an install to a disk, and then we parse the target root and make further decisions based on that for the bootc flow at least. When we do that, then the need for a lot of magic goes away and we can just parse The thing I would note here is yes (short of the "artifact pattern" I brought up for bootc) we do lose the space calculation but...what I could imagine doing in the general case is having a "minimum size" as an annotation on the container image at least for bootc. Yes it's obviously nice to have precise byte-level info but eh...as long as it's roughly correct that seems sufficient in the general case. |
The list of refs to install comes from a virtual function in the payload, so for rpm-ostree/bootc, you do one of two things: A) Download the contents of /etc/flatpak/preinstall.d to find the same information (this would be pretty easy with an ostree repository, but not so much with a bootc image as OCI or containers/storage). The virtual provides are just a duplicate of this information for the installer's convenience. What is hard to support with the current framework the case where we are installing directly from registry.redhat.io and then that image pulls in some set of Flatpaks for preinstallation also from registry.redhat.io.
Yeah, in the bootc case where the image is bound to set of installed containers, that could be done For DNF case is that until software selection is done, we have no idea whether this is a minimal server install, or a desktop install with runtime+firefox+thunderbird, and reserving several gigabytes extra of space for every RHEL install just in case seemed really ugly. |
Side note: zstd:chunked makes it possible to efficiently access a subset of the files from a remote container image. Not saying we should rely on it or that it's even a good idea, just noting it.
I would implement this as "installer generation copies /etc/flatpak/preinstall.d from the target to the installer ISO". I can imagine use cases where the installer environment has a flatpak that we don't want to be present on the target.
I was thinking of the "minimum space label" in the bootc context, not the RPM context. But I could totally imagine a |
Even so, you'd need to handle layering, etc. Would be messy. Maybe if it's implemented it inside the rpm-ostree stack...
The case where the installer is ostree based and needs a Flatpak (Firefox say) at install time and we want to use the image we're shipping for preinstallation rather than baking the Flatpak into the installer's ostree is tricky - currently installing an image as a Flatpak involves importing it from scratch into the target's /var/lib/flatpak/repo - while doing that at installer boot time might work it's definitely ugly. I can wave my hands here and say something about "composefs support for Flatpak" and "common object store" - but quite a bit of non-trivial work would be involved :-) |
Are you trying to handle a case where there's a single ISO that can support server or desktop install (like the current RHEL 9 ISO) but where flatpaks can exist on the installer ISO but whether or not to install those is a dynamic determination? Rereading, it seems you must be, which I can understand. But it sure does ratchet the complexity up versus shipping a separate opinionated desktop ISO. I know this may seem somewhat radical but did you consider at all making the payload metadata here run via a OCI and not by RPM? And by this I mean something like shipping a container image like |
Yes - in RHEL 10, Firefox is provided only by Flatpak, so the installer ISO has to install the Flatpak iff. the user has selected the appropriate software. (Also Thunderbird, but less important.)
Unfortunately, "server with GUI" is still a thing - we need to treat adding a desktop interface as something that you can optionally add onto a server install.
Don't quite follow how that would work in detail, but generally trying to keep this a fairly minimal change to how we install RHEL. A) system image defines what Flatpaks should be installed B) Anaconda invokes Flatpak to make that happen before first boot. The "magical" Provides links just allow us to fast-forward and figure out what Fltapaks should be installed as soon as the DNF transaction is resolved. |
I don't think in practice there's a way to avoid that. Moving some apps into flatpaks has a large impact on UX both at install time and postinstall. In particular one thing I have learned the hard way is a lot of important customers do disconnected installs - we have to support that super well. And they do it for good reason (being directly connected to the internet brings large risks). Moving things into flatpaks from RPMs means that these people will have to figure out how to mirror the containers in addition to RPMs. This problem was so painful for OpenShift that we moved everything (except the base RHCOS disk images) into a release image which is just a reference to 100 other containers, including the operating system content. So people who want to disconnected installs just need to know how to mirror containers (not RPMs!). Yes, my suggestion of having the metadata for what to install be referenced via a container image (or perhaps better for this use case an OCI artifact) may seem radical but since containers are already involved (via flatpak) I don't think it actually is too radical. Is it hackier to reference containers from an RPM or RPMS + containers from another container (or OCI artifact)? I think the latter is less hacky. It's of course worth noting that I have a pretty strong interest in my own giant change for how RHEL is installed via bootc, where RPM is not the technical root, but I do want flatpak preinstalls to work for that too. If we structured the input to both as a container image (or artifact) that would work more smoothly I think. |
BTW I should clarify that I am not an Anaconda developer so my comments here should not be construed as blocking, merely a discussion. |
@@ -0,0 +1,112 @@ | |||
# | |||
# Kickstart module for Live OS payload. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copy paste typo :)
Hi @owtaylor, we did some changes in the code to enable pylint changes and we also changed the |
@cgwalters, I see your points and honestly I'm thinking how to involve or not involve DNF into the loop. I know the current implementation is attached to DNF payload which works for standard installations but I'm not sure how it will be handled by bootc so we need to find this out. After we have this initial version set, I'm planning to schedule a meeting between you, @owtaylor, us and DNF team to figure the ideal solution for RHEL-11. We should think it through to find out what is the best solution (and I'm not saying that this is not the one). In general it would be great to find solution which would be integrating all the pieces together. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a partial review right now. So far I wasn't able to find any big issue but the whole concept needs a bit of discussion in the team. I mean the whole solution of attaching this to DNF will not work for other types of payload.
I'll finish this review on January. Thanks for the PR!
# | ||
# Kickstart module for Live OS payload. | ||
# | ||
# Copyright (C) 2019 Red Hat, Inc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also the year should be fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
Extract the code for creating a "downloader" (prepared session.get() method) from the tree_info file - we'll use this when querying and downloading Flatpak repositories from remote sources as well.
We'll need the functionality of picking a download location and calculating required space for Flatpak installation as well as for DNF installation, so move the utility functions into payloads/base/utils.py.
This is needed for the flatpak-oci-authenticator daemon, which is needed for all installs from OCI remotes, whether we are authenticating to them or not.
Add new functionality for running Flatpak preinstallation after installing the base system. It's initially implemented only for the DNF payload. Flatpak installation is done as an extra "side" payload that the payloads service calls after the base payload to install additional content. The base payload provides the list of Flatpak refs that should be installed from the side payload and the Flatpaks are installed from a location that is determined from the base payload's primary source - if the source is a local or remote install tree we look for a OCI image layout there, if the base payload is the network (CDN, closest mirror), we install directly from the configured Flatpak remote. For http/ftp payloads we mirror the OCI image layout locally before runing the installation.
0ec6303
to
e193012
Compare
This PR handles disconnected installation whether from a DVD or hard drive install, or from an airgapped ftp/http server - the install tree contains an OCI image layout with all the Flatpaks. It does not, of course, handle disconnected updates - that's outside the scope of Anaconda. It's possible to handle in the same way as this PR works, create an OCI image layout with the latest versions of the Flatpaks (using
I think this is very compatible with bootc - the current approach is intended to be quite aligned with the way you are handling LBI's. If we wanted to pull the set of Flatpaks to preinstall out from the filesystem tree and expose it in container metadata when creating the bootc image, we could definitely do that - that would give the same ability we have for DNF to preflight space calculations. But I don't see a need to break the integration with DNF software selection that we have in this patch - different install method, different integration with the base payload. |
I've rebased it to origin/main, dealing with the import reordering, and checked that the result passes ruff. I have not actually tried an install with the rebase (need to pack for the holidays), but hopefully the imports are at least very close to correct. 😃 |
The integration with DNF here is pluggable, and adapting to bootc should not be hard. I was hoping to actually just go and implement that to clarify how it would work, but figuring out how to create a modified anaconda+bootc install image is more than I have cycles for in the short term.
That sounds good. Basic bootc support is really more something we'll need for 10.x but should be < 100 lines of code on top of this. |
/kickstart-test --testtype smoke |
Is anaconda-webui on the roadmap for 10.x? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First round of the review from me. Some notes below.
|
||
if self._active_payload.needs_flatpak_side_payload(): | ||
payload = self.create_payload(PayloadType.FLATPAK) | ||
assert isinstance(payload, FlatpakModule) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not use asserts in the production code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for the purposes of type checking (I'm using vscode with pylance) - structurally we know the result is a FlatpakModule but tooling doesn't. I would say this is the idiomatic way to do it, but it could be also written:
payload = typing.cast(self.create_payload(PayloadType.FLATPAK), FlatpakModule)
Would that be better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want to change code because of tooling. I'm not strictly against the second solution but it would have to be with comment explaining why this is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like:
# create_payload() creates a different subclass of PayloadBase depending on the argument;
# typing.cast() the result, since we are going to call FlatpakModule methods
payload = typing.cast(self.create_payload(PayloadType.FLATPAK), FlatpakModule)
I can also just drop it... it's not the only place in Anaconda that shows type warnings in vscode, obviously.
[ having typechecking in your editor really makes development in Python much more productive, highly recommend ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want to include now here and in that case I think it should be resolved by the python typing properly on level of method declaration. I'll bring this on the meeting discussion if we don't want to embrace it in our code.
If possible, I wouldn't include it here as the first place this way.
return collection, "/".join(parts) | ||
|
||
|
||
class NoSourceError(Exception): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this exception could reach UI (other side of DBus) it should be placed in modules/common/errors/payload.py
and it needs to have @dbus_error
decorator. The decorator will change the exception to the DBus error and you can create the exception on the UI side.
Also there is already SourceError
exception which I think could be used here with a details in the message.
If it is used only internally then I would recommend it to move to payload/flatpak/errors.py
anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can't reach the UI with the current code structure - both places where it could be thrown are caught in FlatpakManager. Currently we log and ignore to try and make things somewhat robust if a user hasn't set things up properly for a DVD/FTP/HTTP install - you at least get everything else installed. But yes, there is some argument for erroring out as well, which could encourage this to be D-Bus error that just throws through. Do you have an opinion on log+ignore or error out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is just about printing error than it should be fine to just use log.error
without breaking the installation completely.
|
||
@property | ||
def labels(self): | ||
return self.config_json["config"]["Labels"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great if we can just use a tool for this data grabbing. The tool could be part of the flatpak package and be inline when the metadata will change. In general, if possible we are trying to avoid having too much dependency on someones metadata in Anaconda because these could change in time and most of the time the authors don't know we are using it.
From a quick look it seems that half of the source.py
code is about metadata parsing and reading.
This is probably for future improvement but I would like to know what you are thinking about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem here is that only Anaconda knows how to fetch files from a FTP server with the right proxy options, etc. So all of the fetching has to be done by Anaconda and that leaves (from my perspective) a quite messy and ragged interface between the "fetching" part and the "decoding" part. I would have trouble structuring it.
I don't think there's much room for instability here - the file formats are OCI specified, and the metadata strings have to be stable or existing versions of Flatpak will stop being able to use Flatpaks from the registry. Nothing we're using here has changed in ~5 years.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point. We should be fine with this, definitely for now.
pyanaconda/timezone.py
Outdated
import zoneinfo | ||
from collections import OrderedDict | ||
from functools import cache | ||
|
||
import langtable | ||
import zoneinfo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this expected change in this commit? Looks to me like a bug from the rebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that snuck in accidentally - I ran ruff --fix
at some point, though I don't know why it would do that move. Will fix that in the next iteration.
# pylint: disable=environment-modify | ||
os.environ["FLATPAK_DOWNLOAD_TMPDIR"] = os.path.join(conf.target.system_root, "var/tmp") | ||
# pylint: disable=environment-modify | ||
os.environ["FLATPAK_CONFIG_DIR"] = os.path.join(conf.target.system_root, "etc/flatpak") | ||
# pylint: disable=environment-modify | ||
os.environ["FLATPAK_OS_CONFIG_DIR"] = os.path.join(conf.target.system_root, "usr/share/flatpak") | ||
# pylint: disable=environment-modify | ||
os.environ["FLATPAK_SYSTEM_DIR"] = os.path.join(conf.target.system_root, "var/lib/flatpak") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these safe to be used on a running system?
We support a lot of variety environments and one of it is Live or image installation. It would be bad if we break the user system after installation because we have changed the environment variables of the system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These environment variables changes should only affect anaconda and it's child processes, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last I checked setting os.environ
in a multithreaded process was still generally unsafe, and Anaconda is definitely one. xref https://sourceware.org/bugzilla/show_bug.cgi?id=15607
And lots of other places, e.g. https://internals.rust-lang.org/t/synchronized-ffi-access-to-posix-environment-variable-functions/15475
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last I checked setting
os.environ
in a multithreaded process was still generally unsafe, and Anaconda is definitely one. xref https://sourceware.org/bugzilla/show_bug.cgi?id=15607
The import happens before threads are started. I originally had it very explicitly happening at process initialization: https://github.com/owtaylor/anaconda/blob/flatpak-preinstallation-try-1/pyanaconda/modules/payloads/__main__.py but @jkonecny12 didn't like Flatpak code leaking into a generic place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point. In that case you can move it to payloads/flatpak/flatpak.py
. My point was mainly to move it out of the shared payloads code which is imported to every payload.
:param download_location: path to location for temporary downloads | ||
:param progress: used to report progress of the download | ||
:returns sideload location, including the transport (e.g. oci:<path>), or None | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you want to add pass
or ...
also here even if it is not necessary.
if runtime_ref not in result: | ||
result.append(runtime_ref) | ||
except (NoSectionError, KeyError): | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably at least warn about this with log.warning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so - it's not an error condition, it's just an Runtime (common), or an application that doesn't use a runtime (logically possible, IIRC, but exceedingly rare.)
return result | ||
|
||
def _download_blob(self, downloader, download_location, digest, stream=False): | ||
assert digest.startswith("sha256:") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove this assert too. Instead raise an exception if necessary.
Fixes from comments on rhinstaller#6056
bd2dcdf
to
7df1152
Compare
I pushed a new version that hopefully addresses all your comments. I left the fixes in a fixup patch for ease of re-review - I can squash and repush if they look OK. |
The changeset looks good to me. Feel free to squash this. And thanks for this, it really simplifies review ;) |
Add new functionality for running Flatpak preinstallation after installing the base system. It's initially implemented only for the DNF payload.
Flatpak installation is done as an extra "side" payload that the payloads service calls after the base payload to install additional content. The base payload provides the list of Flatpak refs that should be installed from the side payload and the Flatpaks are installed from a location that is determined from the base payload's primary source - if the source is a local or remote install tree we look for a OCI image layout there, if the base payload is the network (CDN, closest mirror), we install directly from the configured Flatpak remote registry.
flatpak-preinstall(REF)
RPM provides rather than directly examining the /etc/flatpak/preinstall.d of the installed system - this allow us to do accurate pre-installation computation of the required install size.I have not worked on tests yet, to wait on final agreement on the structure of the code.
TODO
Tests pass and cover your changes. You can run tests locally manually, or have them run as part of the PR. A team member will have to enable tests manually after inspecting the PR.
Release notes and docs if the PR adds something major or changes existing behavior.