-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tools: Use DynamicUser
for cockpit.service
#20425
Conversation
@travier FYI |
LGTM then (with my limited knowledge of the code base) :) |
src/systemd/cockpit.service.in
Outdated
@@ -3,14 +3,16 @@ Description=Cockpit Web Service | |||
Documentation=man:cockpit-ws(8) | |||
Requires=cockpit.socket | |||
Requires=cockpit-wsinstance-http.socket cockpit-wsinstance-https-factory.socket | |||
After=cockpit-wsinstance-http.socket cockpit-wsinstance-https-factory.socket | |||
# we need to start before the sockets so that the dynamic user exists | |||
Before=cockpit-wsinstance-http.socket cockpit-wsinstance-https-factory.socket |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like there's a race here...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specifically: if cockpit.service
manages to come online (ie: by socket activation the first time someone connects) and receives the first few bytes of data, noticing that it's not a TLS connection, it will immediately attempt to contact the http
socket, which according to the dependency information here is not necessarily there yet.
I think that this race is probably not going to be lost often (since spawning the process takes time, and systemd is probably directly on its way to create those listeners immediate after the process is forked off) but it's definitely a possibility...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I dealt with this in #16811 is by having a separate cockpit-wsinstance-socket-owner.service
which was the thing that created the dynamic user. Everything depended on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for spotting! I agree, this is racy. I adopted your approach with the intermediate unit. Tested with static and with dynamic user, works nicely!
Aww crap -- I broke this independently in cockpit-project/bots#6348 - fixed in PR #20440 Update: rebased |
Not necessary any more since commit 644116a.
This is an unrelated top flake. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice cleanup, thanks. I'm happy to approve it in its current form.
I'd be happier if there was a test for what happens when we explicitly systemctl stop cockpit-ws-user.service
in various states of "running" cockpit-ws
.
If you want to first investigate that this works fine, in theory, we could punt that addition to a follow-up?
systemd does not do this by default as for "front-end" services the .service unit may continue to run when stopping the .socket. But this isn't the case for our cockpit-wsinstance* sockets -- they go down together with cockpit.service. This keeps /run/cockpit/wsinstance/ clean while the sockets aren't running. Purely cosmetical, but nicer.
There are no autoconf substitutions in them. This makes it a bit easier to compare them against running systemd and try changes.
It fits better at the top after Description than in between the dependency relations. Some units already have it like that.
Since commit 644116a, the webserver certificates don't have to be owned by the cockpit-ws user/group any more. This allows us to use `DynamicUser` for cockpit.service, which eliminates the persistent `cockpit-ws` system user. The ordering matters a lot here to avoid race conditions: The wsinstance socket units need to start before cockpit.service (so the the first incoming request through cockpit.socket can get forwarded to an instance), but after the dynamic user got created. So we can't have cockpit.service create that dynamic user by itself. Instead, create a separate `cockpit-ws-user.service` helper unit that orders itself in between. Thanks to Allison Karlitskaya for the idea! Note that we can't yet eliminate `cockpit-wsinstance` as that's the owner of our `cockpit-session` suid root binary.
@allisonkarlitskaya Verified, I added an integration test which kills the user service helper, and also strengthened the general cleanup check. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the tests :)
Since commit 644116a, the webserver certificates don't have to
be owned by the cockpit-ws user/group any more. This allows us to use
DynamicUser
for cockpit.service, which eliminates the persistentcockpit-ws
system user.Note that we can't yet eliminate
cockpit-wsinstance
as that's theowner of our
cockpit-session
suid root binary.Spawned from #20365. This will eliminate at least half of the problem. PR #16811 was an attempt to do this for wsinstance, but this is much harder. #20365 is based on top of that to use sysusers.d for cockpit-wsinstance.
I tested this in Fedora both with upgrades from earlier versions and with a fresh install. If the user already exists statically, it will just be used. Otherwise, the service starts up fine now.
We actually cover both cases in our CI: Cockpit's VMs have cockpit-ws pre-installed and thus test "existing static user", while packit installs the package from scratch and tests the dynamic user case.