use DynamicUser=yes for all cockpit components #16811

allisonkarlitskaya · 2022-01-11T08:00:34Z

Switch over to systemd allocating user IDs for us dynamically, dropping our static users.

We create these users dynamically:

cockpit-session-socket is the group owner of /run/cockpit/session
cockpit-wsinstance-socket is the group owner of all sockets in /run/cockpit/wsinstance/
cockpit-tls gets its own user
cockpit-wsinstance-http{,s@}.service each get a user. In this case, I think the separate https instances even each get a separate user.

Right now it's not working. There seem to be quite a few edge cases, particularly involving the use of `DynamicUser=` in combination with `SupplementaryGroups=`. Also, the approach of using virtual services to create the dynamic user is really strange.

I feel like we maybe shouldn't proceed here until we get better support for this from systemd.

Launch cockpit-session via socket activation on /run/cockpit/session #16808
Clean up obsolete cockpit-wsinstance user in spec and debian packaging on upgrade

martinpitt · 2022-04-13T11:21:42Z

I filed systemd/systemd#23067 to start the discussion.

martinpitt · 2024-05-10T10:03:07Z

Rebased after the rebasing of #16808 and the recent heavy changes to sysuser management. Locally this at least works for a simple local session, let's see if there's any fallout beyond what's already in #16808.

> cockpit-wsinstance-http{,s@}.service each get a user.

I didn't port this bit yet, let's do that step separately. We have much more freedom to change this once everything is a DynamicUser.

martinpitt · 2024-05-10T11:52:21Z

Cool -- the test failures are by and large the same as in #16808. Just the unit lifecycle test fails in addition, but that'll be straightforward to fix -- and indeed we should adjust this some more to check proper cleanup after stopping.

cgwalters · 2024-11-05T15:26:00Z

xref containers/bootc#870 too here

Stop failing with "didn't receive expected authorize message", as that's confusing -- the user did nothing wrong. Instead, silently exit successfully, and let cockpit-ws handle the timeout. This mostly occurs when Negotiate (Kerberos) authentication is enabled, as cockpit-ws then always starts *two* cockpit-session attempts (one for Negotiate, one for PAM), like in TestIPA.testQualifiedUsers.

This makes it easier for callers to treat the reply as a textual string. The only remaining user is cockpit-session, where we will need this behaviour in the following commit.

We want to parse more fields than just "response" from the authorize message. So remove the very rigidly hardcoded parsing in `read_authorize_response()` and let it return the whole JSON string instead. Add a new `get_authorize_key()` function that parses a single value from that, and adjust the callers accordingly. This is a very restricted parser to keep things simple: No spaces in the structure, and no escaping. We can assume all that as cockpit-ws sends very controlled messages, and '"' isn't used in base64 values. In the worst case we get a truncated value, which will just fail authentication. Localize some variables while we are at it.
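To illustrate what such a restricted parser could look like, here is a minimal sketch. It is not the code from this commit: the function name `sketch_get_authorize_key()` and the 64-byte pattern buffer are assumptions made for the example. It relies on the same constraints the commit message states (no spaces in the structure, no escaping), and an embedded `"` simply truncates the value.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch, not the actual cockpit implementation.
 * Returns a newly allocated copy of the string value stored under `key`
 * in a compact JSON object, or NULL if the key is not found.
 * Assumes no whitespace around ':' and no escaped quotes in the value. */
char *
sketch_get_authorize_key (const char *json, const char *key)
{
  char pattern[64];
  const char *start, *end;
  char *value;
  size_t len;
  int n;

  assert (json != NULL);
  assert (key != NULL);

  /* Build the literal prefix "key":" that must precede the value */
  n = snprintf (pattern, sizeof pattern, "\"%s\":\"", key);
  if (n < 0 || (size_t) n >= sizeof pattern)
    return NULL;

  start = strstr (json, pattern);
  if (start == NULL)
    return NULL;
  start += n;

  /* The value runs up to the next double quote; there is no unescaping */
  end = strchr (start, '"');
  if (end == NULL)
    return NULL;

  len = (size_t) (end - start);
  value = malloc (len + 1);
  if (value == NULL)
    return NULL;
  memcpy (value, start, len);
  value[len] = '\0';
  return value;
}
```

The caller owns the returned string and has to `free()` it. A plain substring search like this can also match the pattern inside another value, which is only tolerable because the sender is trusted to produce tightly controlled messages.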

Drop the duplicated (but not identical!) implementation of `read_authorize_response()` and use the real implementation.

Run socket-activated cockpit-session with correct context. By default it is `init_t`, but that will produce a memfd with the wrong permissions, which the session cannot read. Allow cockpit-ws to connect to /run/cockpit/session. Allow restricted user_t and sysadm_t sessions to communicate (but not connect) to cockpit-ws through the session unix socket. (Covered by TestLogin.testSELinuxRestrictedUser)

…codes cockpit-session exits with 5 on authentication failures, including `authentication-unavailable`; or with 127 on authentication timeouts. These aren't a reason to mark the unit as failed.

The point of this is really to determine the cgroup of our caller, i.e. the cockpit-ws process. With [email protected] this is different than cockpit-session's own cgroup. Rename the variables to clarify this.

If cockpit-ws directly spawns cockpit-session, it can process the 127 exit code by itself. But there is no exit code if that happens via unix socket. Check this condition explicitly and report `no-cockpit` via the protocol. This triggers a more specific error path in cockpit-ws and the login page, adjust TestConnection.testBasic accordingly.

The test idles on the login page for more than the 60s authorize timeout. When running cockpit-session via unix socket, this causes some unsightly "session timed out during authentication" messages. This ought to be handled better in cockpit-ws, but for the time being ignore these messages.

The various perform_*() functions all assume a non-NULL rhost, as several functions such as `btmp_log()` do an unchecked strncpy() on it. Add assertions to (1) make them fail more gracefully and usefully, and (2) document that requirement more explicitly. This is currently guaranteed by having an explicit fallback to `""` if `$COCKPIT_REMOTE_PEER` isn't set.
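A small sketch of the pattern described here, under stated assumptions: `sketch_get_rhost()` and `sketch_record_host()` are hypothetical names standing in for the environment lookup and for a helper that copies the host into a fixed-size record field. They are not the functions touched by this commit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: never returns NULL, falls back to "" when
 * $COCKPIT_REMOTE_PEER is not set. */
const char *
sketch_get_rhost (void)
{
  const char *rhost = getenv ("COCKPIT_REMOTE_PEER");
  return rhost != NULL ? rhost : "";
}

/* Hypothetical stand-in for a logging helper that copies rhost into a
 * fixed-size field. The assertion documents the non-NULL requirement and
 * turns a NULL pointer into an immediate, readable failure. */
void
sketch_record_host (char *field, size_t field_len, const char *rhost)
{
  assert (field != NULL && field_len > 0);
  assert (rhost != NULL);

  /* Truncate if necessary and always NUL-terminate */
  strncpy (field, rhost, field_len - 1);
  field[field_len - 1] = '\0';
}
```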

cockpit_session_launch() doesn't set this env if the remote host is unknown. That is never the case in practice at least in our tests, but callers should still be aware of it.

Passing the remote peer from ws to cockpit-session via the `$COCKPIT_REMOTE_PEER` environment variable does not work in unix socket mode. So make that part of the protocol instead and attach it to the authorize response.

When cockpit-session's stdin is a Unix socket, it is being spawned by cockpit-ws through [email protected]. In that case it doesn't make sense to look at its own cgroup, but we need to check the cgroup of the socket peer (i.e. cockpit-ws).

We must guard against PID recycling attacks:

1. Eve logs into cockpit, gets ws pid E, and hacks ws: connect to session, forks, keeps the session fd in a different process, and kills pid E.
2. Eve waits until Alice logs in again and happens to get ws pid E (which can happen with a sufficient number of forks, social engineering, and some luck).
3. cockpit-session checks that pid E is in cgroup /cockpit/alice, and starts an alice session for Eve's ws.

(Note: SO_PEERCRED gives you pid/uid/gid at the time connect() was made.)

Thus require that the peer (ws) must have started earlier than cockpit-session. This is the same approach that polkit uses as a fallback if pidfds are not available: https://github.com/polkit-org/polkit/blob/main/src/polkit/polkitunixprocess.c

Note that pidfds don't help us: There is no API to directly get from a pidfd to a cgroup, startup time, or /proc/<pid> dirfd; this has to happen via `pidfd_getpid()` and opening /proc/pid. But that's exactly what we want to avoid, and thus is pointless (they are also only available since kernel 6.5).
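The start-time comparison can be sketched as follows. This is an illustration only, not the code from this commit: all `sketch_*` names are hypothetical, the peer pid is assumed to have been obtained already (for example via SO_PEERCRED), and the strict "earlier than" comparison is an assumption about how ties should be treated. The start time is field 22 (`starttime`) of `/proc/<pid>/stat`, as documented in proc(5).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Extract starttime (field 22, clock ticks since boot) from the contents
 * of /proc/<pid>/stat. Returns 0 on parse failure. */
unsigned long long
sketch_parse_start_time (const char *stat_line)
{
  const char *p;
  char *end;
  unsigned long long value;
  int field;

  assert (stat_line != NULL);

  /* comm (field 2) may contain spaces and parentheses: skip to the last ')' */
  p = strrchr (stat_line, ')');
  if (p == NULL || p[1] != ' ')
    return 0;
  p++;

  /* p points at the space before field 3; advance to the space before field 22 */
  for (field = 3; field < 22; field++)
    {
      p = strchr (p + 1, ' ');
      if (p == NULL)
        return 0;
    }

  value = strtoull (p + 1, &end, 10);
  if (end == p + 1)
    return 0;
  return value;
}

/* Read and parse /proc/<pid>/stat. Returns 0 if the process is gone or
 * the file cannot be parsed. */
unsigned long long
sketch_read_start_time (pid_t pid)
{
  char path[64];
  char buf[1024];
  FILE *f;
  size_t n;

  snprintf (path, sizeof path, "/proc/%ld/stat", (long) pid);
  f = fopen (path, "r");
  if (f == NULL)
    return 0;
  n = fread (buf, 1, sizeof buf - 1, f);
  fclose (f);
  buf[n] = '\0';
  return sketch_parse_start_time (buf);
}

/* Accept the peer only if both start times are known and the peer started
 * strictly earlier than we did (assumption: equal ticks are rejected). */
int
sketch_started_before (unsigned long long peer_start,
                       unsigned long long self_start)
{
  return peer_start != 0 && self_start != 0 && peer_start < self_start;
}
```

A caller would compare `sketch_read_start_time (peer_pid)` against `sketch_read_start_time (getpid ())` and refuse the session if `sketch_started_before()` returns 0. A recycled pid belongs to a process that started after cockpit-session was launched for the original connection, so it fails this check.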

ws times out the authorization attempt after one minute (`cockpit_ws_auth_response_timeout`). Introduce a similar timeout on the session side. This makes PID recycling attacks harder, as their victim now has to log in within one minute.
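One way to bound the wait on the session side is to poll the input descriptor with a timeout before reading the authorize response. The sketch below is an assumption about a possible shape, not the mechanism this commit uses; `sketch_wait_for_authorize()` is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical sketch: wait until fd is readable or the timeout expires.
 * Returns 1 if data (or EOF/hangup) is pending, 0 on timeout, -1 on error.
 * Note: a retry after EINTR restarts the full timeout, which a real
 * implementation may want to account for. */
int
sketch_wait_for_authorize (int fd, int timeout_ms)
{
  struct pollfd pfd = { .fd = fd, .events = POLLIN };
  int r;

  assert (fd >= 0);

  do
    r = poll (&pfd, 1, timeout_ms);
  while (r < 0 && errno == EINTR);

  if (r < 0)
    return -1;
  if (r == 0)
    return 0;
  return 1;
}
```

With a 60 second limit, a call such as `sketch_wait_for_authorize (STDIN_FILENO, 60 * 1000)` returning 0 would be the point at which the session gives up on the authentication attempt.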

Unless it's otherwise specified in the configuration file, we now spawn cockpit-session by connecting to /run/cockpit/session if that exists. Fall back to calling cockpit-session directly for custom setups. We leave the cockpit_ws_session_program variable in place to allow the tests to override things.

Update the unit files for cockpit-ws to ensure that the socket is available when cockpit-ws is running.

Adjust TestConnection.testBasic accordingly: When running cockpit-session via unix socket activation, its group permissions are irrelevant. More thoroughly move the binary away and also disable the socket, to fail both of cockpit-ws' session creation attempts.

Co-Authored-By: Martin Pitt <[email protected]>
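The connect-or-fall-back behaviour described in this commit message could look roughly like the following sketch. It is not the cockpit-ws code: `sketch_connect_session_socket()` is a hypothetical name, and the use of a `SOCK_STREAM` socket is an assumption.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical sketch: try to connect to the session socket at `path`.
 * Returns a connected fd, or -1 if the socket does not exist or refuses
 * the connection, in which case the caller falls back to spawning the
 * session program directly. */
int
sketch_connect_session_socket (const char *path)
{
  struct sockaddr_un addr;
  int fd;

  assert (path != NULL);

  /* The path must fit into sun_path including the terminating NUL */
  if (strlen (path) >= sizeof addr.sun_path)
    return -1;

  fd = socket (AF_UNIX, SOCK_STREAM, 0);
  if (fd < 0)
    return -1;

  memset (&addr, 0, sizeof addr);
  addr.sun_family = AF_UNIX;
  strcpy (addr.sun_path, path);

  if (connect (fd, (struct sockaddr *) &addr, sizeof addr) < 0)
    {
      close (fd);
      return -1;
    }

  return fd;
}
```

A caller would try `sketch_connect_session_socket ("/run/cockpit/session")` first and only spawn the binary itself when that returns -1.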

systemd spawns this for us now, so we don't need the setuid bit anymore. Clean up the statoverride in the Debian packaging on upgrades. However, that means that cockpit-ws cannot be run as `cockpit-wsinstance` user outside of the unit any more. Adjust our tests to run it as root instead.

Unfortunately, socket units can't set DynamicUser=yes, so add a dependency on a separate .service created just for this purpose. /run/cockpit/session is now owned by group cockpit-session-socket which also gets added as a supplementary group to the cockpit-ws units.

martinpitt · 2024-11-15T12:27:11Z

I want to see this all in action, the "look ma! no static users!" hands-free operation. I rebased/unconflicted this on top of current #16808, and fixed it up a little bit.

Similar to the last commit, we create a dynamic group for the sockets in /run/cockpit/wsinstance and add a supplementary group to cockpit-tls.

We only dynamically generate a single .socket file now. The rest of them are in version control and installed verbatim.

Co-Authored-By: Martin Pitt <[email protected]>

allisonkarlitskaya added the "blocked" label (Don't land until something else happens first, see task list) Jan 11, 2022

allisonkarlitskaya requested a review from martinpitt January 11, 2022 08:00

allisonkarlitskaya temporarily deployed to cockpit-dist January 11, 2022 08:06 Inactive

martinpitt removed their request for review April 7, 2022 09:02

martinpitt mentioned this pull request Apr 13, 2022

DynamicUser= support for AF_UNIX socket units for ownership of the inode systemd/systemd#23067

Open

KKoukiou added the review-2022-12 label Dec 14, 2022

This was referenced Jan 3, 2023

ws: please provide sysusers.d entries for system users/groups #15027

Open

systemd: Use sysusers.d to create ws users #18112

Closed

martinpitt removed the review-2022-12 label Feb 15, 2023

martinpitt mentioned this pull request May 3, 2024

tools: Use DynamicUser for cockpit.service #20425

Merged

martinpitt force-pushed the dynamic-users branch from c01589f to 8460b13 Compare May 10, 2024 10:02

martinpitt added the "no-test" label (For doc/workflow changes, or experiments which don't need a full CI run) May 10, 2024

martinpitt force-pushed the dynamic-users branch from 8460b13 to 4752cca Compare May 10, 2024 10:03

martinpitt mentioned this pull request Nov 1, 2024

/usr/libexec/cockpit-session has wrong owner in deployment #21201

Open

cgwalters mentioned this pull request Nov 5, 2024

/usr/libexec/cockpit-session has wrong owner in deployment containers/bootc#870

Closed

martinpitt added 11 commits November 15, 2024 12:56

common: NUL-terminate cockpit_frame_read() result

578ab40

This makes it easier for callers to treat the reply as a textual string. The only remaining user is cockpit-session, where we will need this behaviour in the following commit.

ws: Re-use session-utils.c in mock-auth-command.c

1506287

Drop the duplicated (but not identical!) implementation of `read_authorize_response()` and use the real implementation.

systemd: Allow [email protected] to fail with some known exit …

44323f0

…codes cockpit-session exits with 5 on authentication failures, including `authentication-unavailable`; or with 127 on authentication timeouts. These aren't a reason to mark the unit as failed.

session: Rename my_cgroup* to ws_cgroup*

ab007ff

The point of this is really to determine the cgroup of our caller, i.e. the cockpit-ws process. With [email protected] this is different than cockpit-session's own cgroup. Rename the variables to clarify this.

doc: Clarify that $COCKPIT_REMOTE_PEER is optional

e167fa1

cockpit_session_launch() doesn't set this env if the remote host is unknown. That is never the case in practice at least in our tests, but callers should still be aware of it.

martinpitt and others added 6 commits November 15, 2024 12:56

ws, session: Pass remote peer address in authorize message

3fdfcdc

Passing the remote peer from ws to cockpit-session via the `$COCKPIT_REMOTE_PEER` environment variable does not work in unix socket mode. So make that part of the protocol instead and attach it to the authorize response.

session: Add authorize response timeout

04e4125

ws times out the authorization attempt after one minute (`cockpit_ws_auth_response_timeout`). Introduce a similar timeout on the session side. This makes PID recycling attacks harder, as their victim now has to log in within one minute.

tools: Move cockpit-session.socket to cockpit-ws package

f7da0f0

martinpitt mentioned this pull request Nov 15, 2024

Launch cockpit-session via socket activation on /run/cockpit/session #16808

Draft

12 tasks

martinpitt force-pushed the dynamic-users branch from 4752cca to e0fcb87 Compare November 15, 2024 12:26

allisonkarlitskaya and others added 3 commits November 15, 2024 13:35

systemd: dynamic group for wsinstance sockets

12b353e

Similar to the last commit, we create a dynamic group for the sockets in /run/cockpit/wsinstance and add a supplementary group to cockpit-tls.

.gitignore: only ignore cockpit.socket

ccfda34

We only dynamically generate a single .socket file now. The rest of them are in version control and installed verbatim.

systemd, tools: stop creating static cockpit-wsinstance user

cd54812

Co-Authored-By: Martin Pitt <[email protected]>

martinpitt force-pushed the dynamic-users branch from e0fcb87 to cd54812 Compare November 15, 2024 12:35
