Support being used as a binfmt_misc handler, try 2 #157

nrabulinski · 2025-03-14T12:32:47Z

Supersedes #54

Original description:

This series allows krun to be used as a binfmt_misc handler: it uses interactive launch mode to make sure stdin/stdout are going where users expect, passes through the current working directory, expands the number of environment variables that are passed through, and daemonizes the vm process, so a process that waits on the first process launched via krun is not prevented from progressing by others.

I took some technical liberties when adapting that patchset to the current codebase.
So far the differences from the original implementation are:

Don't use net_ready lockfile anymore. This did not work on my laptop as the client was always winning the race and taking the lock before the server was even spawned. Instead we just wait for the server to spawn for up to 2.5s (can be tweaked further, though probably should be lowered down(?) once Autoconfigure network without DHCP #64 or similar lands)
I removed completely launch field from the guest configuration as the guest-server will never be executing any command now.
I didn't port over connecting I/O with socat as it's no longer needed now that tty allocation is implemented on main
Fixed the issue of the server exiting if there's been a process running in the microVM for over a year with no other activity

What still needs to be done:

Porting over passing through current directory and rest of the environment
Adding binfmt_misc config files

My own ideas for this patchset (feedback welcome):

Running in shell mode (passing in the argument string to $SHELL instead of running the command directly)
Automatically deciding whether to allocate tty or not based on the caller
Either remove -h/--help or make it so that muvm only treats arguments after -- and/or the first positional argument as arguments for the guest as the current behavior is counter-intuitive (EDIT: Not necessarily needed as it turns out that's already how bpaf works)

This is always what you want if running via binfmt Signed-off-by: Nikodem Rabuliński <[email protected]>

Signed-off-by: Nikodem Rabuliński <[email protected]>

nrabulinski · 2025-03-14T20:52:54Z

I ported over the changes I consider most important for muvm to be usable with binfmt (as well as in general bringing in UX improvements) so this should be ready for review.

I also implemented the aforementioned auto-tty-allocation, with --no-tty being available if, for whatever reason, one does not want a tty allocated.

Before merging I'd still like to resolve warnings (at least once that this PR introduced, but ideally all of them) and add the aforementioned binfmt_misc config, but I don't think I'll be introducing any more features or big changes.

EDIT: I'd also like to people to actually try this branch in the wild as my usecase is very specific, on top of already me not being on fedora asahi in the first place 😅

EDIT 2: Also, when porting over the env inheritance, I decided to not go the route of two separate functions, and instead to add inherit_env argument to prepare_env_vars. This means we'll always overwrite the firefox profile or other variables we want to have control over, while allowing users to go the route of not inheriting the environment with --no-inherit-env or they can overwrite what we're doing with the --env argument

WhatAmISupposedToPutHere · 2025-03-14T21:33:45Z

If i understood it correctly, switching to allocating a tty by default will break the steam launcher script.
How are you handling the scenario of a command being launched just as the linger timer expires and the server starts shutting down?

nrabulinski · 2025-03-14T21:44:13Z

Will it? I specifically check if stdout is a tty and otherwise don't allow allocating one since on main if I try running muvm with -t in a script it errors anyway. But I'm down to change that behavior if it does break things
No worse than currently if you tried to run muvm just as the last command exited and the server started quitting. In my case though the wait time for an error will probably be a bit longer though since, as mentioned, I'm trying to connect for 2.5s if the server lock file exists

WhatAmISupposedToPutHere · 2025-03-14T21:53:25Z

For shutdown case - would doing the following on shutdown work better?

Delete lock file from inside the vm.
Drain all remaining connections, and send them the "go away and retry" command.
Shut down.

nrabulinski · 2025-03-14T21:54:21Z

Thinking about 2 some more, I could probably handle it a bit better if I made all_exited an atomic and modified it from the worker instead of the current approach? Maybe?

I'm afraid there always be some kind of a race, the question is what failing condition we're willing to tolerate, and realistically speaking I don't think what my PR adds in day-to-day use is worse than what we currently have.
I may be underestimating how easy the race is to win though and it's not my call to make at the end of the day

nrabulinski · 2025-03-14T22:01:39Z

For shutdown case - would doing the following on shutdown work better?

Delete lock file from inside the vm.

Drain all remaining connections, and send them the "go away and retry" command.

Shut down.

1 Could be tricky because the VM itself doesn't know about the lock file. Do you know if it's possible to somehow make the outside libkrun process delete the lock file? Otherwise I think it maybe should be possible to return a response "sorry, not accepting any more connections" and make the new processes wait for the lock file, and whoever wins becomes the new VM parent process.

Other than that it sounds pretty good, albeit it would make the control flow more complicated. I can prototype that and then you can decide if we should go ahead with that approach?

Launches the initial command via the server too. Server is set to linger for 10 seconds after last command finishes. Co-authored-by: Nikodem Rabuliński <[email protected]> Signed-off-by: Nikodem Rabuliński <[email protected]>

Change the approach to environment variables to delete ones that are likely to cause problems, instead of only passing through a pre-approved list. Co-authored-by: Sasha Finkelstein <[email protected]> Signed-off-by: Nikodem Rabuliński <[email protected]>

Instead of requiring a user to ask for tty allocation, which usually is what you want anyway, default to allocating one if stdin is a tty unless the user specifically doesn't want to. Signed-off-by: Nikodem Rabuliński <[email protected]>

Aside from just being better UX, some commands are sensitive to it and will not work correctly without it. Co-authored-by: Nikodem Rabuliński <[email protected]> Signed-off-by: Nikodem Rabuliński <[email protected]>

WhatAmISupposedToPutHere and others added 2 commits March 14, 2025 13:34

Always use interactive mode.

60e4ffd

This is always what you want if running via binfmt Signed-off-by: Nikodem Rabuliński <[email protected]>

Update usage in README

c72c9aa

Signed-off-by: Nikodem Rabuliński <[email protected]>

nrabulinski force-pushed the binfmt-reborn branch from e229028 to 688aa30 Compare March 14, 2025 12:34

nrabulinski marked this pull request as ready for review March 14, 2025 20:44

nrabulinski requested review from slp, teohhanhui, alyssarosenzweig, asahilina and WhatAmISupposedToPutHere as code owners March 14, 2025 20:44

WhatAmISupposedToPutHere and others added 4 commits March 15, 2025 11:12

Daemonize the vm process

2ff7d90

Launches the initial command via the server too. Server is set to linger for 10 seconds after last command finishes. Co-authored-by: Nikodem Rabuliński <[email protected]> Signed-off-by: Nikodem Rabuliński <[email protected]>

Default to allocating a tty

af98f63

Instead of requiring a user to ask for tty allocation, which usually is what you want anyway, default to allocating one if stdin is a tty unless the user specifically doesn't want to. Signed-off-by: Nikodem Rabuliński <[email protected]>

Pass through the current working directory

1edd91d

Aside from just being better UX, some commands are sensitive to it and will not work correctly without it. Co-authored-by: Nikodem Rabuliński <[email protected]> Signed-off-by: Nikodem Rabuliński <[email protected]>

nrabulinski force-pushed the binfmt-reborn branch from 94d9067 to 1edd91d Compare March 15, 2025 10:13

This was referenced Mar 16, 2025

Add init-command option #158

Open

Allow disabling x86 emulators and/or automatically disable them based on host arch #159

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support being used as a binfmt_misc handler, try 2 #157

Support being used as a binfmt_misc handler, try 2 #157

nrabulinski commented Mar 14, 2025 •

edited

Loading

nrabulinski commented Mar 14, 2025 •

edited

Loading

WhatAmISupposedToPutHere commented Mar 14, 2025

nrabulinski commented Mar 14, 2025 •

edited

Loading

WhatAmISupposedToPutHere commented Mar 14, 2025

nrabulinski commented Mar 14, 2025

nrabulinski commented Mar 14, 2025

Support being used as a binfmt_misc handler, try 2 #157

Are you sure you want to change the base?

Support being used as a binfmt_misc handler, try 2 #157

Conversation

nrabulinski commented Mar 14, 2025 • edited Loading

nrabulinski commented Mar 14, 2025 • edited Loading

WhatAmISupposedToPutHere commented Mar 14, 2025

nrabulinski commented Mar 14, 2025 • edited Loading

WhatAmISupposedToPutHere commented Mar 14, 2025

nrabulinski commented Mar 14, 2025

nrabulinski commented Mar 14, 2025

nrabulinski commented Mar 14, 2025 •

edited

Loading

nrabulinski commented Mar 14, 2025 •

edited

Loading

nrabulinski commented Mar 14, 2025 •

edited

Loading