Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support being used as a binfmt_misc handler, try 2 #157

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

nrabulinski
Copy link

@nrabulinski nrabulinski commented Mar 14, 2025

Supersedes #54

Original description:

This series allows krun to be used as a binfmt_misc handler: it uses interactive launch mode to make sure stdin/stdout are going where users expect, passes through the current working directory, expands the number of environment variables that are passed through, and daemonizes the vm process, so a process that waits on the first process launched via krun is not prevented from progressing by others.

I took some technical liberties when adapting that patchset to the current codebase.
So far the differences from the original implementation are:

  • Don't use net_ready lockfile anymore. This did not work on my laptop as the client was always winning the race and taking the lock before the server was even spawned. Instead we just wait for the server to spawn for up to 2.5s (can be tweaked further, though probably should be lowered down(?) once Autoconfigure network without DHCP #64 or similar lands)
  • I removed completely launch field from the guest configuration as the guest-server will never be executing any command now.
  • I didn't port over connecting I/O with socat as it's no longer needed now that tty allocation is implemented on main
  • Fixed the issue of the server exiting if there's been a process running in the microVM for over a year with no other activity

What still needs to be done:

  • Porting over passing through current directory and rest of the environment
  • Adding binfmt_misc config files

My own ideas for this patchset (feedback welcome):

  • Running in shell mode (passing in the argument string to $SHELL instead of running the command directly)
  • Automatically deciding whether to allocate tty or not based on the caller
  • Either remove -h/--help or make it so that muvm only treats arguments after -- and/or the first positional argument as arguments for the guest as the current behavior is counter-intuitive (EDIT: Not necessarily needed as it turns out that's already how bpaf works)

This is always what you want if running via binfmt

Signed-off-by: Nikodem Rabuliński <[email protected]>
Signed-off-by: Nikodem Rabuliński <[email protected]>
@nrabulinski
Copy link
Author

nrabulinski commented Mar 14, 2025

I ported over the changes I consider most important for muvm to be usable with binfmt (as well as in general bringing in UX improvements) so this should be ready for review.

I also implemented the aforementioned auto-tty-allocation, with --no-tty being available if, for whatever reason, one does not want a tty allocated.

Before merging I'd still like to resolve warnings (at least once that this PR introduced, but ideally all of them) and add the aforementioned binfmt_misc config, but I don't think I'll be introducing any more features or big changes.

EDIT: I'd also like to people to actually try this branch in the wild as my usecase is very specific, on top of already me not being on fedora asahi in the first place 😅

EDIT 2: Also, when porting over the env inheritance, I decided to not go the route of two separate functions, and instead to add inherit_env argument to prepare_env_vars. This means we'll always overwrite the firefox profile or other variables we want to have control over, while allowing users to go the route of not inheriting the environment with --no-inherit-env or they can overwrite what we're doing with the --env argument

@WhatAmISupposedToPutHere
Copy link
Collaborator

  1. If i understood it correctly, switching to allocating a tty by default will break the steam launcher script.
  2. How are you handling the scenario of a command being launched just as the linger timer expires and the server starts shutting down?

@nrabulinski
Copy link
Author

nrabulinski commented Mar 14, 2025

  1. Will it? I specifically check if stdout is a tty and otherwise don't allow allocating one since on main if I try running muvm with -t in a script it errors anyway. But I'm down to change that behavior if it does break things
  2. No worse than currently if you tried to run muvm just as the last command exited and the server started quitting. In my case though the wait time for an error will probably be a bit longer though since, as mentioned, I'm trying to connect for 2.5s if the server lock file exists

@WhatAmISupposedToPutHere
Copy link
Collaborator

For shutdown case - would doing the following on shutdown work better?

  1. Delete lock file from inside the vm.
  2. Drain all remaining connections, and send them the "go away and retry" command.
  3. Shut down.

@nrabulinski
Copy link
Author

Thinking about 2 some more, I could probably handle it a bit better if I made all_exited an atomic and modified it from the worker instead of the current approach? Maybe?

I'm afraid there always be some kind of a race, the question is what failing condition we're willing to tolerate, and realistically speaking I don't think what my PR adds in day-to-day use is worse than what we currently have.
I may be underestimating how easy the race is to win though and it's not my call to make at the end of the day

@nrabulinski
Copy link
Author

For shutdown case - would doing the following on shutdown work better?

  1. Delete lock file from inside the vm.

  2. Drain all remaining connections, and send them the "go away and retry" command.

  3. Shut down.

1 Could be tricky because the VM itself doesn't know about the lock file. Do you know if it's possible to somehow make the outside libkrun process delete the lock file? Otherwise I think it maybe should be possible to return a response "sorry, not accepting any more connections" and make the new processes wait for the lock file, and whoever wins becomes the new VM parent process.

Other than that it sounds pretty good, albeit it would make the control flow more complicated. I can prototype that and then you can decide if we should go ahead with that approach?

Launches the initial command via the server too.
Server is set to linger for 10 seconds after last command finishes.

Co-authored-by: Nikodem Rabuliński <[email protected]>
Signed-off-by: Nikodem Rabuliński <[email protected]>
Change the approach to environment variables to
delete ones that are likely to cause problems,
instead of only passing through a pre-approved list.

Co-authored-by: Sasha Finkelstein <[email protected]>
Signed-off-by: Nikodem Rabuliński <[email protected]>
Instead of requiring a user to ask for tty allocation,
which usually is what you want anyway, default to allocating one
if stdin is a tty unless the user specifically doesn't want to.

Signed-off-by: Nikodem Rabuliński <[email protected]>
Aside from just being better UX, some commands are sensitive to it
and will not work correctly without it.

Co-authored-by: Nikodem Rabuliński <[email protected]>
Signed-off-by: Nikodem Rabuliński <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants