Run in a new process group when not using --foreground #2574

Merged 1 commit on Sep 1, 2024

@nirs nirs commented Aug 31, 2024

Previously the hostagent process was running in the foreground even when not using the --foreground option. It was using the same pgid as the limactl process. If the guest was running but limactl was interrupted, the hostagent received the signal and was killed, stopping the VM.

A simple way to fix this issue is to start the hostagent process in a new process group. This way it will not be killed when signals are sent to the limactl process group.

Example run with this change:

% ps -o pid,pgid,ppid,command
  PID  PGID  PPID COMMAND
39442 39442 39440 -zsh
63233 63233 39442 _output/bin/limactl start --vm-type vz --tty=false
63299 63299 63233 /Users/nsoffer/src/lima/_output/bin/limactl hostagent ...

We can improve this later by adding an option to daemonize the hostagent process (like qemu --daemonize).

Fixes #2573

nirs commented Aug 31, 2024

Example run (vz):

% ps -o pid,pgid,ppid,command
  PID  PGID  PPID COMMAND
50928 50928 74198 _output/bin/limactl start --vm-type vz --tty=false
50974 50974 50928 /Users/nsoffer/src/lima/_output/bin/limactl hostagent --pidfile /Users/nsoffer/.lima/default/ha.pid --socket /Users/nsoffer/.lima/default/ha.sock
74198 74198 74189 -zsh
74202 74202 74194 -zsh
% ps -o pid,pgid,ppid,command
  PID  PGID  PPID COMMAND
50928 50928 74198 _output/bin/limactl start --vm-type vz --tty=false
50974 50974 50928 /Users/nsoffer/src/lima/_output/bin/limactl hostagent --pidfile /Users/nsoffer/.lima/default/ha.pid --socket /Users/nsoffer/.lima/default/ha.sock
50993 50974 50974 ssh -F /dev/null -o IdentityFile="/Users/nsoffer/.lima/_config/user" -o IdentityFile="/Users/nsoffer/.ssh/id_ed25519" -o StrictHostKeyChecking=no
74198 74198 74189 -zsh
74202 74202 74194 -zsh
% ps -o pid,pgid,ppid,command
  PID  PGID  PPID COMMAND
50974 50974     1 /Users/nsoffer/src/lima/_output/bin/limactl hostagent --pidfile /Users/nsoffer/.lima/default/ha.pid --socket /Users/nsoffer/.lima/default/ha.sock
74198 74198 74189 -zsh
74202 74202 74194 -zsh

nirs commented Aug 31, 2024

Example run (qemu):

% ps -o pid,pgid,ppid,command
  PID  PGID  PPID COMMAND
51707 51707 74198 _output/bin/limactl start --vm-type qemu --tty=false
51713 51713 51707 /Users/nsoffer/src/lima/_output/bin/limactl hostagent --pidfile /Users/nsoffer/.lima/default/ha.pid --socket /Users/nsoffer/.lima/default/ha.soc
51759 51713 51713 /opt/homebrew/bin/qemu-system-aarch64 -m 4096 -cpu host -machine virt,accel=hvf -smp 4,sockets=1,cores=4,threads=1 -drive if=pflash,format=raw,r
74198 74198 74189 -zsh
74202 74202 74194 -zsh
% ps -o pid,pgid,ppid,command
  PID  PGID  PPID COMMAND
51707 51707 74198 _output/bin/limactl start --vm-type qemu --tty=false
51713 51713 51707 /Users/nsoffer/src/lima/_output/bin/limactl hostagent --pidfile /Users/nsoffer/.lima/default/ha.pid --socket /Users/nsoffer/.lima/default/ha.soc
51759 51713 51713 /opt/homebrew/bin/qemu-system-aarch64 -m 4096 -cpu host -machine virt,accel=hvf -smp 4,sockets=1,cores=4,threads=1 -drive if=pflash,format=raw,r
51777 51713 51713 ssh -F /dev/null -o IdentityFile="/Users/nsoffer/.lima/_config/user" -o IdentityFile="/Users/nsoffer/.ssh/id_ed25519" -o StrictHostKeyChecking=n
74198 74198 74189 -zsh
74202 74202 74194 -zsh
% ps -o pid,pgid,ppid,command
  PID  PGID  PPID COMMAND
51707 51707 74198 _output/bin/limactl start --vm-type qemu --tty=false
51713 51713 51707 /Users/nsoffer/src/lima/_output/bin/limactl hostagent --pidfile /Users/nsoffer/.lima/default/ha.pid --socket /Users/nsoffer/.lima/default/ha.soc
51759 51713 51713 /opt/homebrew/bin/qemu-system-aarch64 -m 4096 -cpu host -machine virt,accel=hvf -smp 4,sockets=1,cores=4,threads=1 -drive if=pflash,format=raw,r
51825 51713 51713 ssh -F /dev/null -o IdentityFile="/Users/nsoffer/.lima/_config/user" -o IdentityFile="/Users/nsoffer/.ssh/id_ed25519" -o StrictHostKeyChecking=n
51826 51713 51713 /usr/libexec/sftp-server -e -d /Users/nsoffer -R
51829 51713 51713 ssh -F /dev/null -o IdentityFile="/Users/nsoffer/.lima/_config/user" -o IdentityFile="/Users/nsoffer/.ssh/id_ed25519" -o StrictHostKeyChecking=n
51830 51713 51713 /usr/libexec/sftp-server -e -d /tmp/lima
51874 51713 51713 ssh -F /dev/null -o IdentityFile="/Users/nsoffer/.lima/_config/user" -o IdentityFile="/Users/nsoffer/.ssh/id_ed25519" -o StrictHostKeyChecking=n
74198 74198 74189 -zsh
74202 74202 74194 -zsh
% ps -o pid,pgid,ppid,command
  PID  PGID  PPID COMMAND
51713 51713     1 /Users/nsoffer/src/lima/_output/bin/limactl hostagent --pidfile /Users/nsoffer/.lima/default/ha.pid --socket /Users/nsoffer/.lima/default/ha.soc
51759 51713 51713 /opt/homebrew/bin/qemu-system-aarch64 -m 4096 -cpu host -machine virt,accel=hvf -smp 4,sockets=1,cores=4,threads=1 -drive if=pflash,format=raw,r
51825 51713 51713 ssh -F /dev/null -o IdentityFile="/Users/nsoffer/.lima/_config/user" -o IdentityFile="/Users/nsoffer/.ssh/id_ed25519" -o StrictHostKeyChecking=n
51826 51713 51713 /usr/libexec/sftp-server -e -d /Users/nsoffer -R
51829 51713 51713 ssh -F /dev/null -o IdentityFile="/Users/nsoffer/.lima/_config/user" -o IdentityFile="/Users/nsoffer/.ssh/id_ed25519" -o StrictHostKeyChecking=n
51830 51713 51713 /usr/libexec/sftp-server -e -d /tmp/lima
74198 74198 74189 -zsh
74202 74202 74194 -zsh

Previously the hostagent process was running in the foreground even
when not using the --foreground option. It was using the same pgid as
the limactl process. If the guest was running but limactl was
interrupted, the hostagent received the signal and was killed, stopping
the VM.

A simple way to fix this issue is to start the hostagent process in a
new process group. This way it will not be killed when signals are sent
to the limactl process group.

It looks like this is already implemented for Windows based on the docs
for syscall.CREATE_NEW_PROCESS_GROUP[1].

Example run with this change:

    % ps -o pid,pgid,ppid,command
      PID  PGID  PPID COMMAND
    39442 39442 39440 -zsh
    63233 63233 39442 _output/bin/limactl start --vm-type vz --tty=false
    63299 63299 63233 /Users/nsoffer/src/lima/_output/bin/limactl hostagent ...

We can improve this later by adding an option to daemonize the hostagent
process (like qemu --daemonize).

[1] https://learn.microsoft.com/en-us/windows/win32/procthread/process-creation-flags

Fixes lima-vm#2573
Signed-off-by: Nir Soffer <[email protected]>
nirs added a commit to nirs/ramen that referenced this pull request Aug 31, 2024
With the fix[1] lima starts the hostagent process in a new process
group, so we can run limactl normally. When we terminate after errors,
limactl is terminated without harming the hostagent process.

[1] lima-vm/lima#2574

Signed-off-by: Nir Soffer <[email protected]>

@AkihiroSuda AkihiroSuda left a comment

Thanks

@AkihiroSuda AkihiroSuda merged commit e2d123f into lima-vm:master Sep 1, 2024
27 checks passed
@AkihiroSuda AkihiroSuda added this to the v1.0 milestone Sep 1, 2024
@nirs nirs deleted the background-fix branch September 1, 2024 18:16

nirs commented Sep 1, 2024

@AkihiroSuda thanks for merging quickly. Do you have an estimate of when this fix will be available in a release? We can depend on a local lima build via go install, but it will be easier to require a released version.

AkihiroSuda commented

Maybe around October?

nirs commented Sep 1, 2024

October would be fine, thanks!

nirs added a commit to nirs/ramen that referenced this pull request Sep 1, 2024
With the fix[1] lima starts the hostagent process in a new process
group, so we can run limactl normally. When we terminate after errors,
limactl is terminated without harming the hostagent process.

The fix is available upstream but not released yet. It should be
available in a lima release around October.

[1] lima-vm/lima#2574

Signed-off-by: Nir Soffer <[email protected]>
nirs added a commit to nirs/lima that referenced this pull request Sep 8, 2024
Similar to how we run the hostagent process[1], we want to run the
usernet process in the background. Now a program using killpg to clean
up child processes will not terminate the usernet process.

Example run with this change:

    % ps -o pid,pgid,ppid,command
      PID  PGID  PPID COMMAND
    55768 55768 55767 -zsh
    56126 56126 55768 limactl start userv2.yaml --tty=0
    56128 56128 56126 /Users/nsoffer/src/lima/_output/bin/limactl usernet ...
    56131 56131 56126 /Users/nsoffer/src/lima/_output/bin/limactl hostagent ...

    % ps -o pid,pgid,ppid,command
      PID  PGID  PPID COMMAND
    55768 55768 55767 -zsh
    56128 56128     1 /Users/nsoffer/src/lima/_output/bin/limactl usernet ...
    56131 56131     1 /Users/nsoffer/src/lima/_output/bin/limactl hostagent ...

[1] lima-vm#2574

Signed-off-by: Nir Soffer <[email protected]>
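
The killpg scenario mentioned in the commit message above can be illustrated with a small Go sketch for Unix-like systems. The names `startInGroup`, `isRunning`, and `killGroupDemo` are hypothetical, and `sleep` stands in for the real processes. Go's `syscall` package has no `Killpg` wrapper, so the sketch uses `kill` with a negative pid, which signals the whole process group.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// startInGroup starts "sleep 30". With pgid == 0 the child becomes the
// leader of a new process group; otherwise it joins the existing group.
func startInGroup(pgid int) (*exec.Cmd, error) {
	cmd := exec.Command("sleep", "30")
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true, Pgid: pgid}
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return cmd, nil
}

// isRunning reports whether a direct child has not exited yet.
func isRunning(pid int) bool {
	var ws syscall.WaitStatus
	wpid, err := syscall.Wait4(pid, &ws, syscall.WNOHANG, nil)
	return err == nil && wpid == 0
}

// killGroupDemo signals one process group and reports whether a child
// running in a different group is still alive afterwards.
func killGroupDemo() (bool, error) {
	leader, err := startInGroup(0)
	if err != nil {
		return false, err
	}
	member, err := startInGroup(leader.Process.Pid)
	if err != nil {
		leader.Process.Kill()
		leader.Wait()
		return false, err
	}
	survivor, err := startInGroup(0) // separate group, like usernet
	if err == nil {
		defer func() {
			survivor.Process.Kill()
			survivor.Wait()
		}()
	}
	// A negative pid targets the whole process group, like killpg.
	killErr := syscall.Kill(-leader.Process.Pid, syscall.SIGTERM)
	leader.Wait()
	member.Wait()
	if err != nil {
		return false, err
	}
	if killErr != nil {
		return false, killErr
	}
	return isRunning(survivor.Process.Pid), nil
}

func main() {
	alive, err := killGroupDemo()
	fmt.Println("process in separate group still running:", alive, "error:", err)
}
```

The child started in its own group plays the role of usernet or hostagent here: the group-wide signal terminates the leader and the member that joined its group, but is never delivered to the separate group.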
Successfully merging this pull request may close these issues.

Host agent process running in the foreground even when not using --foreground