minikube stop very slow with vfkit driver #20503

Closed
@nirs

What Happened?

Stopping minikube with the vfkit driver is very slow due to an unreliable stop implementation, incorrect error handling, and a pointless repeated backup attempt.

To reproduce

  1. Configuration
minikube config set vm-driver vfkit
minikube config set container-runtime containerd
  1. Start cluster
minikube start
  1. Stop cluster
minikube stop --v=4 --alsologtostderr 2>stop.log

Timeline

  1. Stopping node started
I0309 15:00:42.297876   13723 out.go:177] * Stopping node "minikube"  ...
  1. Performing first backup
I0309 15:00:42.307878   13723 machine.go:156] backing up vm config to /var/lib/minikube/backup: [/etc/cni /etc/kubernetes]
I0309 15:00:42.307953   13723 ssh_runner.go:195] Run: sudo mkdir -p /var/lib/minikube/backup
I0309 15:00:42.307982   13723 sshutil.go:53] new ssh client: &{IP:192.168.106.7 Port:22 SSHKeyPath:/Users/nir/.minikube/machines/minikube/id_rsa Username:docker}
I0309 15:00:42.339888   13723 ssh_runner.go:195] Run: sudo rsync --archive --relative /etc/cni /var/lib/minikube/backup
I0309 15:00:42.386662   13723 ssh_runner.go:195] Run: sudo rsync --archive --relative /etc/kubernetes /var/lib/minikube/backup
  1. Stopping the driver
I0309 15:00:42.433319   13723 main.go:141] libmachine: Stopping "minikube"...
I0309 15:00:42.433868   13723 main.go:141] libmachine: get state: {State:VirtualMachineStateRunning}

We got the VM state from vfkit via the HTTP API; the VM is running.

  1. Trying to set state to HardStop via HTTP API
I0309 15:00:42.452986   13723 stop.go:66] stop err: Post "http://_/vm/state": EOF
W0309 15:00:42.453070   13723 stop.go:165] stop host returned error: Temporary Error: stop: Post "http://_/vm/state": EOF

This always fails with EOF, possibly because vfkit exits immediately, before returning the response.
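One way to handle this is to treat EOF on the stop request as inconclusive and check whether the VM is still running before reporting an error. The following is a minimal sketch of that idea, not the driver's actual code: the `vm` interface, `stopVM`, and `fakeVM` are hypothetical names, and how `Running` is determined (state query or process check) is left open.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// vm is a hypothetical, minimal view of a VM controlled over an HTTP API.
type vm interface {
	RequestStop() error // e.g. a POST to the state endpoint
	Running() bool      // e.g. a state query or a process check
}

// stopVM requests a stop. An EOF from the request is treated as success
// when the VM is no longer running afterwards; other errors are returned.
func stopVM(m vm) error {
	err := m.RequestStop()
	if err == nil {
		return nil
	}
	// errors.Is unwraps wrapped errors, including the *url.Error that
	// net/http returns for failed requests.
	if errors.Is(err, io.EOF) && !m.Running() {
		return nil
	}
	return fmt.Errorf("stop: %w", err)
}

// fakeVM simulates a VM whose process exits before answering the request.
type fakeVM struct{ running bool }

func (f *fakeVM) RequestStop() error {
	f.running = false
	return fmt.Errorf("post %q: %w", "http://_/vm/state", io.EOF)
}

func (f *fakeVM) Running() bool { return f.running }
```

With `fakeVM{running: true}`, `stopVM` returns nil even though the request itself ended with EOF, so no retry would be triggered.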

  1. Starting retry
I0309 15:00:42.453102   13723 retry.go:31] will retry after 1.148164137s: Temporary Error: stop: Post "http://_/vm/state": EOF
  1. Stopping the host again
I0309 15:00:43.601796   13723 stop.go:39] StopHost: minikube
I0309 15:00:43.608726   13723 out.go:177] * Stopping node "minikube"  ...
I0309 15:00:43.616885   13723 machine.go:156] backing up vm config to /var/lib/minikube/backup: [/etc/cni /etc/kubernetes]
I0309 15:00:43.617271   13723 ssh_runner.go:195] Run: sudo mkdir -p /var/lib/minikube/backup
I0309 15:00:43.617329   13723 sshutil.go:53] new ssh client: &{IP:192.168.106.7 Port:22 SSHKeyPath:/Users/nir/.minikube/machines/minikube/id_rsa Username:docker}
W0309 15:01:58.617799   13723 sshutil.go:64] dial failure (will retry): dial tcp 192.168.106.7:22: connect: operation timed out
W0309 15:01:58.618132   13723 stop.go:55] failed to complete vm config backup (will continue): create dir: NewSession: new client: new client: dial tcp 192.168.106.7:22: connect: operation timed out

The retry is at the wrong level, or maybe the backup is. We just backed up this host, so there is no reason to repeat the backup.
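The intended layering could look roughly like this: back up once, then retry only the stop call. This is a sketch under assumptions, not minikube's code; `stopHost`, its parameters, and `errTemporary` are hypothetical names introduced for illustration.

```go
package main

import (
	"errors"
	"time"
)

// errTemporary is a placeholder for a retryable stop error.
var errTemporary = errors.New("temporary stop error")

// stopHost runs the backup exactly once and retries only the stop.
// A failed backup is ignored here, mirroring the "will continue"
// behavior seen in the log.
func stopHost(backup, stop func() error, attempts int, delay time.Duration) error {
	_ = backup()

	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = stop(); err == nil {
			return nil
		}
		// Wait between attempts, but not after the last one.
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return err
}
```

With this structure, a stop that fails twice and then succeeds results in one backup and three stop attempts, instead of one backup per attempt.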

The backup attempt times out after about 75 seconds, since the vfkit process is no longer running at this point.
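The length of the timeout can be read from the timestamps of the two `sshutil.go` lines above. A small sketch for computing it, assuming both timestamps fall on the same day; `klogTime` and `elapsed` are names made up for this example.

```go
package main

import "time"

// klogTime matches the time-of-day part of the log lines (HH:MM:SS.micros).
const klogTime = "15:04:05.000000"

// elapsed returns the time between two log timestamps on the same day.
func elapsed(start, end string) (time.Duration, error) {
	s, err := time.Parse(klogTime, start)
	if err != nil {
		return 0, err
	}
	e, err := time.Parse(klogTime, end)
	if err != nil {
		return 0, err
	}
	return e.Sub(s), nil
}
```

Calling `elapsed("15:00:43.617329", "15:01:58.617799")` gives a little over 75 seconds, which accounts for nearly all of the time spent in `minikube stop`.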

  1. Stopping the driver again
I0309 15:01:58.618315   13723 main.go:141] libmachine: Stopping "minikube"...
I0309 15:01:58.619098   13723 stop.go:66] stop err: Machine "minikube" is already stopped.
I0309 15:01:58.619145   13723 stop.go:69] host is already stopped

The stop attempt fails because the host is already stopped. It should succeed without bogus errors.
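An idempotent stop would check the state first and return success when there is nothing to do. A minimal sketch, with hypothetical names (`state`, `ensureStopped`) that are not taken from minikube or libmachine:

```go
package main

import "fmt"

// state is a simplified VM state used only in this sketch.
type state int

const (
	stateRunning state = iota
	stateStopped
)

// ensureStopped is a no-op when the machine is already stopped and
// calls stop otherwise, so repeated calls never report an error for
// an already stopped machine.
func ensureStopped(getState func() (state, error), stop func() error) error {
	st, err := getState()
	if err != nil {
		return fmt.Errorf("get state: %w", err)
	}
	if st == stateStopped {
		return nil
	}
	return stop()
}
```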

Issues

  • HardStop is used instead of Stop, so the guest is not shut down gracefully
  • HardStop always fails with EOF, and the error is not handled properly
  • The retry is at the wrong level, so the backup is attempted again after a successful backup
  • The second backup times out because the host is no longer running and the previous error was not handled properly

Attach the log file

I tried several times and the behavior is the same.

stop.log

Operating System

macOS (Default)

Driver

N/A

    Labels

    co/vfkit (VFkit related issues), kind/bug (Categorizes issue or PR as related to a bug), os/macos
