Description
What Happened?
Stopping a minikube cluster is very slow because of an unreliable stop implementation, incorrect error handling, and a pointless repeated backup attempt.
To reproduce
- Configuration
minikube config set vm-driver vfkit
minikube config set container-runtime containerd
- Start cluster
minikube start
- Stop cluster
minikube stop --v=4 --alsologtostderr 2>stop.log
Timeline
- Stopping node started
I0309 15:00:42.297876 13723 out.go:177] * Stopping node "minikube" ...
- Performing first backup
I0309 15:00:42.307878 13723 machine.go:156] backing up vm config to /var/lib/minikube/backup: [/etc/cni /etc/kubernetes]
I0309 15:00:42.307953 13723 ssh_runner.go:195] Run: sudo mkdir -p /var/lib/minikube/backup
I0309 15:00:42.307982 13723 sshutil.go:53] new ssh client: &{IP:192.168.106.7 Port:22 SSHKeyPath:/Users/nir/.minikube/machines/minikube/id_rsa Username:docker}
I0309 15:00:42.339888 13723 ssh_runner.go:195] Run: sudo rsync --archive --relative /etc/cni /var/lib/minikube/backup
I0309 15:00:42.386662 13723 ssh_runner.go:195] Run: sudo rsync --archive --relative /etc/kubernetes /var/lib/minikube/backup
- Stopping the driver
I0309 15:00:42.433319 13723 main.go:141] libmachine: Stopping "minikube"...
I0309 15:00:42.433868 13723 main.go:141] libmachine: get state: {State:VirtualMachineStateRunning}
We got the vfkit process state via the HTTP API.
- Trying to set state to HardStop via HTTP API
I0309 15:00:42.452986 13723 stop.go:66] stop err: Post "http://_/vm/state": EOF
W0309 15:00:42.453070 13723 stop.go:165] stop host returned error: Temporary Error: stop: Post "http://_/vm/state": EOF
This always fails with EOF, maybe because vfkit exits immediately, before returning the response.
- Starting retry
I0309 15:00:42.453102 13723 retry.go:31] will retry after 1.148164137s: Temporary Error: stop: Post "http://_/vm/state": EOF
- Stopping the host again
I0309 15:00:43.601796 13723 stop.go:39] StopHost: minikube
I0309 15:00:43.608726 13723 out.go:177] * Stopping node "minikube" ...
I0309 15:00:43.616885 13723 machine.go:156] backing up vm config to /var/lib/minikube/backup: [/etc/cni /etc/kubernetes]
I0309 15:00:43.617271 13723 ssh_runner.go:195] Run: sudo mkdir -p /var/lib/minikube/backup
I0309 15:00:43.617329 13723 sshutil.go:53] new ssh client: &{IP:192.168.106.7 Port:22 SSHKeyPath:/Users/nir/.minikube/machines/minikube/id_rsa Username:docker}
W0309 15:01:58.617799 13723 sshutil.go:64] dial failure (will retry): dial tcp 192.168.106.7:22: connect: operation timed out
W0309 15:01:58.618132 13723 stop.go:55] failed to complete vm config backup (will continue): create dir: NewSession: new client: new client: dial tcp 192.168.106.7:22: connect: operation timed out
The retry is at the wrong level, or maybe the backup is at the wrong level. We just backed up this host, so there is no reason to repeat the backup.
The backup attempt times out after 75 seconds (15:00:43 to 15:01:58), since the vfkit process is not running at this point.
- Stopping the driver again
I0309 15:01:58.618315 13723 main.go:141] libmachine: Stopping "minikube"...
I0309 15:01:58.619098 13723 stop.go:66] stop err: Machine "minikube" is already stopped.
I0309 15:01:58.619145 13723 stop.go:69] host is already stopped
The stop attempt fails because the host is already stopped. It should succeed without bogus errors.
Issues
- Using HardStop instead of Stop does not do a graceful shutdown of the host
- Using HardStop always fails, and we don't handle the error properly
- Retry is at the wrong level, trying to back up again after a successful backup
- Backup times out because the host is not running and the previous error was not handled properly
Attach the log file
I tried several times and the behavior is the same.
Operating System
macOS (Default)
Driver
N/A