qemu VM stuck in spin loop after running podman machine start with podman 4.6.1 and macOS 13.3.1 #19821

Closed
ray-kast opened this issue Aug 31, 2023 · 18 comments
Labels: kind/bug, locked - please file new issue/PR, machine, macos

Comments

@ray-kast

Issue Description

I'm not sure when exactly this broke on my machine, but running podman machine start causes it to spawn a QEMU process that appears to get stuck in some kind of spin loop. I've let this run for several hours as I've seen other Mac users find success in doing so, but I had no such luck.

Steps to reproduce the issue

  1. Install podman 4.6.1 (using MacPorts here, on 13.3.1 Ventura with qemu 8.0.4)
  2. Run podman machine init --now (or podman machine init and podman machine start)

Describe the results you received

The command hangs after printing Waiting for VM... and a qemu-system-aarch64 process spawns and sits at 100% core usage indefinitely.

Describe the results you expected

Running podman machine start usually finishes within 10-20 seconds.

podman info output

(neither podman info nor podman version can connect)
CPU: Apple M1 Pro
OS: macOS Ventura 13.3.1
podman version: 4.6.1
qemu version: 8.0.4

$ podman machine list && podman machine info
NAME                     VM TYPE     CREATED         LAST UP         CPUS        MEMORY      DISK SIZE
podman-machine-default*  qemu        14 minutes ago  14 minutes ago  6           4GiB        100GiB
Host:
  Arch: arm64
  CurrentMachine: podman-machine-default
  DefaultMachine: podman-machine-default
  EventsDir: /var/folders/0x/ww0lq3cx6575w9twz_bfjncw0000gn/T/podman-run--1/podman
  MachineConfigDir: /Users/ray/.config/containers/podman/machine/qemu
  MachineImageDir: /Users/ray/.local/share/containers/podman/machine/qemu
  MachineState: Stopped
  NumberOfMachines: 1
  OS: darwin
  VMType: qemu
Version:
  APIVersion: 4.6.1
  Built: 0
  BuiltTime: Wed Dec 31 16:00:00 1969
  GitCommit: ""
  GoVersion: go1.21.0
  Os: darwin
  OsArch: darwin/arm64
  Version: 4.6.1

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

$ port info podman qemu
podman @4.6.1 (sysutils)

Description:          Podman is a tool for running Linux containers. You can do this from a MacOS desktop as long as you have access to a linux box either running inside of a VM on the host, or available via
                      the network. You need to install the remote client and then setup ssh connection information.
Homepage:             https://github.com/containers/podman

Build Dependencies:   go, go-md2man, python311, pre-commit
Runtime Dependencies: gvisor-tap-vsock, qemu
Platforms:            darwin, freebsd, linux
License:              Apache-2
Maintainers:          Email: [email protected], GitHub: judaew
                      Policy: openmaintainer
--
qemu @8.0.4 (emulators)
Variants:             [+]cocoa, curl, curses, dbus, gtk3, sdl2, [+]spice, spice_protocol, ssh, target_alpha, [+]target_arm, target_cris, target_hppa, [+]target_i386, target_m68k, target_microblaze,
                      target_mips, target_nios2, target_or1k, target_ppc, target_riscv32, target_riscv64, target_rx, target_s390x, target_sh4, target_sparc, target_tricore, [+]target_x86_64, target_xtensa,
                      [+]usb, vde, [+]vnc

Description:          QEMU is a generic and open source machine emulator. It can run OSes and programs made for one machine on a different machine. By using dynamic translation, it achieves very good
                      performance.
Homepage:             https://www.qemu.org

Extract Dependencies: xz
Build Dependencies:   texinfo, libtool, meson, ninja, pkgconfig, py311-sphinx, perl5
Library Dependencies: glib2, gnutls, libpixman, bzip2, libslirp, lzfse, lzo2, snappy, zlib, zstd, libusb, usbredir, cyrus-sasl2, libjpeg-turbo, libpng, spice-protocol, spice-server
Platforms:            darwin
License:              GPL-2+
Maintainers:          Email: [email protected], GitHub: raimue
                      Policy: openmaintainer

Additional information

No response

@ray-kast added the kind/bug label Aug 31, 2023
@github-actions bot added the macos label Aug 31, 2023
@ashley-cui
Member

What happens if you run podman --log-level=debug machine start? Does re-creating the machine do anything different?

@ray-kast
Author

ray-kast commented Sep 1, 2023

@ashley-cui I've tried machine rm and machine init several times and also reinstalling podman and all its files entirely. Running with --log-level=debug gives me this output and QEMU window:

INFO[0000] podman filtering at log level debug
DEBU[0000] Using Podman machine with `qemu` virtualization provider
Starting machine "podman-machine-default"
[/opt/local/libexec/gvproxy -listen-qemu unix:///var/folders/0x/ww0lq3cx6575w9twz_bfjncw0000gn/T/podman/qmp_podman-machine-default.sock -pid-file /var/folders/0x/ww0lq3cx6575w9twz_bfjncw0000gn/T/podman/podman-machine-default_proxy.pid -ssh-port 51977 -forward-sock /Users/ray/.local/share/containers/podman/machine/qemu/podman.sock -forward-dest /run/user/501/podman/podman.sock -forward-user core -forward-identity /Users/ray/.ssh/podman-machine-default --debug]
DEBU[0000] qemu cmd: [/opt/local/bin/qemu-system-aarch64 -m 4096 -smp 6 -fw_cfg name=opt/com.coreos/config,file=/Users/ray/.config/containers/podman/machine/qemu/podman-machine-default.ign -qmp unix:/var/folders/0x/ww0lq3cx6575w9twz_bfjncw0000gn/T/podman/qmp_podman-machine-default.sock,server=on,wait=off -netdev socket,id=vlan,fd=3 -device virtio-net-pci,netdev=vlan,mac=5a:94:ef:e4:0c:ee -device virtio-serial -chardev socket,path=/var/folders/0x/ww0lq3cx6575w9twz_bfjncw0000gn/T/podman/podman-machine-default_ready.sock,server=on,wait=off,id=apodman-machine-default_ready -device virtserialport,chardev=apodman-machine-default_ready,name=org.fedoraproject.port.0 -pidfile /var/folders/0x/ww0lq3cx6575w9twz_bfjncw0000gn/T/podman/podman-machine-default_vm.pid -accel hvf -accel tcg -cpu host -M virt,highmem=on -drive file=/opt/local/share/qemu/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on -drive file=/Users/ray/.local/share/containers/podman/machine/qemu/podman-machine-default_ovmf_vars.fd,if=pflash,format=raw -virtfs local,path=/Users,mount_tag=vol0,security_model=none -virtfs local,path=/private,mount_tag=vol1,security_model=none -virtfs local,path=/var/folders,mount_tag=vol2,security_model=none -drive if=virtio,file=/Users/ray/.local/share/containers/podman/machine/qemu/podman-machine-default_fedora-coreos-38.20230819.2.0-qemu.aarch64.qcow2]
Waiting for VM ...
[image: screenshot of the QEMU window]

@baude
Member

baude commented Sep 2, 2023

are there any hung processes from qemu, gvproxy, or podman by chance?

@vrothberg
Member

> are there any hung processes from qemu, gvproxy, or podman by chance?

See above "a qemu-system-aarch64 process spawns and sits at 100% core usage indefinitely."
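
One way to check for leftover processes, as a rough sketch: list every process and keep only the rows that mention qemu, gvproxy, or podman. The function names are made up for illustration, and the `ps` column names are assumed to behave the same on macOS as on Linux.

```shell
#!/bin/sh
# Hypothetical helper: keep the ps header plus any row naming qemu, gvproxy, or podman.
filter_machine_processes() {
    awk 'NR == 1 || /qemu-system|gvproxy|podman/'
}

# Hypothetical helper: show PID, CPU share, elapsed time, and executable for those rows.
list_machine_processes() {
    ps -A -o pid,%cpu,etime,comm | filter_machine_processes
}

list_machine_processes
```

A qemu-system-aarch64 row that stays near 100 in the %CPU column while its elapsed time keeps growing would match the symptom reported in this issue.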

@vrothberg
Member

I am unable to reproduce on my M2 with Podman v4.6.2. I created a fresh VM via podman machine init and ran a start-stop loop for 150 iterations.
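
A start-stop loop of that kind could look roughly like the sketch below. This is not the script that was actually used; `start_stop_loop` is a hypothetical helper that takes the iteration count and the command prefix, and stops at the first failure.

```shell
#!/bin/sh
# Hypothetical helper: run "<cmd> start" then "<cmd> stop" N times, stopping on the first error.
# Usage: start_stop_loop 150 podman machine
start_stop_loop() {
    iterations=$1
    shift
    i=1
    while [ "$i" -le "$iterations" ]; do
        "$@" start || { echo "start failed on iteration $i" >&2; return 1; }
        "$@" stop  || { echo "stop failed on iteration $i" >&2; return 1; }
        i=$((i + 1))
    done
    echo "completed $iterations iterations"
}
```

Note that the loop has no timeout, so a start that hangs the way this issue describes would stall it rather than report a failure.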

@ray-kast
Author

ray-kast commented Sep 4, 2023

@vrothberg Weird. I've also not encountered this before; Podman was working fine for me previously.

@vrothberg
Member

@ray-kast, are you still seeing the issue? I am not sure I read your statement correctly.

@rhatdan
Member

rhatdan commented Sep 5, 2023

Running stuff in emulation mode is chancy I guess.

@ray-kast
Author

ray-kast commented Sep 5, 2023

@vrothberg Yes, I am still encountering it, sorry for the confusion. It was working fine for several months and then this happened at some point and I have been unable to figure out what changed such that it no longer works.

@vrothberg
Member

I am sorry, @ray-kast. Something must be off. Can you share which version of qemu you have installed (qemu-system-aarch64 -version)?

@vladimir-pachnik

I believe the QEMU version is noted in the original submission (8.0.4).
I can see the same symptoms with podman 4.6.1 / qemu 8.0.4 from MacPorts on macOS Monterey 12.6.7.

@ray-kast
Author

ray-kast commented Sep 6, 2023

@vladimir-pachnik That's correct. I don't think the above port output shows it, but the version I have installed is a source build because I explicitly requested the +ssh variant.

EDIT: sorry, I missed the part about the macOS version - I'm on Ventura 13.3.1 (22E261)

@vrothberg
Member

Any chance you can try with qemu 8.1.0?

@vladimir-pachnik

vladimir-pachnik commented Sep 7, 2023

I tried with qemu 8.1.0 (a patched local build based on the upstream MacPorts Portfile) and the result is the same. When I try to start the VM, it gets stuck at Waiting for VM ...

The whole shebang as follows:

$> podman machine init --cpus=2 --disk-size=20 --rootful=true --image-path=stable --log-level=debug local-podman-machine
INFO[0000] podman filtering at log level debug
DEBU[0000] Using Podman machine with `qemu` virtualization provider
Extracting compressed file
Image resized.
Machine init complete
To start your machine run:

	podman machine start local-podman-machine

DEBU[0019] Called machine init.PersistentPostRunE(podman machine init --cpus=2 --disk-size=20 --rootful=true --image-path=stable --log-level=debug local-podman-machine)
DEBU[0019] Shutting down engines

$> podman machine start --log-level=debug local-podman-machine
INFO[0000] podman filtering at log level debug
DEBU[0000] Using Podman machine with `qemu` virtualization provider
Starting machine "local-podman-machine"
[/opt/local/libexec/gvproxy -listen-qemu unix:///var/folders/9y/4rd2p67d093dp42d_yq25py80000gn/T/podman/qmp_local-podman-machine.sock -pid-file /var/folders/9y/4rd2p67d093dp42d_yq25py80000gn/T/podman/local-podman-machine_proxy.pid -ssh-port 58338 -forward-sock /Users/vladimir.pachnik/.local/share/containers/podman/machine/qemu/podman.sock -forward-dest /run/podman/podman.sock -forward-user root -forward-identity /Users/vladimir.pachnik/.ssh/local-podman-machine --debug]
DEBU[0000] qemu cmd: [/opt/local/bin/qemu-system-aarch64 -m 2048 -smp 2 -fw_cfg name=opt/com.coreos/config,file=/Users/vladimir.pachnik/.config/containers/podman/machine/qemu/local-podman-machine.ign -qmp unix:/var/folders/9y/4rd2p67d093dp42d_yq25py80000gn/T/podman/qmp_local-podman-machine.sock,server=on,wait=off -netdev socket,id=vlan,fd=3 -device virtio-net-pci,netdev=vlan,mac=__REDACTED__ -device virtio-serial -chardev socket,path=/var/folders/9y/4rd2p67d093dp42d_yq25py80000gn/T/podman/local-podman-machine_ready.sock,server=on,wait=off,id=alocal-podman-machine_ready -device virtserialport,chardev=alocal-podman-machine_ready,name=org.fedoraproject.port.0 -pidfile /var/folders/9y/4rd2p67d093dp42d_yq25py80000gn/T/podman/local-podman-machine_vm.pid -accel hvf -accel tcg -cpu host -M virt,highmem=on -drive file=/opt/local/share/qemu/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on -drive file=/Users/vladimir.pachnik/.local/share/containers/podman/machine/qemu/local-podman-machine_ovmf_vars.fd,if=pflash,format=raw -virtfs local,path=/Users,mount_tag=vol0,security_model=none -virtfs local,path=/private,mount_tag=vol1,security_model=none -virtfs local,path=/var/folders,mount_tag=vol2,security_model=none -drive if=virtio,file=/Users/vladimir.pachnik/.local/share/containers/podman/machine/qemu/local-podman-machine_fedora-coreos-38.20230819.3.0-qemu.aarch64.qcow2]
Waiting for VM ...
^C

When I try to run the qemu command manually, I get the qemu error -netdev socket,id=vlan,fd=3: can't get socket option SO_TYPE:

$> /opt/local/bin/qemu-system-aarch64 \
                  -m 2048 \
                  -smp 2 \
                  -fw_cfg name=opt/com.coreos/config,file=/Users/vladimir.pachnik/.config/containers/podman/machine/qemu/local-podman-machine.ign \
                  -qmp unix:/var/folders/9y/4rd2p67d093dp42d_yq25py80000gn/T/podman/qmp_local-podman-machine.sock,server=on,wait=off \
                  -netdev socket,id=vlan,fd=3 \
                  -device virtio-net-pci,netdev=vlan,mac=__REDACTED__ \
                  -device virtio-serial \
                  -chardev socket,path=/var/folders/9y/4rd2p67d093dp42d_yq25py80000gn/T/podman/local-podman-machine_ready.sock,server=on,wait=off,id=alocal-podman-machine_ready \
                  -device virtserialport,chardev=alocal-podman-machine_ready,name=org.fedoraproject.port.0 \
                  -pidfile /var/folders/9y/4rd2p67d093dp42d_yq25py80000gn/T/podman/local-podman-machine_vm.pid \
                  -accel hvf \
                  -accel tcg \
                  -cpu host \
                  -M virt,highmem=on \
                  -drive file=/opt/local/share/qemu/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on \
                  -drive file=/Users/vladimir.pachnik/.local/share/containers/podman/machine/qemu/local-podman-machine_ovmf_vars.fd,if=pflash,format=raw \
                  -virtfs local,path=/Users,mount_tag=vol0,security_model=none \
                  -virtfs local,path=/private,mount_tag=vol1,security_model=none \
                  -virtfs local,path=/var/folders,mount_tag=vol2,security_model=none \
                  -drive if=virtio,file=/Users/vladimir.pachnik/.local/share/containers/podman/machine/qemu/local-podman-machine_fedora-coreos-38.20230819.3.0-qemu.aarch64.qcow2
                  
qemu-system-aarch64: -netdev socket,id=vlan,fd=3: can't get socket option SO_TYPE 
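
A note on that error, with a small sketch: `-netdev socket,...,fd=3` tells QEMU to use a socket that is already open as file descriptor 3 when the process starts. Presumably podman opens that socket and hands it to QEMU, so the SO_TYPE error is likely a side effect of running the command by hand rather than the cause of the hang. The hypothetical `fd_is_open` helper below shows how to check whether a descriptor is open in the current shell.

```shell
#!/bin/sh
# Hypothetical helper: succeed if file descriptor N is open in this shell.
fd_is_open() {
    # Duplicating a closed descriptor fails; the subshell keeps the failure contained.
    ( : >&"$1" ) 2>/dev/null
}

if fd_is_open 3; then
    echo "fd 3 is open"
else
    echo "fd 3 is closed (QEMU would have nothing to attach -netdev socket,fd=3 to)"
fi
```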

The qemu machine config is as follows:

$> cat ./.config/containers/podman/machine/qemu/local-podman-machine.json
{
 "ConfigPath": {
  "Path": "/Users/vladimir.pachnik/.config/containers/podman/machine/qemu/local-podman-machine.json"
 },
 "CmdLine": [
  "/opt/local/bin/qemu-system-aarch64",
  "-m",
  "2048",
  "-smp",
  "2",
  "-fw_cfg",
  "name=opt/com.coreos/config,file=/Users/vladimir.pachnik/.config/containers/podman/machine/qemu/local-podman-machine.ign",
  "-qmp",
  "unix:/var/folders/9y/4rd2p67d093dp42d_yq25py80000gn/T/podman/qmp_local-podman-machine.sock,server=on,wait=off",
  "-netdev",
  "socket,id=vlan,fd=3",
  "-device",
  "virtio-net-pci,netdev=vlan,mac=__REDACTED__",
  "-device",
  "virtio-serial",
  "-chardev",
  "socket,path=/var/folders/9y/4rd2p67d093dp42d_yq25py80000gn/T/podman/local-podman-machine_ready.sock,server=on,wait=off,id=alocal-podman-machine_ready",
  "-device",
  "virtserialport,chardev=alocal-podman-machine_ready,name=org.fedoraproject.port.0",
  "-pidfile",
  "/var/folders/9y/4rd2p67d093dp42d_yq25py80000gn/T/podman/local-podman-machine_vm.pid",
  "-accel",
  "hvf",
  "-accel",
  "tcg",
  "-cpu",
  "host",
  "-M",
  "virt,highmem=on",
  "-drive",
  "file=/opt/local/share/qemu/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on",
  "-drive",
  "file=/Users/vladimir.pachnik/.local/share/containers/podman/machine/qemu/local-podman-machine_ovmf_vars.fd,if=pflash,format=raw",
  "-virtfs",
  "local,path=/Users,mount_tag=vol0,security_model=none",
  "-virtfs",
  "local,path=/private,mount_tag=vol1,security_model=none",
  "-virtfs",
  "local,path=/var/folders,mount_tag=vol2,security_model=none",
  "-drive",
  "if=virtio,file=/Users/vladimir.pachnik/.local/share/containers/podman/machine/qemu/local-podman-machine_fedora-coreos-38.20230819.3.0-qemu.aarch64.qcow2"
 ],
 "Rootful": true,
 "UID": 501,
 "HostUserModified": false,
 "IgnitionFilePath": {
  "Path": "/Users/vladimir.pachnik/.config/containers/podman/machine/qemu/local-podman-machine.ign"
 },
 "ImageStream": "stable",
 "ImagePath": {
  "Path": "/Users/vladimir.pachnik/.local/share/containers/podman/machine/qemu/local-podman-machine_fedora-coreos-38.20230819.3.0-qemu.aarch64.qcow2"
 },
 "Mounts": [
  {
   "ReadOnly": false,
   "Source": "/Users",
   "Tag": "vol0",
   "Target": "/Users",
   "Type": "9p"
  },
  {
   "ReadOnly": false,
   "Source": "/private",
   "Tag": "vol1",
   "Target": "/private",
   "Type": "9p"
  },
  {
   "ReadOnly": false,
   "Source": "/var/folders",
   "Tag": "vol2",
   "Target": "/var/folders",
   "Type": "9p"
  }
 ],
 "Name": "local-podman-machine",
 "PidFilePath": {
  "Path": "/var/folders/9y/4rd2p67d093dp42d_yq25py80000gn/T/podman/local-podman-machine_proxy.pid"
 },
 "VMPidFilePath": {
  "Path": "/var/folders/9y/4rd2p67d093dp42d_yq25py80000gn/T/podman/local-podman-machine_vm.pid"
 },
 "QMPMonitor": {
  "Address": {
   "Path": "/var/folders/9y/4rd2p67d093dp42d_yq25py80000gn/T/podman/qmp_local-podman-machine.sock"
  },
  "Network": "unix",
  "Timeout": 2000000000
 },
 "ReadySocket": {
  "Path": "/var/folders/9y/4rd2p67d093dp42d_yq25py80000gn/T/podman/local-podman-machine_ready.sock"
 },
 "CPUs": 2,
 "DiskSize": 20,
 "Memory": 2048,
 "IdentityPath": "/Users/vladimir.pachnik/.ssh/local-podman-machine",
 "Port": 58338,
 "RemoteUsername": "core",
 "Starting": false,
 "Created": "2023-09-07T14:16:31.235202+02:00",
 "LastUp": "2023-09-07T14:16:31.235202+02:00"
}

@vladimir-pachnik

... and I was able to make it work for me by decompressing the /opt/local/share/qemu/edk2-aarch64-code.fd FW image as per this post in MacPorts.
When done, the machine starts successfully.
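
For anyone checking whether they are affected, a rough sketch: look at the first bytes of the firmware file to see whether it is still compressed. The thread does not say which compression format was involved, so the helper (a hypothetical name) checks the magic numbers of bzip2, gzip, and xz. The path is the one from the qemu command line above.

```shell
#!/bin/sh
# Hypothetical helper: report the compression format of a file based on its magic bytes.
detect_compression() {
    # First 6 bytes as lowercase hex with whitespace stripped.
    magic=$(head -c 6 "$1" | od -An -tx1 | tr -d ' \t\n')
    case "$magic" in
        425a68*)      echo bzip2 ;;  # "BZh"
        1f8b*)        echo gzip ;;
        fd377a585a00) echo xz ;;
        *)            echo none ;;
    esac
}

# Firmware path taken from the -drive ...,if=pflash argument in the debug output.
fw=/opt/local/share/qemu/edk2-aarch64-code.fd
if [ -f "$fw" ]; then
    echo "$fw: $(detect_compression "$fw")"
fi
```

If this prints anything other than none, QEMU is being given a compressed file as raw pflash contents (`format=raw`), which would be consistent with the VM never booting.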

@vrothberg
Member

So is it a packaging issue in MacPorts?

@vladimir-pachnik

More likely a qemu issue, according to the MacPorts forum. I don't see MacPorts doing anything special with the images besides the standard build process.

@ray-kast
Author

ray-kast commented Sep 9, 2023

> ... and I was able to make it work for me by decompressing the /opt/local/share/qemu/edk2-aarch64-code.fd FW image as per this post in MacPorts. When done, the machine starts successfully.

Confirmed this fix works for me as well. Thanks for the help, I would have had no clue where to start digging with this!

@ray-kast closed this as completed Sep 9, 2023
@github-actions bot added the locked - please file new issue/PR label Dec 9, 2023
@github-actions bot locked as resolved and limited conversation to collaborators Dec 9, 2023