
Errors encountered when upgrading incus with sudo apt upgrade #997

Closed
1 task done
vikrantrathore opened this issue Jul 15, 2024 · 6 comments
Labels: Incomplete (Waiting on more information from reporter)

vikrantrathore commented Jul 15, 2024

Required information

  • Distribution: Ubuntu
  • Distribution version: 24.04.4
  • The output of "incus info" (it failed after the upgrade, works after restart):
driver: lxc | qemu
 driver_version: 6.0.1 | 9.0.1
 firewall: nftables
 kernel: Linux
 kernel_architecture: x86_64
 kernel_features:
   idmapped_mounts: "true"
   netnsid_getifaddrs: "true"
   seccomp_listener: "true"
   seccomp_listener_continue: "true"
   uevent_injection: "true"
   unpriv_binfmt: "false"
   unpriv_fscaps: "true"
 kernel_version: 5.15.0-113-generic
 lxc_features:
   cgroup2: "true"
   core_scheduling: "true"
   devpts_fd: "true"
   idmapped_mounts_v2: "true"
   mount_injection_file: "true"
   network_gateway_device_route: "true"
   network_ipvlan: "true"
   network_l2proxy: "true"
   network_phys_macvlan_mtu: "true"
   network_veth_router: "true"
   pidfd: "true"
   seccomp_allow_deny_syntax: "true"
   seccomp_notify: "true"
   seccomp_proxy_send_notify_fd: "true"
 os_name: Ubuntu
 os_version: "22.04"
 project: default
 server: incus
 server_clustered: true
 server_event_mode: full-mesh
 server_name: insan02
 server_pid: 1407
 server_version: "6.3"
 storage: btrfs
 storage_version: 5.16.2
 storage_supported_drivers:
 - name: dir
   version: "1"
   remote: false
 - name: lvm
   version: 2.03.11(2) (2021-01-08) / 1.02.175 (2021-01-08) / 4.45.0
   remote: false
 - name: lvmcluster
   version: 2.03.11(2) (2021-01-08) / 1.02.175 (2021-01-08) / 4.45.0
   remote: true
 - name: btrfs
   version: 5.16.2
   remote: false

Issue description

Upgrading Incus with sudo apt upgrade on Ubuntu fails, and the problem recurs with every upgrade. Incus is installed from the zabbly apt repositories.

Steps to reproduce

  1. Run sudo apt upgrade
  2. After waiting for a substantial amount of time, it shows the following error:
See "systemctl status incus.service" and "journalctl -xeu incus.service" for details.
dpkg: error processing package incus-base (--configure):
 installed incus-base package post-installation script subprocess returned error exit status 1
dpkg: dependency problems prevent configuration of incus:
 incus depends on incus-base (= 1:6.3-202407130507-ubuntu22.04); however:
  Package incus-base is not configured yet.

dpkg: error processing package incus (--configure):
 dependency problems - leaving unconfigured
  3. Run sudo apt upgrade again and it installs the upgrade.
  4. Restart the machine to make incus work again.
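
Put as commands, the workaround above is roughly the following sketch; whether restarting only the incus service is enough instead of a full reboot is an assumption, since the reporter rebooted the whole machine:

# Re-run the failed upgrade; per the report it completes on the second attempt
sudo apt upgrade
# The reporter then rebooted the machine; restarting just the service may be
# enough, but that is an assumption rather than something verified here
sudo systemctl restart incus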

Information to attach

  • Any relevant kernel output (dmesg)
[3013479.164106] systemd[1]: systemd 249.11-0ubuntu3.12 running in system mode (+PAM +AUDIT +SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY -P11KIT -QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified)
[3013479.183788] systemd[1]: Detected architecture x86-64.
[3013479.252691] systemd[1]: Configuration file /run/systemd/system/netplan-ovs-cleanup.service is marked world-inaccessible. This has no effect as configuration data is accessible via APIs without restrictions. Proceeding anyway.
[3013479.344532] systemd[1]: Stopping Journal Service...
[3013479.344562] systemd-journald[554]: Received SIGTERM from PID 1 (systemd).
[3013479.345022] systemd[1]: Stopping Open Virtual Network host control daemon...
[3013479.345496] systemd[1]: Stopping Open Virtual Network central control daemon...
[3013479.346252] systemd[1]: Stopping Open vSwitch Record Hostname...
[3013479.346441] systemd[1]: Stopping PackageKit Daemon...
[3013479.346457] systemd[1]: systemd-networkd-wait-online.service: Deactivated successfully.
[3013479.346593] systemd[1]: Stopped Wait for Network to be Configured.
[3013479.346669] systemd[1]: Stopping Wait for Network to be Configured...
[3013479.346825] systemd[1]: Stopping Network Configuration...
[3013479.347965] systemd[1]: ovs-record-hostname.service: Deactivated successfully.
[3013479.348254] systemd[1]: Stopped Open vSwitch Record Hostname.
[3013479.348684] systemd[1]: Stopping Network Name Resolution...
[3013479.349172] systemd[1]: Stopping Network Time Synchronization...
[3013479.353163] systemd[1]: Stopping Disk Manager...
[3013479.353304] systemd[1]: Stopping Daemon for power management...
[3013479.354235] systemd[1]: upower.service: Deactivated successfully.
[3013479.354527] systemd[1]: Stopped Daemon for power management.
[3013479.354551] systemd[1]: upower.service: Consumed 8min 43.099s CPU time.
[3013479.356240] systemd[1]: Starting Daemon for power management...
[3013479.361890] systemd[1]: systemd-journald.service: Deactivated successfully.
[3013479.362318] systemd[1]: Stopped Journal Service.
[3013479.362384] systemd[1]: systemd-journald.service: Consumed 9min 57.595s CPU time.
[3013479.364488] systemd[1]: Starting Journal Service...
[3013479.367608] systemd[1]: packagekit.service: Deactivated successfully.
[3013479.367955] systemd[1]: Stopped PackageKit Daemon.
[3013479.367997] systemd[1]: packagekit.service: Consumed 28.303s CPU time.
[3013479.369102] systemd[1]: Starting PackageKit Daemon...
[3013479.376547] systemd[1]: systemd-timesyncd.service: Deactivated successfully.
[3013479.376872] systemd[1]: Stopped Network Time Synchronization.
[3013479.376916] systemd[1]: systemd-timesyncd.service: Consumed 7.831s CPU time.
[3013479.377695] systemd[1]: systemd-resolved.service: Deactivated successfully.
[3013479.378005] systemd[1]: Stopped Network Name Resolution.
[3013479.378031] systemd[1]: systemd-resolved.service: Consumed 17.220s CPU time.
[3013479.378715] systemd[1]: udisks2.service: Deactivated successfully.
[3013479.379023] systemd[1]: Stopped Disk Manager.
[3013479.379045] systemd[1]: udisks2.service: Consumed 4.704s CPU time.
[3013479.380791] systemd[1]: Starting Network Time Synchronization...
[3013479.381756] systemd[1]: Starting Disk Manager...
[3013479.381980] systemd[1]: Started Journal Service.
[3013479.783673] No such timeout policy "ovs_test_tp"
[3013479.783676] Failed to associated timeout policy `ovs_test_tp'
@stgraber (Member)

Hmm, so this is confusing. You're saying this is 24.04 but then all logs point to the system being 22.04. You're also reporting that the upgrade failed and hung but incus is running and correctly reporting the version as 6.3?

What happens if you do apt dist-upgrade again, or potentially dpkg --configure -a if the former is failing?
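
For anyone hitting the same error, those suggestions, together with the diagnostics the dpkg output itself points at, look roughly like this (a sketch, not an official recovery procedure):

# Inspect why the incus-base post-installation script failed
sudo systemctl status incus.service
sudo journalctl -xeu incus.service

# Then retry the upgrade, or just finish configuring the half-installed packages
sudo apt dist-upgrade
sudo dpkg --configure -a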

stgraber added the Incomplete (Waiting on more information from reporter) label on Jul 15, 2024
@vikrantrathore (Author)

No, I probably wrote it by mistake; it's 22.04.4. As mentioned in the issue, when I run it again it works. The issue is that this problem occurs every time an upgrade is done remotely using Ansible and it brings down the whole cluster. I am able to upgrade, and after a restart Incus works fine.

@stgraber (Member)

Ah, it's a cluster upgrade. For clusters you must always update all servers at the same time, otherwise the first server to update will notice it's ahead of the others and will hang there, waiting for the rest to match its version before continuing with its startup.
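
In practice, "at the same time" can be as simple as starting the upgrade on every member in parallel. The member names below are placeholders and the ssh loop is only one way to do it; with Ansible, the equivalent is to run the upgrade task against every cluster member in the same play rather than host by host:

# Hypothetical member names; start the upgrade on all members in parallel so
# no single node sits alone waiting for the others to catch up
for member in insan01 insan02 insan03; do
    ssh "$member" "sudo apt-get update && sudo apt-get -y dist-upgrade" &
done
wait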

knutov commented Jul 15, 2024

I'm wondering too: how exactly does one upgrade a cluster correctly?

@stgraber (Member)

I've been upgrading production clusters, first on LXD and now on Incus, for the past 5-6 years and never had an issue, so long as you make sure that everything is clean in incus cluster list prior to the upgrade and that all servers are updated at the same time.

As mentioned in the documentation, the servers will basically check a stable database table (one we can never change the schema of) to compare their own DB and API version with the rest of the cluster. If they notice they're behind, they'll refuse to start; if they notice they're ahead, they'll enter a loop waiting for all other servers to reach the same version.

Then as soon as all servers reach the same DB and API version, the startup sequence continues on all servers at the same time. The leader then goes on to apply any schema updates needed, the remaining servers perform any local data migration needed, and then the cluster API becomes available to users again.
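
As a rough illustration of why a lone upgraded member appears to hang, the per-member startup check described above can be pictured as the following shell pseudocode. This is not Incus source code: read_cluster_versions and all_members_at_version are hypothetical placeholders, and the DB/API versions are treated as a single integer for simplicity.

# Compare this member's versions against the rest of the cluster
mine=$(read_cluster_versions --local)     # hypothetical helper
newest=$(read_cluster_versions --max)     # hypothetical helper

if [ "$mine" -lt "$newest" ]; then
    echo "Member is behind the cluster, refusing to start" >&2
    exit 1
elif [ "$mine" -gt "$newest" ]; then
    # The first member to upgrade lands here and appears to "hang"
    until all_members_at_version "$mine"; do   # hypothetical helper
        sleep 10
    done
fi
# All members match: the leader applies schema updates, the others run any
# local data migration, and the cluster API becomes available again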
