CRIU dump failed in docker container - runc did not terminate successfully #2368

Open
uetaaam opened this issue Mar 23, 2024 · 7 comments
@uetaaam

uetaaam commented Mar 23, 2024

I am trying to create a checkpoint of my Docker container using the command:
docker checkpoint create container chekcpoint1
and I am getting this error:
Error response from daemon: Cannot checkpoint container container: runc did not terminate successfully: exit status 1: criu failed: type NOTIFY errno 0 path=

CRIU logs and information:

(00.112878) irmap: Refresh stat for /etc/ssl/private
(00.112891) irmap: Refilling /etc/ssl/private dir
(00.112905) irmap: Scanning /var/spool hint
(00.112908) irmap: Refresh stat for /var/spool
(00.112914) irmap: Refilling /var/spool dir
(00.112927) irmap: Refresh stat for /var/spool/mail
(00.112932) irmap: Scanning /var/log hint
(00.112934) irmap: Refresh stat for /var/log
(00.112939) irmap: Refilling /var/log dir
(00.112962) irmap: Refresh stat for /var/log/apt
(00.112967) irmap: Refilling /var/log/apt dir
(00.112986) irmap: Refresh stat for /var/log/apt/eipp.log.xz
(00.112991) irmap: Refresh stat for /var/log/apt/history.log
(00.112997) irmap: Refresh stat for /var/log/apt/term.log
(00.113002) irmap: Refresh stat for /var/log/btmp
(00.113007) irmap: Refresh stat for /var/log/faillog
(00.113011) irmap: Refresh stat for /var/log/lastlog
(00.113016) irmap: Refresh stat for /var/log/wtmp
(00.113021) irmap: Refresh stat for /var/log/dpkg.log
(00.113026) irmap: Scanning /usr/share/dbus-1/system-services hint
(00.113028) irmap: Refresh stat for /usr/share/dbus-1/system-services
(00.113039) Error (criu/irmap.c:104): irmap: Can't stat /usr/share/dbus-1/system-services: No such file or directory
(00.113045) irmap: Scanning /var/lib/polkit-1/localauthority hint
(00.113048) irmap: Refresh stat for /var/lib/polkit-1/localauthority
(00.113052) Error (criu/irmap.c:104): irmap: Can't stat /var/lib/polkit-1/localauthority: No such file or directory
(00.113055) irmap: Scanning /usr/share/polkit-1/actions hint
(00.113057) irmap: Refresh stat for /usr/share/polkit-1/actions
(00.113062) irmap: Refilling /usr/share/polkit-1/actions dir
(00.113076) irmap: Refresh stat for /usr/share/polkit-1/actions/org.dpkg.pkexec.update-alternatives.policy
(00.113082) irmap: Scanning /lib/udev hint
(00.113085) irmap: Refresh stat for /lib/udev
(00.113090) irmap: Refilling /lib/udev dir
(00.113102) irmap: Refresh stat for /lib/udev/rules.d
(00.113108) irmap: Refilling /lib/udev/rules.d dir
(00.113120) irmap: Refresh stat for /lib/udev/rules.d/96-e2scrub.rules
(00.113126) irmap: Scanning /. hint
(00.113128) irmap: Refresh stat for /.
(00.113132) irmap: Scanning /no-such-path hint
(00.113134) irmap: Refresh stat for /no-such-path
(00.113138) Error (criu/irmap.c:104): irmap: Can't stat /no-such-path: No such file or directory
(00.113141) Error (criu/fsnotify.c:284): fsnotify: Can't dump that handle
(00.113179) ----------------------------------------
(00.113205) Error (criu/cr-dump.c:1669): Dump files (pid: 1850966) failed with -1
(00.113214) Waiting for 1850966 to trap
(00.113274) Daemon 1850966 exited trapping
(00.113290) Sent msg to daemon 3 0 0
pie: 1: __fetched msg: 3 0 0
pie: 1: 1: new_sp=0x7fe7cfff3848 ip 0x7fe97a181ad8
(00.113459) 1850966 was trapped
(00.113487) 1850966 was trapped
(00.113493) 1850966 (native) is going to execute the syscall 15, required is 15
(00.113541) 1850966 was stopped
(00.113859) net: Unlock network
(00.113866) Running network-unlock scripts
(00.113869) RPC
(00.138992) Unfreezing tasks into 1
(00.139030) Unseizing 1850966 into 1
(00.139579) Error (criu/cr-dump.c:2093): Dumping FAILED.

How can I solve this?
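
When reading a log like the one above, it can help to list only the `Error` entries first. A minimal sketch, assuming the log was saved to a file; the helper name `criu_errors` and the file name `dump.log` are hypothetical:

```shell
# Print only the "Error (file:line)" entries of a saved CRIU log,
# prefixed with their line numbers in that file.
# Hypothetical helper: pass the path of the saved log as the first argument.
criu_errors() {
    # -F treats the pattern as a fixed string, so the parentheses are literal.
    grep -nF ') Error (' "$1"
}
```

For example, `criu_errors dump.log`. In the log above, the `fsnotify: Can't dump that handle` entry is the last error before `Dump files (pid: 1850966) failed with -1`.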

@adrianreber
Member

I think there was a recent discussion about inotify not working on overlayfs with the default mount options. So you either have to change the mount options or use a container that does not use inotify.
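
To check whether a workload uses inotify at all, one option is to look for open inotify instances under `/proc`, where such descriptors show up as symlinks to `anon_inode:inotify`. A minimal sketch, assuming GNU `find` and root access on the host; the helper name `list_inotify_fds` is hypothetical:

```shell
# Print "<proc root>/<pid>/fd/<n>" for every open inotify instance.
# Assumes GNU find (for -lname); the proc root defaults to /proc.
list_inotify_fds() {
    proc_root=${1:-/proc}
    # Each open inotify descriptor is a symlink whose target reads
    # "anon_inode:inotify"; unreadable fd directories are skipped silently.
    find "$proc_root"/[0-9]*/fd -lname 'anon_inode:inotify' 2>/dev/null
}
```

Running `list_inotify_fds` on the host lists matches for all processes; the PIDs can then be compared with the output of `docker top <container>` to see which ones belong to the container being checkpointed.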

@uetaaam
Author

uetaaam commented Mar 24, 2024

I changed the storage driver to vfs and now I am getting another error:
(00.975275) 0x7fe3c118f000-0x7fe3c1190000 (4K) prot 0x1 flags 0x2 fdflags 0 st 0x41 off 0 reg fp shmid: 0x5b
(00.975279) 0x7fe3c1190000-0x7fe3c11b0000 (128K) prot 0x5 flags 0x2 fdflags 0 st 0x41 off 0x1000 reg fp shmid: 0x5b
(00.975283) 0x7fe3c11b0000-0x7fe3c11b8000 (32K) prot 0x1 flags 0x2 fdflags 0 st 0x41 off 0x21000 reg fp shmid: 0x5b
(00.975287) 0x7fe3c11b8000-0x7fe3c11b9000 (4K) prot 0x3 flags 0x22 fdflags 0 st 0x201 off 0 reg ap shmid: 0
(00.975291) 0x7fe3c11b9000-0x7fe3c11ba000 (4K) prot 0x1 flags 0x2 fdflags 0 st 0x41 off 0x29000 reg fp shmid: 0x5b
(00.975296) 0x7fe3c11ba000-0x7fe3c11bb000 (4K) prot 0x3 flags 0x2 fdflags 0 st 0x41 off 0x2a000 reg fp shmid: 0x5b
(00.975301) 0x7fe3c11bb000-0x7fe3c11bc000 (4K) prot 0x3 flags 0x22 fdflags 0 st 0x201 off 0 reg ap shmid: 0
(00.975306) 0x7fff5062f000-0x7fff50650000 (132K) prot 0x3 flags 0x122 fdflags 0 st 0x201 off 0 reg ap shmid: 0
(00.975310) 0x7fff50763000-0x7fff50767000 (16K) prot 0x1 flags 0x22 fdflags 0 st 0x1201 off 0 reg vvar ap shmid: 0
(00.975313) 0x7fff50767000-0x7fff50769000 (8K) prot 0x5 flags 0x22 fdflags 0 st 0x209 off 0 reg vdso ap shmid: 0
(00.975316) 0xffffffffff600000-0xffffffffff601000 (4K) prot 0x4 flags 0x22 fdflags 0 st 0x204 off 0 vsys ap shmid: 0
(00.975320) Obtaining task auvx ...
(00.976075) Dumping path for -3 fd via self 19 [/app/AutoDispatcherServer]
(00.976122) Dumping path for -3 fd via self 19 [/]
(00.976132) Dumping task cwd id 0xcc root id 0xcd
(00.976321) mnt: Dumping mountpoints
(00.976332) mnt: 549: 46:/ @ ./sys/devices/virtual/powercap
(00.984918) mnt: 548: 45:/ @ ./sys/firmware
(00.991427) mnt: 547: 44:/ @ ./proc/scsi
(00.997778) mnt: 546: 3a:/null @ ./proc/timer_list
(00.997794) mnt: 545: 3a:/null @ ./proc/keys
(00.997799) mnt: 544: 3a:/null @ ./proc/kcore
(00.997803) mnt: 543: 43:/ @ ./proc/acpi
(01.003814) mnt: 542: 39:/sysrq-trigger @ ./proc/sysrq-trigger
(01.003828) mnt: 541: 39:/sys @ ./proc/sys
(01.003832) mnt: 540: 39:/irq @ ./proc/irq
(01.003836) mnt: 539: 39:/fs @ ./proc/fs
(01.003839) mnt: 538: 39:/bus @ ./proc/bus
(01.003842) mnt: 666: 10300001:/var/lib/docker/containers/bde2afe8c23fab10c828a4cbd6db9a20fa96f870be7b3c4437f104fd97a7ae41/hosts @ ./etc/hosts
(01.003849) mnt: 665: 10300001:/var/lib/docker/containers/bde2afe8c23fab10c828a4cbd6db9a20fa96f870be7b3c4437f104fd97a7ae41/hostname @ ./etc/hostname
(01.003853) mnt: 663: 10300001:/var/lib/docker/containers/bde2afe8c23fab10c828a4cbd6db9a20fa96f870be7b3c4437f104fd97a7ae41/resolv.conf @ ./etc/resolv.conf
(01.003858) mnt: 661: 41:/ @ ./dev/shm
(01.009696) mnt: 658: 36:/ @ ./dev/mqueue
(01.009793) mnt: 657: 19:/ @ ./sys/fs/cgroup
(01.009799) mnt: 655: 3d:/ @ ./sys
(01.009803) mnt: 652: 3b:/ @ ./dev/pts
(01.009806) mnt: 651: 3a:/ @ ./dev
(01.009809) mnt: Mount is not fully visible ./dev(651)
(01.009864) mnt: mount has children ./dev(651)
(01.017304) mnt: 650: 39:/ @ ./proc
(01.017319) mnt: 649: 10300001:/var/lib/docker/vfs/dir/ea77f4f8b0d6c5e5dd18a3f8397d4568dbbe0e9fc369bf0784803b4605d10406 @ ./
(01.017352) Dumping file-locks
(01.017356) Error (criu/file-lock.c:110): Some file locks are hold by dumping tasks! You can try --file-locks to dump them.
(01.017453) net: Unlock network
(01.017603) Running network-unlock scripts
(01.017609) RPC
(01.040770) Unfreezing tasks into 1
(01.040803) Unseizing 390283 into 1
(01.041144) Error (criu/cr-dump.c:2093): Dumping FAILED.

@adrianreber
Member

Try enabling support for dumping file locks, as described in the error message. With Podman you can do something like podman container checkpoint --file-locks. I am not sure whether this works in Docker.

You could also do echo "file-locks" >> /etc/criu/runc.conf to try to make it work in Docker.
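
Since `>>` appends a new line on every run, a variant that only adds the option once may be convenient. A minimal sketch, assuming it is run as root; the helper name `criu_conf_enable` is hypothetical, and the default path is the one given above:

```shell
# Append a CRIU option to a config file unless that exact line already exists.
# Hypothetical helper: criu_conf_enable <option> [config path]
criu_conf_enable() {
    opt=$1
    conf=${2:-/etc/criu/runc.conf}
    # Create the directory and file if they are missing.
    mkdir -p "$(dirname "$conf")"
    touch "$conf"
    # -x matches whole lines, -F treats the option as a fixed string.
    grep -qxF -- "$opt" "$conf" || printf '%s\n' "$opt" >> "$conf"
}
```

For example, `criu_conf_enable file-locks` leaves a single `file-locks` line in the file no matter how often it is called.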

@adrianreber
Member

@uetaaam Can this be closed?


A friendly reminder that this issue had no activity for 30 days.

@okpreetam

> use a container that does not use inotify

@adrianreber

  1. Are you referring to using other containerization technologies such as LXC or Docker?
  2. Is there a method to disable/remove filesystem notification events from a container or image so that we can create container checkpoints using the overlay storage driver?

Initially, I encountered an issue with inotify while attempting to create a checkpoint:
Error (criu/fsnotify.c:284): fsnotify: Can't dump that handle
Following your advice in the comments, switching the storage driver to vfs resolved this issue. However, this change significantly impacted performance for tasks such as image pull/push and creating/restoring checkpoints.
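
For readers following along, the storage driver switch described above is normally made through the `storage-driver` key in `/etc/docker/daemon.json`. A minimal sketch that only prints the fragment, so that existing settings in that file are not overwritten; the helper name `storage_driver_fragment` is hypothetical:

```shell
# Print a minimal daemon.json fragment selecting the given storage driver.
# Merge it by hand into /etc/docker/daemon.json instead of overwriting the file.
storage_driver_fragment() {
    printf '{\n  "storage-driver": "%s"\n}\n' "$1"
}
```

`storage_driver_fragment vfs` prints the fragment for the vfs driver. After merging it and restarting the Docker daemon, `docker info --format '{{.Driver}}'` should report the active driver. Images and containers created under one storage driver are not visible under another, so they have to be pulled or created again after a switch.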


A friendly reminder that this issue had no activity for 30 days.
