cannot checkpoint a vnc server container #2459

Closed
coldbloodx opened this issue Aug 1, 2024 · 6 comments

@coldbloodx

Description
A container running a VNC server cannot be checkpointed.

Steps to reproduce the issue:

  1. Create a VNC server image from the ubuntu 24.04 base image, installing only the packages below:
    apt install -y xfce4 xfce4-session xfce4-terminal tightvncserver xauth
  2. Initialize the VNC server files (password and xstartup script). Here is my xstartup script:
[root@laworks .vnc]# cat xstartup
#!/bin/sh

xrdb "$HOME/.Xresources"
xsetroot -solid grey
x-terminal-emulator -geometry 80x24+10+10 -ls -title "$VNCDESKTOP Desktop" &
x-window-manager &
# Fix to make GNOME work
export XKL_XMODMAP_DISABLE=1
/etc/X11/Xsession

  3. Run the container with the docker command below:

[root@laworks ~]# docker run -d -v /root:/root -e USER=root --net=host --name cstest ubuntuvnc:v1 bash -c 'vncserver; tail -f /dev/null'
[root@laworks ~]# docker ps
CONTAINER ID   IMAGE          COMMAND                  CREATED       STATUS       PORTS     NAMES
3c0043e01e67   ubuntuvnc:v1   "bash -c 'vncserver;…"   3 hours ago   Up 3 hours             cstest

  4. Connect to the vncserver with vncviewer, open a terminal in the viewer, then run the command below:
    [image]

Describe the results you received:
Checkpointing the vncserver container above fails with the error below:

[root@laworks ~]# docker checkpoint create cstest cp001
Error response from daemon: Cannot checkpoint container cstest: runc did not terminate successfully: exit status 1: criu failed: type NOTIFY errno 0 path= /run/containerd/io.containerd.runtime.v2.task/moby/3c0043e01e6729d2a9745d26bbc2a3baa3faddb4c169cb217d777dc24fe868d0/criu-dump.log: unknown

The relevant part of criu-dump.log:

(00.392350) sockets: Searching for socket 0x79317 family 1
(00.392376) sockets: No filter for socket
(00.392378) unix: Dumping unix socket at 6
(00.392379) unix:       Dumping: ino 496407 peer_ino 0 family    1 type    1 state 10 name /root/.cache/ibus/dbus-mHfDLJSx
(00.392384) unix:       Dumped: id 0x29 ino 496407 peer 0 type 1 state 10 name 32 bytes
(00.392394) 77977 fdinfo 7: pos:                0 flags:                2/0x1
(00.392402) Error (criu/files-ext.c:94): Can't dump file 7 of that type [600] (anon anon_inode:[pidfd])   ---> this line.
(00.392410) ----------------------------------------
(00.392424) Error (criu/cr-dump.c:1674): Dump files (pid: 77977) failed with -1
(00.392434) Waiting for 77977 to trap
(00.392453) Daemon 77977 exited trapping
(00.392459) Sent msg to daemon 3 0 0
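
When a dump fails, the decisive lines in criu-dump.log are the ones tagged `Error`. The following is a minimal sketch for pulling them out of a log. The function name `criu_errors` is hypothetical, and the line pattern is inferred from the excerpt above rather than taken from CRIU documentation.

```python
import re

# Pattern inferred from the excerpt above, e.g.
# "(00.392402) Error (criu/files-ext.c:94): Can't dump file 7 ..."
# This is an assumption about the log layout, not a documented format.
ERROR_LINE = re.compile(
    r"^\((?P<ts>\d+\.\d+)\)\s+Error \((?P<src>[^)]+)\):\s*(?P<msg>.*)$"
)


def criu_errors(log_text):
    """Return (timestamp, source location, message) for each Error line."""
    found = []
    for line in log_text.splitlines():
        match = ERROR_LINE.match(line.strip())
        if match:
            found.append((match["ts"], match["src"], match["msg"]))
    return found
```

Applied to the excerpt above, this should return two entries: the `criu/files-ext.c:94` line about the pidfd, and the `criu/cr-dump.c:1674` line reporting that dumping the files of pid 77977 failed.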

Full logs are attached to this ticket.

Describe the results you expected:
The container should be checkpointed successfully.

Additional information you deem important (e.g. issue happens only occasionally):
Below is the process information for my vncserver container.
I have tried this several times, and it never succeeded with docker.

[root@laworks ~]# ps -ewf |grep shim
root       77840       1  0 11:53 ?        00:00:01 /usr/bin/containerd-shim-runc-v2 -namespace moby -id 3c0043e01e6729d2a9745d26bbc2a3baa3faddb4c169cb217d777dc24fe868d0 -address /run/containerd/containerd.sock
root       97514   64766  0 14:34 pts/2    00:00:00 grep --color=auto shim
[root@laworks ~]# pstree -cups 77840
systemd(1)───containerd-shim(77840)─┬─tail(77860)─┬─Xtightvnc(77884)
                                    │             ├─ibus-daemon(77977)─┬─ibus-dconf(77981)─┬─{ibus-dconf}(77986)
                                    │             │                    │                   ├─{ibus-dconf}(77987)
                                    │             │                    │                   ├─{ibus-dconf}(77989)
                                    │             │                    │                   └─{ibus-dconf}(77991)
                                    │             │                    ├─ibus-engine-sim(78023)─┬─{ibus-engine-sim}(78024)
                                    │             │                    │                        ├─{ibus-engine-sim}(78025)
                                    │             │                    │                        └─{ibus-engine-sim}(78026)
                                    │             │                    ├─ibus-extension-(77983)─┬─{ibus-extension-}(77997)
                                    │             │                    │                        ├─{ibus-extension-}(78001)
                                    │             │                    │                        ├─{ibus-extension-}(78004)
                                    │             │                    │                        └─{ibus-extension-}(78013)
                                    │             │                    ├─ibus-ui-gtk3(77982)─┬─{ibus-ui-gtk3}(77998)
                                    │             │                    │                     ├─{ibus-ui-gtk3}(78000)
                                    │             │                    │                     ├─{ibus-ui-gtk3}(78006)
                                    │             │                    │                     ├─{ibus-ui-gtk3}(78011)
                                    │             │                    │                     └─{ibus-ui-gtk3}(78015)
                                    │             │                    ├─{ibus-daemon}(77978)
                                    │             │                    ├─{ibus-daemon}(77979)
                                    │             │                    └─{ibus-daemon}(77990)
                                    │             ├─ibus-x11(77985)─┬─{ibus-x11}(77996)
                                    │             │                 ├─{ibus-x11}(77999)
                                    │             │                 └─{ibus-x11}(78005)
                                    │             ├─x-window-manage(77894)
                                    │             ├─xfce4-terminal(77893)─┬─bash(78020)───sleep(97530)
                                    │             │                       ├─{xfce4-terminal}(77914)
                                    │             │                       ├─{xfce4-terminal}(77915)
                                    │             │                       ├─{xfce4-terminal}(77944)
                                    │             │                       └─{xfce4-terminal}(78019)
                                    │             └─xstartup(77890)
                                    ├─{containerd-shim}(77841)
                                    ├─{containerd-shim}(77842)
                                    ├─{containerd-shim}(77843)
                                    ├─{containerd-shim}(77844)
                                    ├─{containerd-shim}(77845)
                                    ├─{containerd-shim}(77846)
                                    ├─{containerd-shim}(77847)
                                    ├─{containerd-shim}(77848)
                                    ├─{containerd-shim}(77866)
                                    ├─{containerd-shim}(78098)
                                    └─{containerd-shim}(81087)
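
To find out which processes in a tree like this hold a pidfd, one option is to look at the link targets under `/proc/<pid>/fd`. The sketch below assumes the target string `anon_inode:[pidfd]`, which is what the criu-dump.log excerpt reports for the failing descriptor. The function name `find_pidfds` and the `proc_root` parameter are hypothetical and exist only for illustration and testing.

```python
import os

# Link target reported for the failing descriptor in the criu-dump.log excerpt.
PIDFD_LINK = "anon_inode:[pidfd]"


def find_pidfds(pid, proc_root="/proc"):
    """Return the sorted fd numbers of `pid` whose /proc link names a pidfd."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    pidfds = []
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the descriptor was closed while we were scanning
        if target == PIDFD_LINK:
            pidfds.append(int(name))
    return sorted(pidfds)
```

Run as root on the host, `find_pidfds(77977)` would be expected to include descriptor 7 of ibus-daemon, since that is the descriptor the log complains about.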

CRIU logs and information:

CRIU full dump/restore logs:

[criu-dump.log](https://github.com/user-attachments/files/16451457/criu-dump.log)

Output of `criu --version`:

[root@laworks ~]# criu --version
Version: 3.19

Output of `criu check --all`:

[root@laworks ~]# criu check --all
Warn  (criu/cr-check.c:1346): Nftables based locking requires libnftables and set concatenations support
Looks good but some kernel features are missing
which, depending on your process tree, may cause
dump or restore failure.

Additional environment details:

@adrianreber
Member

This sounds like it might be solved with #2449.

@coldbloodx
Author

coldbloodx commented Aug 1, 2024

> This sounds like it might be solved with #2449.

I checked #2449; it is still in draft state. When will the fix be merged? I'd like to try it immediately.

@adrianreber
Member

> > This sounds like it might be solved with #2449.
>
> checked #2449, it is still in draft state when will the fix be merged? I'd like to try it immediately.

This is almost impossible to predict. But you can already try it now: build CRIU yourself with the patches from #2449 applied and see if it actually solves your problem.

@coldbloodx
Author

@adrianreber
Thanks bro, I'll try it later

bsach64 added a commit to bsach64/criu that referenced this issue Aug 3, 2024
bsach64 added a commit to bsach64/criu that referenced this issue Aug 8, 2024
bsach64 added a commit to bsach64/criu that referenced this issue Aug 11, 2024
Process file descriptors (pidfds) were introduced to provide a stable
handle on a process. They solve the problem of pid recycling.

For a detailed explanation, see https://lwn.net/Articles/801319/ and
http://www.corsix.org/content/what-is-a-pidfd

Before Linux 6.9, anonymous inodes were used for the implementation of
pidfds. So, we detect them in a fashion similar to other fd types that
use anonymous inodes by calling `readlink()`.
With Linux 6.9, pidfs (a file system for pidfds) was introduced.
After this change, pidfs inodes have no file type in st_mode in
userspace.
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/pidfs.c?h=v6.11-rc2#n285)
We use `PID_FS_MAGIC` to detect pidfds for kernel >= 6.9.

For pidfds that refer to dead processes, we lose the pid of the process
as the Pid and NSpid fields in /proc/<pid>/fdinfo/<pidfd> change to -1.
So, we create a temporary process for each unique inode and open pidfds
that refer to this process. After all pidfds have been opened we kill
this temporary process.

Fixes: checkpoint-restore#2258 checkpoint-restore#2459

Signed-off-by: Bhavik Sachdev <[email protected]>
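
The dead-process case in the commit message above can be illustrated with a small sketch that reads the `Pid` field from the text of `/proc/<pid>/fdinfo/<pidfd>`. This is Python for illustration only and is not CRIU's C implementation. The function names are hypothetical, and apart from the `Pid` and `NSpid` field names mentioned in the commit message, the fdinfo layout used in the example is an assumption.

```python
def pidfd_target(fdinfo_text):
    """Return the Pid value from a pidfd's fdinfo text, or None if absent."""
    for line in fdinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "Pid":
            return int(value.strip())
    return None


def refers_to_dead_process(fdinfo_text):
    """True when the Pid field reads -1, i.e. the target pid has been lost."""
    return pidfd_target(fdinfo_text) == -1


# Hypothetical fdinfo contents; only the Pid/NSpid names come from the text.
live = "pos:\t0\nflags:\t02000002\nPid:\t77977\nNSpid:\t77977\n"
dead = "pos:\t0\nflags:\t02000002\nPid:\t-1\nNSpid:\t-1\n"
```

For the `dead` sample the original pid cannot be recovered from fdinfo, which is why the commit message describes creating a temporary process per unique inode at restore time.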
bsach64 added a commit to bsach64/criu that referenced this issue Aug 12, 2024
bsach64 added a commit to bsach64/criu that referenced this issue Aug 12, 2024
bsach64 added a commit to bsach64/criu that referenced this issue Aug 14, 2024
bsach64 added a commit to bsach64/criu that referenced this issue Aug 16, 2024
Process file descriptors (pidfds) were introduced to provide a stable
handle on a process. They solve the problem of pid recycling.

For a detailed explanation, see https://lwn.net/Articles/801319/ and
http://www.corsix.org/content/what-is-a-pidfd

Before Linux 6.9, anonymous inodes were used for the implementation of
pidfds. So, we detect them in a fashion similar to other fd types that
use anonymous inodes by calling `readlink()`.
With Linux 6.9, pidfs (a file system for pidfds) was introduced.
In 6.9, `S_ISREG()` returned true for pidfds, but this changed again with
6.10.
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/pidfs.c?h=v6.11-rc2#n285)
After this change, pidfs inodes have no file type in st_mode in
userspace.
We use `PID_FS_MAGIC` to detect pidfds for kernel >= 6.9.
Hence, the check for pidfds occurs before the check for regular files.

For pidfds that refer to dead processes, we lose the pid of the process
as the Pid and NSpid fields in /proc/<pid>/fdinfo/<pidfd> change to -1.
So, we create a temporary process for each unique inode and open pidfds
that refer to this process. After all pidfds have been opened we kill
this temporary process.

Fixes: checkpoint-restore#2258 checkpoint-restore#2459

Signed-off-by: Bhavik Sachdev <[email protected]>
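
The ordering argument in the commit message above can be sketched as follows: because a pidfd looked like a regular file to `S_ISREG()` on 6.9, the filesystem-magic test has to come first. This is a Python illustration, not CRIU's C code. The function name `classify_fd` is hypothetical, and the numeric value given for `PID_FS_MAGIC` is my recollection of `linux/magic.h` and should be verified against your kernel headers.

```python
import stat

# Assumed value of PID_FS_MAGIC from linux/magic.h; verify before relying on it.
PID_FS_MAGIC = 0x50494446


def classify_fd(fs_magic, st_mode):
    """Classify a descriptor from its filesystem magic and its st_mode.

    `fs_magic` stands for the f_type that fstatfs(2) would report and
    `st_mode` for the mode that fstat(2) would report.
    """
    # Check for pidfs first: on 6.9 a pidfd also satisfied S_ISREG().
    if fs_magic == PID_FS_MAGIC:
        return "pidfd"
    if stat.S_ISREG(st_mode):
        return "regular"
    return "other"
```

With the checks in the opposite order, a descriptor with `fs_magic == PID_FS_MAGIC` and a regular-file mode would be classified as a regular file, which is the misdetection the ordering avoids.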
bsach64 added a commit to bsach64/criu that referenced this issue Aug 20, 2024
bsach64 added a commit to bsach64/criu that referenced this issue Aug 20, 2024
Process file descriptors (pidfds) were introduced to provide a stable
handle on a process. They solve the problem of pid recycling.

For a detailed explanation, see https://lwn.net/Articles/801319/ and
http://www.corsix.org/content/what-is-a-pidfd

Before Linux 6.9, anonymous inodes were used for the implementation of
pidfds. So, we detect them in a fashion similar to other fd types that
use anonymous inodes by calling `readlink()`.
With Linux 6.9, pidfs (a file system for pidfds) was introduced.
In 6.9, `S_ISREG()` returned true for pidfds, but this changed again with
6.10.
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/pidfs.c?h=v6.11-rc2#n285)
After this change, pidfs inodes have no file type in st_mode in
userspace.
We use `PID_FS_MAGIC` to detect pidfds for kernel >= 6.9.
Hence, the check for pidfds occurs before the check for regular files.

For pidfds that refer to dead processes, we lose the pid of the process
as the Pid and NSpid fields in /proc/<pid>/fdinfo/<pidfd> change to -1.
So, we create a temporary process for each unique inode and open pidfds
that refer to this process. After all pidfds have been opened we kill
this temporary process.

This commit does not include support for pidfds that point to a specific
thread, i.e., pidfds opened with the `PIDFD_THREAD` flag.

Fixes: checkpoint-restore#2258 checkpoint-restore#2459

Signed-off-by: Bhavik Sachdev <[email protected]>
bsach64 added a commit to bsach64/criu that referenced this issue Aug 28, 2024

github-actions bot commented Sep 1, 2024

A friendly reminder that this issue had no activity for 30 days.

@coldbloodx
Author

I have since successfully dumped and restored GUI applications on a remote VNC icewm desktop.

bsach64 added a commit to bsach64/criu that referenced this issue Sep 15, 2024
bsach64 added a commit to bsach64/criu that referenced this issue Sep 18, 2024