Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Add error message output to respond to the impact of Ubuntu 24.04 #1042

Conversation

younghojan
Copy link
Member

@younghojan younghojan commented May 24, 2024

  • Updated container.py and containerexecutor.py to detect whether BenchExec is being used with container mode enabled on Ubuntu 24.04 and the use of unprivileged user namespaces is restricted in system configuration, output the corresponding error message if so.
  • Updated the relevant documentation.

Part of #1041

@younghojan younghojan linked an issue May 24, 2024 that may be closed by this pull request
Copy link
Member

@PhilippWendler PhilippWendler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A general question:

Does the container mode work at all in Ubuntu 24.04 with the default settings?
In #1041 it fails due to network configuration, does it work with --network-access?

If it does not work at all due to further problems, we don't want to print several warnings just to fail with a different error in the end. Instead, it would be easier for the user to understand if we just detect the situation once the first problem occurs and abort immediately with an appropriate error message that includes the explanation.

If container mode works in principle, but several features are missing, then it would probably be appropriate to print a warning with the explanation and continue. But then we should also handle the problem with the networking.

benchexec/container.py Outdated Show resolved Hide resolved
benchexec/containerexecutor.py Outdated Show resolved Hide resolved
benchexec/containerexecutor.py Outdated Show resolved Hide resolved
benchexec/containerexecutor.py Outdated Show resolved Hide resolved
@PhilippWendler PhilippWendler added the container related to container mode label May 27, 2024
@younghojan
Copy link
Member Author

A general question:

Does the container mode work at all in Ubuntu 24.04 with the default settings? In #1041 it fails due to network configuration, does it work with --network-access?

If it does not work at all due to further problems, we don't want to print several warnings just to fail with a different error in the end. Instead, it would be easier for the user to understand if we just detect the situation once the first problem occurs and abort immediately with an appropriate error message that includes the explanation.

If container mode works in principle, but several features are missing, then it would probably be appropriate to print a warning with the explanation and continue. But then we should also handle the problem with the networking.

After adding --network-access, the function activate_network_interface() will no longer encounter an error. An error will subsequently occur in child()->_setup_container_filesystem()->libc.mount(None, temp_dir, b"tmpfs", 0, tmpfs_opts).

(benchexec-venv) haoranyang@haoranyang-fudan:~/benchexec$ containerexec --network-access --debug /usr/bin/ls2024-05-27 15:36:22 - DEBUG - This is containerexec 3.22-dev.
2024-05-27 15:36:22 - WARNING - The container configuration disables DNS, host lookups will fail despite --network-access. Consider using --keep-system-config.
2024-05-27 15:36:22 - INFO - Starting command /usr/bin/ls
2024-05-27 15:36:22 - DEBUG - Available Cgroups: {}
2024-05-27 15:36:22 - DEBUG - Starting process.
2024-05-27 15:36:22 - DEBUG - Parent: child process of RunExecutor with PID 355923 started.
2024-05-27 15:36:22 - DEBUG - Child: child process of RunExecutor with PID 355923 started
2024-05-27 15:36:22 - WARNING - Changing hostname in container prevented by system configuration, real hostname will leak into the container.
2024-05-27 15:36:22 - CRITICAL - Failed to configure container with operation 'raise OSError(errno, msg)': [Errno 13] mount(None, b'/tmp/BenchExec_run_kzzs3je7', b'tmpfs', 0, b'size=100%') failed: Permission denied
2024-05-27 15:36:22 - DEBUG - Waiting for process /usr/bin/ls with pid 355923
2024-05-27 15:36:22 - DEBUG - Parent: child process of RunExecutor with PID 355923 terminated with return value 128.
2024-05-27 15:36:22 - DEBUG - Process terminated, exit code 0.
2024-05-27 15:36:22 - DEBUG - Cleaning up temporary directory.
2024-05-27 15:36:22 - ERROR - execution in container failed, check log for details
Traceback (most recent call last):
  File "/home/haoranyang/benchexec/benchexec/containerexecutor.py", line 952, in _start_execution_in_container
    grandchild_pid = int(os.read(from_grandchild, 10))
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: invalid literal for int() with base 10: b''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/haoranyang/benchexec/benchexec/containerexecutor.py", line 297, in main
    result = executor.execute_run(
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/haoranyang/benchexec/benchexec/containerexecutor.py", line 466, in execute_run
    pid, result_fn = self._start_execution(
                     ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/haoranyang/benchexec/benchexec/containerexecutor.py", line 535, in _start_execution
    result = self._start_execution_in_container(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/haoranyang/benchexec/benchexec/containerexecutor.py", line 956, in _start_execution_in_container
    check_child_exit_code()
  File "/home/haoranyang/benchexec/benchexec/containerexecutor.py", line 909, in check_child_exit_code
    raise BenchExecException(
benchexec.BenchExecException: execution in container failed, check log for details
Cannot execute /usr/bin/ls: execution in container failed, check log for details.

So, should we let child return CHILD_OSERROR when child()->socket.sethostname() is called and failed?

@PhilippWendler
Copy link
Member

Does the container mode work at all in Ubuntu 24.04 with the default settings? In #1041 it fails due to network configuration, does it work with --network-access?

After adding --network-access, the function activate_network_interface() will no longer encounter an error. An error will subsequently occur in child()->_setup_container_filesystem()->libc.mount(None, temp_dir, b"tmpfs", 0, tmpfs_opts).

Thanks for trying out! So this means that we cannot mount anything in the container, but mounting stuff is a hard requirement for our container mode. So there is no way to make container mode work on Ubuntu 24.04 except by letting the user change the system config.

So, should we let child return CHILD_OSERROR when child()->socket.sethostname() is called and failed?

I am not 100% sure how to best implement this. Let's think about this together. I see the following goals:

  • Detection must work no matter which options the user has, so for example also with --keep-system-config.
  • On other systems (non-Ubuntu) BenchExec should not abort if changing the hostname fails, because it might be able to continue. We can distinguish such cases with the check for the current value of the sysctl setting (what you implemented already).
  • We only want to do something special once we encounter an actual error. The alternative would be to check for the sysctl in the very beginning and simply give up if it has the wrong value, but that is not so robust because it could trigger in unwanted situations, for example if Ubuntu changes something in the future. Thus it is better to always try setting up the container.
  • It is better if there is only one error message for the user with an explanation and the recommendation, not a bunch of technical warnings. But if it is complicated to avoid this, it is not a hard requirement.
  • We can to keep the implementation maintainable without lots of special cases.

Could you help thinking about how to implement this?
I would like if in the end we would raise a BenchExecException with the error message in the main process, like we do for similar cases. However, I see that we first have to do something in the child. Maybe add some other special return code for the child?

@younghojan younghojan force-pushed the 1041-benchexec-not-working-after-upgrading-from-ubuntu-2310-to-2404 branch from ce27511 to 81b5d5b Compare May 27, 2024 11:14
@younghojan
Copy link
Member Author

I have pushed a new version that meets a part of our vision.

Detection must work no matter which options the user has

Having a global variable in container.py might satisfy this:

USER_NS_RESTRICTION = (
    util.try_read_file("/proc/sys/kernel/apparmor_restrict_unprivileged_userns") == "1"
)

We only want to do something special once we encounter an actual error

After catching OSError in child(), check container.USER_NS_RESTRICTION and print a warning before return CHILD_OSERROR. This is quite rough, as I'm not sure if the OSError is necessarily caused by the restriction of unprivileged user namespaces in this situation.

It is better if there is only one error message for the user with an explanation and the recommendation

Now, if sethostname fails, no information regarding the restriction of user-ns will be printed. If we output error messages when sethostname fails without terminating BenchExec, the creation of container goes on. Later when the network setup or the mount tmpfs error happens, the same error messages will be printed again. Therefore, I decided to discard the handling of sethostname failure and only print special info in the event of a critical error.

@PhilippWendler
Copy link
Member

Detection must work no matter which options the user has

Having a global variable in container.py might satisfy this:

USER_NS_RESTRICTION = (
    util.try_read_file("/proc/sys/kernel/apparmor_restrict_unprivileged_userns") == "1"
)

Ah sorry, this is a misunderstanding because I didn't write it clearly.
Whether the check is implemented on module-import time or later on doesn't really matter here, we can keep the standard way. What I wanted to say is that for example we shouldn't implement the check only in the sethostname part of the code, because that is optional.

We only want to do something special once we encounter an actual error

After catching OSError in child(), check container.USER_NS_RESTRICTION and print a warning before return CHILD_OSERROR. This is quite rough, as I'm not sure if the OSError is necessarily caused by the restriction of üunprivileged user namespaces in this situation.

Yes, OSError in general can have many causes. That's why we need to have this additional try_read_file check. We can also check the errno value of the OSError whether it matches our expectation.

It is better if there is only one error message for the user with an explanation and the recommendation

Now, if sethostname fails, no information regarding the restriction of user-ns will be printed. If we output error messages when sethostname fails without terminating BenchExec, the creation of container goes on. Later when the network setup or the mount tmpfs error happens, the same error messages will be printed again. Therefore, I decided to discard the handling of sethostname failure and only print special info in the event of a critical error.

I see, so we get all of these warnings. Could you post an example of a such a failing run of runexec? I wonder whether it is worth the effort to complicate the code by adding more checks to cases like sethostname.

@younghojan
Copy link
Member Author

Whether the check is implemented on module-import time or later on doesn't really matter here, we can keep the standard way.

Oh I see. However, I think it would be cooler to place this check at the import time, similar to how NATIVE_CLONE_CALLBACK_SUPPORTED is checked. Since checking unprivileged user-NS is necessary, doing it this way would not incur any additional overhead.

We can also check the errno value of the OSError whether it matches our expectation.

I think maybe we can do it this way:

                    if container.USER_NS_RESTRICTION and e.errno in [
                        errno.EPERM,  # container.activate_network_interface() failed
                        errno.EACCES,  # _setup_container_filesystem failed
                    ]:
                        logging.warning(
                            "Ubuntu 24.04 restircts unprivileged user namespaces,"
                            " please try 'echo 0 | sudo tee /proc/sys/kernel/"
                            "apparmor_restrict_unprivileged_userns' as a temporary workaround."
                        )

Could you post an example of a such a failing run of runexec?

Yes, I tested runexec with and without network access. This looks all "right", and from the results, it appears to be what we want.

(benchexec-venv) haoranyang@haoranyang-fudan:~/benchexec$ runexec /usr/bin/ls
2024-05-28 09:10:54 - INFO - Starting command /usr/bin/ls
2024-05-28 09:10:54 - INFO - Writing output to output.log
2024-05-28 09:10:54 - WARNING - Changing hostname in container prevented by system configuration, real hostname will leak into the container.
2024-05-28 09:10:54 - CRITICAL - Failed to configure container with operation 'fcntl.ioctl(sock, _SIOCSIFFLAGS, ifreq)': [Errno 1] Operation not permitted
2024-05-28 09:10:54 - WARNING - Ubuntu 24.04 restircts unprivileged user namespaces, please try 'echo 0 | sudo tee /proc/sys/kernel/apparmor_restrict_unprivileged_userns' as a temporary workaround.
2024-05-28 09:10:54 - CRITICAL - Cannot execute '/usr/bin/ls': execution in container failed, check log for details.
terminationreason=failed
(benchexec-venv) haoranyang@haoranyang-fudan:~/benchexec$ runexec --network-access /usr/bin/ls
2024-05-28 09:12:20 - WARNING - The container configuration disables DNS, host lookups will fail despite --network-access. Consider using --keep-system-config.
2024-05-28 09:12:20 - INFO - Starting command /usr/bin/ls
2024-05-28 09:12:20 - INFO - Writing output to output.log
2024-05-28 09:12:20 - WARNING - Changing hostname in container prevented by system configuration, real hostname will leak into the container.
2024-05-28 09:12:20 - CRITICAL - Failed to configure container with operation 'raise OSError(errno, msg)': [Errno 13] mount(None, b'/tmp/BenchExec_run_5hgyaihz', b'tmpfs', 0, b'size=100%') failed: Permission denied
2024-05-28 09:12:20 - WARNING - Ubuntu 24.04 restircts unprivileged user namespaces, please try 'echo 0 | sudo tee /proc/sys/kernel/apparmor_restrict_unprivileged_userns' as a temporary workaround.
2024-05-28 09:12:20 - CRITICAL - Cannot execute '/usr/bin/ls': execution in container failed, check log for details.
terminationreason=failed

@PhilippWendler
Copy link
Member

Whether the check is implemented on module-import time or later on doesn't really matter here, we can keep the standard way.

Oh I see. However, I think it would be cooler to place this check at the import time, similar to how NATIVE_CLONE_CALLBACK_SUPPORTED is checked. Since checking unprivileged user-NS is necessary, doing it this way would not incur any additional overhead.

Well, we do not need to check for it if everythings works. But overhead is not my concern, I am mostly wondering whether this has the potential to break something. For example, we had problems in the past where modules did stuff during import time, and this caused problems for situations where the module was not actually used but still imported, e.g., when using other parts of BenchExec on Windows, or in tests, or by type checkers. The use of try_read_file should make it robust, but as long as there is no real reason to use the global variable, I prefer to not risk it.

We can also check the errno value of the OSError whether it matches our expectation.

I think maybe we can do it this way:

                    if container.USER_NS_RESTRICTION and e.errno in [
                        errno.EPERM,  # container.activate_network_interface() failed
                        errno.EACCES,  # _setup_container_filesystem failed
                    ]:
                        logging.warning(
                            "Ubuntu 24.04 restircts unprivileged user namespaces,"
                            " please try 'echo 0 | sudo tee /proc/sys/kernel/"
                            "apparmor_restrict_unprivileged_userns' as a temporary workaround."
                        )

Like this, yes. But the comments are not correct, because it is not guaranteed that activate_network_interface always produces EPERM and _setup_container_filesystem always produces EACCES. So this can be misleading and it is better to remove the comments.

Could you post an example of a such a failing run of runexec?

Yes, I tested runexec with and without network access. This looks all "right", and from the results, it appears to be what we want.

(benchexec-venv) haoranyang@haoranyang-fudan:~/benchexec$ runexec /usr/bin/ls
2024-05-28 09:10:54 - INFO - Starting command /usr/bin/ls
2024-05-28 09:10:54 - INFO - Writing output to output.log
2024-05-28 09:10:54 - WARNING - Changing hostname in container prevented by system configuration, real hostname will leak into the container.
2024-05-28 09:10:54 - CRITICAL - Failed to configure container with operation 'fcntl.ioctl(sock, _SIOCSIFFLAGS, ifreq)': [Errno 1] Operation not permitted
2024-05-28 09:10:54 - WARNING - Ubuntu 24.04 restircts unprivileged user namespaces, please try 'echo 0 | sudo tee /proc/sys/kernel/apparmor_restrict_unprivileged_userns' as a temporary workaround.
2024-05-28 09:10:54 - CRITICAL - Cannot execute '/usr/bin/ls': execution in container failed, check log for details.
terminationreason=failed

Hm, I find the explanation quite hidden. It is in between other log messages, and it is even a warning where the rest is marked critical. Ideal would be a separate error message, and I thought we have that in similar cases. But I realize now that maybe this was only done for benchexec and not for runexec. Could you please also try with benchexec?

For comparison, consider for example the case if BenchExec cannot create a cgroup because pystemd is missing or so.

@younghojan
Copy link
Member Author

Oh I see. However, I think it would be cooler to place this check at the import time, similar to how NATIVE_CLONE_CALLBACK_SUPPORTED is checked. Since checking unprivileged user-NS is necessary, doing it this way would not incur any additional overhead.

Well, we do not need to check for it if everythings works. But overhead is not my concern, I am mostly wondering whether this has the potential to break something. For example, we had problems in the past where modules did stuff during import time, and this caused problems for situations where the module was not actually used but still imported, e.g., when using other parts of BenchExec on Windows, or in tests, or by type checkers. The use of try_read_file should make it robust, but as long as there is no real reason to use the global variable, I prefer to not risk it.

That's correct. Using such a global variable does indeed carry unpredictable risks. But should we wrap this check into a function and perhaps place it in util.py and place the module-constant error message in container.py?
According to BenchExec's execution process, when using container mode, due to the kernel's restrictions on unprivileged user namespaces, BenchExec will either abort at activate_network_interface() or at _setup_container_filesystem(). Can we temporarily add checks only at these two points? Because even if the restriction on unprivileged user namespaces affects other steps in the container setup, BenchExec's execution flow will not reach those steps and will terminate early.

As I mentioned before, if an error message is printed at sethostname without terminating BenchExec, the same error message will be printed again when activate_network_interface is executed, which is redundant. Perhaps we can adjust the execution flow. Currently, it is sethostname->activate_network_interface->_setup_container_filesystem, which results in two repetitive warnings being printed (one at sethostname and one at activate_network_interface). Can we change it to activate_network_interface->_setup_container_filesystem->sethostname? This way, BenchExec would print the error message and abort at activate_network_interface, preventing duplicate warnings.

Additionally, I noticed that in containerized_tool._init_container, the execution order is _setup_container_filesystem->sethostname->activate_network_interface, which is different from that in ContainerExecutor. The execution order of these three operations should not affect the container setup.

Like this, yes. But the comments are not correct, because it is not guaranteed that activate_network_interface always produces EPERM and _setup_container_filesystem always produces EACCES. So this can be misleading and it is better to remove the comments.

I'll remove them in later commits.

Hm, I find the explanation quite hidden. It is in between other log messages, and it is even a warning where the rest is marked critical. Ideal would be a separate error message, and I thought we have that in similar cases. But I realize now that maybe this was only done for benchexec and not for runexec. Could you please also try with benchexec?

In fact, after installing coloredlogs, the warning/critical messages are color-coded and not that difficult to distinguish. However, these fatal errors (which prevent the container from being created) are actually caused by Ubuntu's restrictions on user namespaces. We could also change the warnings here to critical.

I've tested benchexec, it failed when mounting tmpfs:

(benchexec-venv) haoranyang@haoranyang-fudan:~/benchexec$ benchexec doc/benchmark-example-rand.xml --tasks "XML files" --limitCores 1 --timelimit 10s --numOfThreads 4
2024-05-29 13:51:21 - WARNING - Creating UID mapping into container failed: [Errno 1] Operation not permitted
2024-05-29 13:51:21 - WARNING - Could not write to setgroups file in /proc: [Errno 13] Permission denied: '/proc/4966/setgroups'
2024-05-29 13:51:21 - WARNING - Creating GID mapping into container failed: [Errno 1] Operation not permitted
2024-05-29 13:51:21 - WARNING - Ubuntu 24.04 restircts unprivileged user namespaces, please try 'echo 0 | sudo tee /proc/sys/kernel/apparmor_restrict_unprivileged_userns' as a temporary workaround.
Error: Failed to configure container: [Errno 13] mount(None, b'/tmp/Benchexec_tool_info_container_obb1i6y4', b'tmpfs', 0, b'size=100%') failed: Permission denied

For comparison, consider for example the case if BenchExec cannot create a cgroup because pystemd is missing or so.

I tested runexec without pystemd:

(benchexec-venv) haoranyang@haoranyang-fudan:~/benchexec$ runexec /usr/bin/ls
2024-05-29 14:01:10 - WARNING - Cgroup subsystem cpu is not available. Please make sure it is supported by your kernel and available.
2024-05-29 14:01:10 - WARNING - Cannot measure CPU time without cpuacct cgroup.
2024-05-29 14:01:10 - WARNING - Cgroup subsystem freeze is not available. Please make sure it is supported by your kernel and available.
2024-05-29 14:01:10 - WARNING - Cgroup subsystem memory is not available. Please make sure it is supported by your kernel and available.
2024-05-29 14:01:10 - WARNING - Cannot measure memory consumption without memory cgroup.
2024-05-29 14:01:10 - INFO - Starting command /usr/bin/ls
2024-05-29 14:01:10 - INFO - Writing output to output.log
2024-05-29 14:01:10 - WARNING - Changing hostname in container prevented by system configuration, real hostname will leak into the container.
2024-05-29 14:01:10 - CRITICAL - Failed to configure container with operation 'fcntl.ioctl(sock, _SIOCSIFFLAGS, ifreq)': [Errno 1] Operation not permitted
2024-05-29 14:01:10 - WARNING - Ubuntu 24.04 restircts unprivileged user namespaces, please try 'echo 0 | sudo tee /proc/sys/kernel/apparmor_restrict_unprivileged_userns' as a temporary workaround.
2024-05-29 14:01:10 - CRITICAL - Cannot execute '/usr/bin/ls': execution in container failed, check log for details.
terminationreason=failed

When handling the missing pystemd case, the warning messages are somewhat repetitive, such as Please make sure it is supported by your kernel and available.

@PhilippWendler
Copy link
Member

Oh I see. However, I think it would be cooler to place this check at the import time, similar to how NATIVE_CLONE_CALLBACK_SUPPORTED is checked. Since checking unprivileged user-NS is necessary, doing it this way would not incur any additional overhead.

Well, we do not need to check for it if everythings works. But overhead is not my concern, I am mostly wondering whether this has the potential to break something. For example, we had problems in the past where modules did stuff during import time, and this caused problems for situations where the module was not actually used but still imported, e.g., when using other parts of BenchExec on Windows, or in tests, or by type checkers. The use of try_read_file should make it robust, but as long as there is no real reason to use the global variable, I prefer to not risk it.

That's correct. Using such a global variable does indeed carry unpredictable risks. But should we wrap this check into a function and perhaps place it in util.py and place the module-constant error message in container.py?

util.py is for stuff that is in general useful, i.e., from more than one module. But this check is an implementation detail of container mode, so keep it there.

According to BenchExec's execution process, when using container mode, due to the kernel's restrictions on unprivileged user namespaces, BenchExec will either abort at activate_network_interface() or at _setup_container_filesystem(). Can we temporarily add checks only at these two points? Because even if the restriction on unprivileged user namespaces affects other steps in the container setup, BenchExec's execution flow will not reach those steps and will terminate early.

You mean such that it will typically abort early without lots of confusing error messages because of the check in activate_network_interface but in case this is disabled by the user, there is always the fallback check in _setup_container_filesystem as a last guard? Sounds fine.

But why can't we handle this more generally, such that if there is some permission error and the problematic sysctl is set, then we terminate BenchExec and show a final error message. For example, if cgroups are unusable but necessary we get something like this:

2024-06-03 11:56:37,462 - WARNING - Cgroup subsystem cpu is not available. Please make sure it is supported by your kernel and available.
2024-06-03 11:56:37,462 - WARNING - Cannot measure CPU time without cpuacct cgroup.
2024-06-03 11:56:37,462 - WARNING - Cgroup subsystem freeze is not available. Please make sure it is supported by your kernel and available.
2024-06-03 11:56:37,462 - ERROR - Cannot reliably kill sub-processes without freezer cgroup or container mode. Please enable at least one of them.
2024-06-03 11:56:37,462 - WARNING - Cgroup subsystem memory is not available. Please make sure it is supported by your kernel and available.
2024-06-03 11:56:37,462 - WARNING - Cannot measure memory consumption without memory cgroup.

System is using cgroups v2 but not systemd.
If you are using BenchExec within a container, please ensure that cgroups are properly delegated into the container.
Otherwise please configure your system such that BenchExec can use cgroups.

This is something that would be nice to have here as well.

As I mentioned before, if an error message is printed at sethostname without terminating BenchExec, the same error message will be printed again when activate_network_interface is executed, which is redundant. Perhaps we can adjust the execution flow. Currently, it is sethostname->activate_network_interface->_setup_container_filesystem, which results in two repetitive warnings being printed (one at sethostname and one at activate_network_interface). Can we change it to activate_network_interface->_setup_container_filesystem->sethostname? This way, BenchExec would print the error message and abort at activate_network_interface, preventing duplicate warnings.

Hm, probably yes, but I would like to not mess with the container details just for the sake of getting some nice message. It is easy to imagine that sometime later someone brings the setup steps into a more logical and consistent order and thus breaks the message again, if it relies on tricky ordering details.

Hm, I find the explanation quite hidden. It is in between other log messages, and it is even a warning where the rest is marked critical. Ideal would be a separate error message, and I thought we have that in similar cases. But I realize now that maybe this was only done for benchexec and not for runexec. Could you please also try with benchexec?

In fact, after installing coloredlogs, the warning/critical messages are color-coded and not that difficult to distinguish. However, these fatal errors (which prevent the container from being created) are actually caused by Ubuntu's restrictions on user namespaces. We could also change the warnings here to critical.

Warning is for cases where BenchExec can continue, critical is for cases where BenchExec can not continue. If we cannot change the hostname, we can still continue, so it is a warning.

(benchexec-venv) haoranyang@haoranyang-fudan:~/benchexec$ runexec /usr/bin/ls
2024-05-29 14:01:10 - WARNING - Cgroup subsystem cpu is not available. Please make sure it is supported by your kernel and available.
2024-05-29 14:01:10 - WARNING - Cannot measure CPU time without cpuacct cgroup.
2024-05-29 14:01:10 - WARNING - Cgroup subsystem freeze is not available. Please make sure it is supported by your kernel and available.
2024-05-29 14:01:10 - WARNING - Cgroup subsystem memory is not available. Please make sure it is supported by your kernel and available.
2024-05-29 14:01:10 - WARNING - Cannot measure memory consumption without memory cgroup.
2024-05-29 14:01:10 - INFO - Starting command /usr/bin/ls
2024-05-29 14:01:10 - INFO - Writing output to output.log
2024-05-29 14:01:10 - WARNING - Changing hostname in container prevented by system configuration, real hostname will leak into the container.
2024-05-29 14:01:10 - CRITICAL - Failed to configure container with operation 'fcntl.ioctl(sock, _SIOCSIFFLAGS, ifreq)': [Errno 1] Operation not permitted
2024-05-29 14:01:10 - WARNING - Ubuntu 24.04 restircts unprivileged user namespaces, please try 'echo 0 | sudo tee /proc/sys/kernel/apparmor_restrict_unprivileged_userns' as a temporary workaround.
2024-05-29 14:01:10 - CRITICAL - Cannot execute '/usr/bin/ls': execution in container failed, check log for details.
terminationreason=failed

When handling the missing pystemd case, the warning messages are somewhat repetitive, such as Please make sure it is supported by your kernel and available.

Yes, because several different cgroup subsystems are missing, and they are independent from each other (it can happen that we have one of them but not the others). Furthermore, if cgroups are actually required and not just optional, there will be a final error message with an explanation what to do. So this is not perfect but good enough.

@younghojan
Copy link
Member Author

But why can't we handle this more generally, such that if there is some permission error and the problematic sysctl is set, then we terminate BenchExec and show a final error message.

In previous discussions, you have also mentioned this idea. However, implementing it seems to require quite a few changes, including adding other special return codes for the child.

Errors caused by the restriction of user-ns are caught as exceptions within the child() function, followed by printing the error message, and the child() function returns the error code CHILD_OSERROR. In the parent function, this error code is parsed by the check_child_exit_code() function, which throws a BenchExecException based on the type of error code. This BenchExecException is caught in containerexecutor.main, which then calls sys.exit() to terminate the entire containerexec program.

The problem here is evident: we cannot delay the output of the error message until containerexecutor.main because the errno of OSError has already been lost. Furthermore, in parent->check_child_exit_code(), it is assumed that all error messages have been recorded within the child, making it impossible to determine outside of the child whether the containerexec error was caused by the problematic sysctl set.

# containerexecutor.py > ContainerExecutor > _start_execution_in_container > check_chile_exit_code
# Line 914 - 919

                        if child_exitcode.value == CHILD_OSERROR:
                            # This was an OSError in the child,
                            # details were already logged
                            raise BenchExecException(
                                "execution in container failed, check log for details"
                            )

I am now feeling a bit confused. Should we continue to implement the same way as in the previous commits, where we only check if the problematic sysctl is set and print an error message when errors occur in activate_network_interface and _setup_container_filesystem? Or should we consider other implementation methods to achieve the desired effect with minimal changes to the code structure?

@PhilippWendler
Copy link
Member

I see. Thanks for digging through this and the summary! Might be somewhat awkward in the code to handle this specific case so differently. So now I tend to say we can proceed as in the previous commits.

Copy link
Member

@PhilippWendler PhilippWendler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Would you be willing to do some polishing?

benchexec/container.py Outdated Show resolved Hide resolved
benchexec/containerexecutor.py Show resolved Hide resolved
doc/container.md Outdated Show resolved Hide resolved
doc/container.md Outdated Show resolved Hide resolved
benchexec/container.py Outdated Show resolved Hide resolved
doc/container.md Outdated Show resolved Hide resolved
benchexec/containerized_tool.py Outdated Show resolved Hide resolved
doc/INSTALL.md Outdated Show resolved Hide resolved
@PhilippWendler PhilippWendler force-pushed the 1041-benchexec-not-working-after-upgrading-from-ubuntu-2310-to-2404 branch from fb3ef0b to fb6a54a Compare June 13, 2024 11:28
@PhilippWendler
Copy link
Member

I squashed the commits of this PR. Usually we do not do this, but in this case the back-and-forth just clutters the history and does not provide a benefit. I also added some slight refactoring and will now merge this PR. Thanks!

@PhilippWendler PhilippWendler merged commit 765e3ee into main Jun 13, 2024
15 checks passed
@PhilippWendler PhilippWendler deleted the 1041-benchexec-not-working-after-upgrading-from-ubuntu-2310-to-2404 branch June 13, 2024 11:44
@younghojan
Copy link
Member Author

Very honored that these codes has been merged into the main branch!

PhilippWendler added a commit that referenced this pull request Jun 13, 2024
On Ubuntu since 24.04, user namespaces are forbidden for regular users
(cf. #1041 and #1042).
There is a global sysctl switch to enable them again,
but applications whose AppArmor profile allows this can also use it.
(Typically, AppArmor only restricts application,
but in this case an AppArmor profile can actually provide a privilege
than an unconfined application does not have.)
More explanations are at
https://ubuntu.com/blog/ubuntu-23-10-restricted-unprivileged-user-namespaces

In order to make BenchExec usable out-of-the-box after installing
the .deb package we want to ship such an AppArmor profile.
This is made complicated by the fact that the AppArmor profile
that is necessary on Ubuntu 24.04+
breaks AppArmor on previous Ubuntu versions.
So we have to install this profile conditionally.
I found a way to do so using ucf (a tool for handling config files)
and this seems to work in my tests on Ubuntu 22.04 (old AppArmor),
Ubuntu 24.04 (new AppArmor), and Debian 12 (old AppArmor),
as well as installation without AppArmor present.

There are two known remaining problems:
- If one upgrades from Ubuntu 22.04 to Ubuntu 24.04 while having
  BenchExec installed, the AppArmor profile will not be installed,
  so BenchExec will not work.
  Upgrading or reinstalling the BenchExec package makes it work.
- The command "python3 -m benchexec.test_tool_info" will not work,
  because the AppArmor profile won't match it.
  One has to either disable container mode or temporarily allow
  the use of user namespaces for the whole system.
  If we implement #1053 this would just work.

Part of #1041.
EshaanAgg pushed a commit to EshaanAgg/benchexec that referenced this pull request Jun 28, 2024
On Ubuntu since 24.04, user namespaces are forbidden for regular users
(cf. sosy-lab#1041 and sosy-lab#1042).
There is a global sysctl switch to enable them again,
but applications whose AppArmor profile allows this can also use it.
(Typically, AppArmor only restricts application,
but in this case an AppArmor profile can actually provide a privilege
than an unconfined application does not have.)
More explanations are at
https://ubuntu.com/blog/ubuntu-23-10-restricted-unprivileged-user-namespaces

In order to make BenchExec usable out-of-the-box after installing
the .deb package we want to ship such an AppArmor profile.
This is made complicated by the fact that the AppArmor profile
that is necessary on Ubuntu 24.04+
breaks AppArmor on previous Ubuntu versions.
So we have to install this profile conditionally.
I found a way to do so using ucf (a tool for handling config files)
and this seems to work in my tests on Ubuntu 22.04 (old AppArmor),
Ubuntu 24.04 (new AppArmor), and Debian 12 (old AppArmor),
as well as installation without AppArmor present.

There are two known remaining problems:
- If one upgrades from Ubuntu 22.04 to Ubuntu 24.04 while having
  BenchExec installed, the AppArmor profile will not be installed,
  so BenchExec will not work.
  Upgrading or reinstalling the BenchExec package makes it work.
- The command "python3 -m benchexec.test_tool_info" will not work,
  because the AppArmor profile won't match it.
  One has to either disable container mode or temporarily allow
  the use of user namespaces for the whole system.
  If we implement sosy-lab#1053 this would just work.

Part of sosy-lab#1041.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
container related to container mode
Development

Successfully merging this pull request may close these issues.

BenchExec not working after upgrading from Ubuntu 23.10 to 24.04
2 participants