Running inside an interactive container #1048
@Po-Chun-Chien You were recently interested in this use case as well, right? How did you solve it? Would you mind sharing information and looking at the above? Would be interested to hear what you think about this as our artifact expert. :-) I will look at this next week.
My use case was a bit different -- I did not need an interactive container.
@Po-Chun-Chien Ok, I thought you were trying to get BenchExec running as a subprocess of some script inside the container, and this is the same case as an interactive container. Thanks for reading it anyway!
For cgroups v1, we provide a systemd service (either for manual installation or as part of our Debian package) that configures a cgroup with a specific name ( I think we will definitely either add this logic or a parameter such that this tutorial can be simplified. If we go with the parameter, I envision something like
I am torn on this. "Automagical" things are nice until they aren't. But I think there is little to no harm in creating a cgroup if it doesn't exist. I think in either case, you should have a parameter -- either it is required or it can change the default behavior.
I would maybe rather try But then again, if for some obscure reason the "hardcoded"
Ok, I'll implement what I proposed.
I will implement it once we have a concrete feature request.
Inside that Actually, there is also a stronger argument: I always found it important that BenchExec "plays nicely" with cgroups and does not disturb other uses and tools of cgroups (for example, following rules like this). A part of this is that subprocesses should only modify the cgroup hierarchy below the cgroup they are in, i.e., not move processes to some other unrelated part of the hierarchy, in order to not escape limits and measurements placed on their original cgroup. So far we always follow this, with two exceptions:
So to follow this principle, we should not blindly assume that And in the case discussed in this issue it is trivial to provide
If neither the current cgroup nor
For cgroups v1 we already use /system.slice/benchexec-cgroup.service as a fallback cgroup if it exists and we cannot use our current cgroup. For cgroups v2 we do not yet have such a fallback. However, implementing this makes it much easier for people on systems without systemd (e.g., in a container) to set up and configure cgroups for BenchExec, so let's do this. Background discussion is in #1048, from which the suggested commands in the documentation are also taken. Thanks to @incaseoftrouble for reviewing the design of this.
Done in 29480b3. Would you mind testing this and updating the above container tutorial?
Thanks a lot! I will integrate this into the tutorial once it is ready. Just to be clear: The proposal is to try to configure the EDIT: Ah! So the proposal is to "just" use the cgroup if it exists. Let me try it.
The case I am imagining is that the benchexec group already exists and is in use for something else (for whatever reason). Then I could just create the cgroup
Indeed, for the reason explained above. If we ever add a parameter
Two observations:
Thanks, the crash is fixed. Cleanup of created cgroups is too tricky as explained before and I won't attempt this.
First, cgroup cleanup is not of the highest priority. Cleaning up the cgroups from the outside is not too difficult and can be scripted. Let's move on with the tutorial. I adapted the above text. A few words on the cleanup: I could imagine this to work reasonably in that concrete case (being given a fresh cgroup to move to). Effectively, it should "only" be a reversal of all actions. I see two options:
I can have a go at this if you want (I like option 2)! (Just no promises when, but it seems like a train-ride activity ;) ) If you want to, just mention it, then I will create an issue that gathers relevant points. Reason: I am worried about runexec being used for a few hundred calls, which would create several hundred cgroups, and I am not sure about the overhead. Also, I do believe that whatever the process creates it should also destroy.
This is tricky, because we need to reliably know when we should do the move back. If runexec is used from within a Python program we cannot even know this. And then there are lots of failure cases like the original cgroup being no longer available etc. I don't want to risk this.
This won't work because the parent process cannot stay in its original cgroup, because then we cannot create the new cgroup. This is the whole reason we move the BenchExec process itself in the first place. (The new automagic fallback is the only situation where this could work, but I want to keep it as similar to the standard case.)
If this turns out to be a problem, we may reconsider.
Sure, but this is nice-to-have and the added complexity and failure potential do not seem worth it.
Fair points, cgroups are complicated :)
One idea for that case: Inside the cgroup that benchexec "owns" (say
I adapted the draft again; I think it is now in good shape. Should this be incorporated into #1044 or a separate MR? If so, where should it go? I think you mentioned that this should probably not just be part of the quickstart but in the general container setup, right?
Great, thank you very much! I personally would not have added that much cgroups background info, but I guess it might be nice to have for some readers. My biggest suggestion would be to skip the installation part of BenchExec in the Dockerfile. Some might prefer pip, some wget. Hardcoding a version number in the example will inevitably lead people to copy-and-paste this without thinking and use an old version in the future. So how about something like Apart from this I would probably only have minor wording stuff, which would be easier to handle in a PR. Can be separate or in #1044, I don't have a strong opinion here. Feel free to use #1044 for simplicity, especially because I think you want to link the new docs from the tutorial. As to where to put it: Yes, I think it would be too hidden in the tutorial, because it is also relevant for people with knowledge about BenchExec. I am wondering whether we should have a new Markdown file grouping everything for running BenchExec in a container, and also move the descriptions from
Do you mean the "background" in the beginning or "technical details"? The former I am also unsure about. I personally like it when there is some understandable reason why things have to be "complicated", something along the lines of "this is the issue, trust me, if you want to know more, here are some links", so that I can understand why I am guided to do certain things. But I think this could be cut down. I will think about it. The technical details I want to strongly separate from the rest; it's not part of the tutorial but sort of "for advanced users". But I think it should be there.
Agree! I just like the fact that you can put a simple .whl in there and be done with it. I was looking for a way to point to "latest" release (but this isn't ideal for containers, either). I will rephrase when moving to the PR.
Yes, agree.
Well, (I think this is the only open point, but should be discussed before I integrate this into the PR; I would then check that all links are updated correctly.)
Both.
For the latter part, it is only about what to consider with respect to BenchExec's container mode when running inside a container. So it is specifically about BenchExec's container mode.
I don't want to rename the existing file, it is fine that
Ah, I see; I think I misunderstood that when skimming through it.
Fine with me; will do. I'll put the rest into
Further discussion in #1044
Here is a (hopefully somewhat reliable) way of getting benchexec inside an interactive container.
The main trick is to fuse the "move init into sub-cgroup" step directly into the entrypoint.
The other two scripts are just there to move benchexec into a fresh cgroup and would be superfluous if benchexec gets a `--cgroup /xy` (or `--create-cgroup`) switch. With the latter, it would look kinda neat I think.
EDIT: Such a behavior has been added.
Setup in Containerized Environments
This guide explains the complications of using Docker with benchexec and shows
how you can create your own interactive Docker image (or adapt existing ones)
to use BenchExec within it (with cgroups v2, the standard nowadays).
Note that the following procedure is not guaranteed to work in every
environment, since there are too many variables, and you may need to slightly
adapt some parts.
Background
Just like BenchExec, tools like Docker and Podman make heavy use of cgroups to
isolate processes. Thus, in order to get them working nicely together, some
additional tricks are required, especially in an interactive container (e.g.
one with a shell, where multiple commands can be executed).
The main complication of using BenchExec with Docker is that benchexec needs a
"clean" cgroup of its own. However, an interactive Docker container has the
shell inside the "root" cgroup, which prevents child cgroups with controllers
from being created. So, we need to move the init process (and thus all
subsequent ones) into a separate cgroup. (See the "no internal processes" rule
in the cgroup documentation for more information.)
Also, note that there is a difference between cgroups version 1 and 2, and
between Docker and Podman. We recommend using Podman (compatible with Docker)
as it provides "rootless" containers, but provide setup instructions for both.
We assume cgroups v2, since it is the standard for recent Linux distributions
(you are running on cgroups v2 if the file `/sys/fs/cgroup/cgroup.controllers`
exists).
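The check mentioned above can be scripted. The following is only a minimal sketch and not part of BenchExec; the function name `cgroup_version` and its optional mount-point argument are made up for illustration.

```shell
#!/bin/sh
# cgroup_version [MOUNTPOINT]
# Prints "v2" if MOUNTPOINT (default: /sys/fs/cgroup) looks like a cgroups v2
# hierarchy, i.e. it contains the file cgroup.controllers; "not-v2" otherwise.
# (Hypothetical helper; "not-v2" may mean cgroups v1 or no cgroups at all.)
cgroup_version() {
    mountpoint=${1:-/sys/fs/cgroup}
    if [ -f "$mountpoint/cgroup.controllers" ]; then
        echo "v2"
    else
        echo "not-v2"
    fi
}
```

Calling `cgroup_version` without arguments inspects the default mount point of the current system or container.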
Creating the Image
First, create the script `init.sh` with the following content:
and set it executable (`chmod +x init.sh`).
Now, pack this script into your Docker image and set it as entry point. If you
are working with a standard Docker image, create a file `Dockerfile`
If you already have a Dockerfile, you only need to get the `.whl` into it and
add the last two commands (i.e. copying the script and setting the entrypoint).
In case you want to modify this setup, please consider the notes mentioned
below.
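As a rough illustration of the `init.sh` described in this section, the following sketch moves the calling process into a sub-cgroup and prepares a `benchexec` cgroup with all available controllers delegated. It is an assumption-laden reconstruction, not necessarily the script from the original post: the function name `setup_cgroups`, the sub-cgroup name `init`, and the requirement that the image passes a command (for example a shell via `CMD`) are all choices made for this example.

```shell
#!/bin/sh
# Hypothetical sketch of an entrypoint script; only the cgroup name
# "benchexec" is taken from the text, everything else is an assumption.
set -eu

# setup_cgroups ROOT PID
# Moves PID out of the cgroup ROOT and prepares ROOT/benchexec.
setup_cgroups() {
    root=$1
    pid=$2
    # "init" is an arbitrary name; "benchexec" must have exactly this name.
    mkdir -p "$root/init" "$root/benchexec"
    # Empty the root cgroup by moving the given process into "init".
    echo "$pid" > "$root/init/cgroup.procs"
    # Read the available controllers with the shell builtin "read".
    controllers=""
    read -r controllers < "$root/cgroup.controllers" || true
    # Delegate each controller to the children of ROOT and of ROOT/benchexec.
    for controller in $controllers; do
        echo "+$controller" > "$root/cgroup.subtree_control"
        echo "+$controller" > "$root/benchexec/cgroup.subtree_control"
    done
}

# When used as entrypoint with a command, set up the cgroups and then replace
# this process with the command, so that it keeps the same PID.
if [ "$#" -gt 0 ]; then
    setup_cgroups /sys/fs/cgroup "$$"
    exec "$@"
fi
```

The final `exec` matters: the command inherits the PID that was just moved into the `init` cgroup, so no process is left behind in the container's root cgroup.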
Running BenchExec in the Container
With this setup finished, run `docker build -t <tag> .` or
`podman build -t <tag> .` in the directory where the Dockerfile is located.
Then, start the image with
docker run --privileged --cap-drop=all -it <tag>
or
podman run --security-opt unmask=/sys/fs/cgroup --cgroups=split --security-opt unmask=/proc/* --security-opt seccomp=unconfined -it <tag>
With this, you should be able to run, for example,
PYTHONPATH=/opt/benchexec.whl python3 -m benchexec.runexecutor <program>
inside the Docker container (this is `runexec`).
Technical Details for Adaptation
There are many ways to achieve the required setup and users familiar with
Docker may choose to adapt the above procedure. There are a few peculiarities
to be aware of.
The goal is to create a cgroup called `/benchexec` with relevant controllers
enabled. Note that it is important that the cgroup has exactly this path, as
benchexec uses this as a "fallback" on non-systemd setups.
Additionally, you should be aware of the implications of the "no internal
processes" rule. Effectively, this means that there cannot be any processes in
the root cgroup of the container (which, notably, is not the root cgroup of
the system) if controllers are to be enabled for its child cgroups. So, for the
provided `init.sh` to work, it needs to be directly executed (with
`ENTRYPOINT [ "/init.sh" ]`). In particular, using, for example,
`ENTRYPOINT "/init.sh"` would not work, since this creates a shell that then
runs `init.sh` as a sub-process, meaning that the root cgroup would not be
empty when we try to enable subtree control.
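To see whether a cgroup is actually empty before enabling subtree control, its `cgroup.procs` and `cgroup.subtree_control` files can be read with shell builtins only, so that the inspection itself does not start an extra process in the cgroup. This is a sketch for illustration; the helper names `cgroup_is_empty` and `cgroup_report` are hypothetical.

```shell
#!/bin/sh
# cgroup_is_empty DIR
# Succeeds if DIR/cgroup.procs lists no process (uses only the builtin "read").
cgroup_is_empty() {
    first=""
    read -r first < "$1/cgroup.procs" || true
    [ -z "$first" ]
}

# cgroup_report DIR
# Prints the PIDs in DIR and the controllers enabled for its children.
cgroup_report() {
    echo "processes in $1:"
    while read -r pid; do
        echo "  $pid"
    done < "$1/cgroup.procs"
    controllers=""
    read -r controllers < "$1/cgroup.subtree_control" || true
    echo "subtree_control: ${controllers:-<none>}"
}
```

For example, running `cgroup_report /sys/fs/cgroup` in the container's shell lists what is still located in the container's root cgroup.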
In case you run into `Device or resource busy` errors, the problem likely is
that you want to enable subtree control in a non-empty cgroup or, vice versa,
move a process to a cgroup with enabled subtree control, both of which are
prohibited for cgroups by design. For debugging, it is useful to inspect the
`cgroup.procs` and `cgroup.subtree_control` of the cgroup in question. Note
that calls like `cat` also create a separate process.
I'm "archiving" the cgexec and cleanup script here
cgexec.sh
benchexec.sh