-
Notifications
You must be signed in to change notification settings - Fork 356
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add test for cgroups_relative_memory #2686
base: main
Are you sure you want to change the base?
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #2686 +/- ##
==========================================
- Coverage 65.50% 65.13% -0.37%
==========================================
Files 133 133
Lines 16916 17012 +96
==========================================
Hits 11081 11081
- Misses 5835 5931 +96 |
Signed-off-by: omprakaash <[email protected]>
LGTM |
Hey, as @omprakaash had mentioned in discord we seem to be skipping a check for memory cgroups validation, as done here in runtime-tools. @tsturzl do you have any idea, why we might be intentionally skipping this, or we might have just missed it originally? If so, @omprakaash may I ask you to add that in both this and the other test? (You can do it in separate PR, we can merge this one before it) |
@YJDoc2 Hmm, if you compare any of the integration tests to runtime-tools it seems like we're missing a similar step a few cgroups integration test I think I might have assumed there was something in A simple solution would be to just read the respective cgroups files and comparing them. It looks like cpu, memory, and network are all missing these checks. There is a lot of opportunity here to create a read/write abstraction for cgroups files. Right now we just have a constant for each file name and use a |
I could just implement the additional checks for this test with current existing methods in this PR. Can the new abstractions be part of a separate PR later on ? |
@omprakaash that's probably a better immediate solution. If we can just implement something similar to what runtime tools is doing. |
8332eb0
to
44e7b59
Compare
I think I have some changes to request in this, I'll try to get to a full review soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, Apologies it took me long for the review. There are some changed needed, so I have added comments, please take a look.
Is this test supposed to be cgroups v1 specific? If not we should not use the v1 functions explicitly, and use a generic version that can work on both : v1 and v2.
Also, for the functions added in libcgroup : if we are using them in only tests, and they might not be as useful otherwise, I'd prefer to move them in the test crate itself rather then libcgroup.
crates/libcgroups/src/v1/util.rs
Outdated
@@ -41,6 +46,71 @@ pub fn list_supported_mount_points() -> Result<HashMap<ControllerType, PathBuf>, | |||
Ok(mount_paths) | |||
} | |||
|
|||
pub fn get_memory_data(pid: i32) -> Result<LinuxMemory, Box<dyn std::error::Error>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not use dyn error, instead create a proper error type similar to how we have made specialized errors in enum for other functions and use that (or alternatively reuse some existing error type if that fits)
Another thing is that , I would like to confirm if we are similarly exposing the oci_spec objects from any other existing functions : as this fn is pub, it will be part of our public api, and if we are using another crate (oci_spec) type here, we need to be careful for this. If any other function already does this, then fine, else need to think on how to properly expose this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not see any other functions that do this. Would it be a better idea to just move this into the testing crate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. While it is not a bad idea to have this sort of function, I'd prefer not to add this to a core crate in a PR related to tests. If you are interested, we can open another issue and discuss the API for this ; but for this case, I'll request you to move the functions into the tests itself.
crates/libcgroups/src/v1/util.rs
Outdated
let cgroup_memory_files = vec![ | ||
"memory.limit_in_bytes", | ||
"memory.soft_limit_in_bytes", | ||
"memory.memsw.limit_in_bytes", | ||
"memory.kmem.limit_in_bytes", | ||
"memory.kmem.tcp.limit_in_bytes", | ||
"memory.swappiness", | ||
"memory.oom_control", | ||
]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to not hardcode these (I think we probably can't, but just confirming )?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I don't think so.
crates/libcgroups/src/v1/util.rs
Outdated
.parse::<u64>()?; | ||
memory_data = memory_data.disable_oom_killer(oom_control == 1); | ||
} | ||
_ => {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as we are controlling the files by hardcoding, we should add unreachable!()
in the remaining case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Will change it
pub fn get_subsystem_path(pid: i32, subsystem: &str) -> Result<PathBuf, io::Error> { | ||
let contents = fs::read_to_string(Path::new(&format!("/proc/{}/cgroup", pid))) | ||
.unwrap_or_else(|_| panic!("failed to read /proc/{}/cgroup", pid)); | ||
for line in contents.lines() { | ||
let parts: Vec<&str> = line.splitn(3, ':').collect(); | ||
if parts.len() < 3 { | ||
continue; | ||
} | ||
let subparts: Vec<&str> = parts[1].split(',').collect(); | ||
for subpart in subparts { | ||
if subpart == subsystem { | ||
return Ok(PathBuf::from(parts[2].to_string())); | ||
} | ||
} | ||
} | ||
Err(io::Error::new( | ||
io::ErrorKind::Other, | ||
format!("subsystem {} not found", subsystem), | ||
)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we already have something similar to this existing somewhere in the libcgroup crate. Can you check and confirm if we really need a new function for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes there is, but it is tied to V1Manager and is for the current process only.
|
||
use crate::utils::{test_outside_container, test_utils::check_container_created}; | ||
use crate::utils::{self, test_outside_container}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we combine this import with the utils import above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep !
let spec = SpecBuilder::default() | ||
.linux( | ||
LinuxBuilder::default() | ||
.cgroups_path(Path::new("/testdir/runtime-test/container").join(cgroup_name)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: extract this path at top as a const
fn can_run() -> bool { | ||
Path::new(CGROUP_MEMORY_LIMIT).exists() && Path::new(CGROUP_MEMORY_SWAPPINESS).exists() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is cgroups v1 specific test, we should also check that cgroups is v1 here
let expected_memory = spec | ||
.linux() | ||
.as_ref() | ||
.unwrap() | ||
.resources() | ||
.as_ref() | ||
.unwrap() | ||
.memory() | ||
.as_ref(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can do without the as_ref
calls here, as spec is already a immutable ref? Please check once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh ,the linux() function call returns a &Option, so as_ref() was needed to make it a Option<&Linux>
if memory.is_err() { | ||
bail!("failed to get memory data: {:?}", memory.err().unwrap()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if memory.is_err() { | |
bail!("failed to get memory data: {:?}", memory.err().unwrap()); | |
} | |
if let Err(e) = memory { | |
bail!("failed to get memory data: {:?}", e); | |
} |
if expected_memory.limit().unwrap() != memory.as_ref().unwrap().limit().unwrap() { | ||
bail!("expected memory {:?}, got {:?}", expected_memory, memory); | ||
} | ||
|
||
if expected_memory.swappiness().unwrap() != memory.as_ref().unwrap().swappiness().unwrap() { | ||
bail!("expected memory {:?}, got {:?}", expected_memory, memory); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For both of these, can we extract the corresponding params in vars, and compare those? Additionally we should also report specifically what param mismatch was there, as potentially (although it shouldn't) some other param in memory data can be diff, making it confusing as to exactly why this failed. So in error message it should be something like expected memory limit {expected_limit} but got {actual_limit} instead
.
Thank you for the review ! Will make the required changes. |
@omprakaash ping! There are still some review changes pending, so pinging you |
Signed-off-by: omprakaash <[email protected]> Signed-off-by: om prakaash <[email protected]>
Sorry for the delay, finally got a chance to work on this again. |
@omprakaash There seems to be a conflict. Please need to rebase. |
@omprakaash Ping! |
Adds an integration test for cgroups_relative_memory. The only difference from the non-relative test seems to be the cgroup_path of the specified spec while testing.