Make critical section, while working with task_struct, safe again. #340
Thank you very much for trying to help figure this out and fix it! At least the reproduction steps should be useful to @Adam-pi3, I guess.
As to the patch in this PR, it looks wrong to me: you're removing required locking from
Of course, since we do have this task killing logic on corruption, we should also make this logic reliable (even if it's supposed to be unreachable in normal circumstances). So maybe this part of your patch has merit, but then perhaps it should be different.
We don't currently know whether the spurious off flag corruption and the killing of wrong task via dangling pointer are parts of the same root cause or different - ideally, that's the first question we need answered to fix this issue. My guess is that when you're running with these changes applied, you probably still do get the off flag corruption message occasionally, right?
Because p_tasks_write_lock() will touch IRQs a second time; the first time it grabs them in
May I ask: is there any logic that unlinks the task_struct from the candidates list in I mean:
So if the task somehow enters the destruction procedure, it should be hooked and jump out to
Indeed, after some time
Hm... I tried to follow the steps you described on Ubuntu 24.04 LTS, 6.8.0-36-generic, in a VM (under VMware), but they do not trigger any bug or problem and LKRG works as intended. I can see in the dmesg logs that modules are blocked:
However, there is no sign of any false positives or instability. What environment are you using when you see the problem?
We definitely need to figure out and fix the bug (or these bugs), but this PR isn't it. Closing.
Description
This is a second attempt to fix a bug in the exploit detection engine of LKRG.
First attempt: #339
Similar symptoms: #329
How This Could Be Reproduced?
To trigger the bug, a new process has to be spawned from kernel context.
For example:
Set
lkrg.block_modules=1
and ensure that modules such as netlink_diag af_packet_diag mptcp_diag unix_diag tcp_diag udp_diag raw_diag
aren't already loaded in the kernel.
1st terminal:
while true; do exec sh -c exit & done
2nd terminal:
while true; do sudo ss -apn; done
The second command causes the kernel to spawn modprobe, but since LKRG is blocking modules, the load is rejected.
This increases the chance of winning the race in which:
if (unlikely(p_val != p_global_cnt_cookie))
    p_ed_kill_task_by_task(p_source->p_ed_task.p_task);
is called for uninitialized memory instead of a live task.
Then in the dmesg output you would see something like:
and, possibly, an Oops.
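The reproduction steps above can be wrapped in a small helper script. This is only a sketch: the function names, the DIAG_MODULES variable and the 30-second default are made up for illustration and are not part of LKRG. It assumes that the module list in the steps means "not currently loaded", that /proc/modules is readable, that coreutils timeout is available, and that sudo does not prompt for a password.

```sh
#!/bin/sh
# Hypothetical helper around the reproduction steps; nothing here is LKRG code.

# Modules from the steps above that `ss` may make the kernel request.
DIAG_MODULES="netlink_diag af_packet_diag mptcp_diag unix_diag tcp_diag udp_diag raw_diag"

# Succeeds if module $1 is listed in the modules file $2 (default: /proc/modules).
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}"
}

# Prints every module from DIAG_MODULES that is already loaded, one per line.
# An optional $1 overrides the modules file (useful for a dry run).
loaded_diag_modules() {
    for m in $DIAG_MODULES; do
        if module_loaded "$m" "${1:-/proc/modules}"; then
            echo "$m"
        fi
    done
}

# Runs both loops from the steps in parallel for $1 seconds (default: 30).
run_repro() {
    secs="${1:-30}"
    # 1st terminal: spawn short-lived processes as fast as possible.
    timeout "$secs" sh -c 'while true; do exec sh -c exit & done' &
    # 2nd terminal: make the kernel try to modprobe the *_diag modules.
    timeout "$secs" sh -c 'while true; do sudo ss -apn >/dev/null; done' &
    wait
}
```

A possible session: confirm that `sysctl -n lkrg.block_modules` prints 1, check that `loaded_diag_modules` prints nothing, call `run_repro 60`, and then inspect dmesg. The time limit only bounds a run; the race may need several runs to hit.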
How Has This Been Tested?
After applying this patch, the steps above no longer trigger false positives and no longer crash the kernel.
At least on my setup.