-
Notifications
You must be signed in to change notification settings - Fork 168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
set sampling ratio of dropping for single syscall individually #1309
base: master
Are you sure you want to change the base?
set sampling ratio of dropping for single syscall individually #1309
Conversation
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Welcome @wangyongfeng5! It looks like this is your first PR to falcosecurity/libs 🎉 |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: wangyongfeng5 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Please double check driver/SCHEMA_VERSION file. See versioning. /hold |
Hi! Thanks for your contribution! |
Signed-off-by: Manny Wang <[email protected]>
e790789
to
4235cfd
Compare
Seems like a nice improvement, tagging our sampling ratio expert @gnosek! Moreover, we are in the process of releasing a new libs version, not sure we can do this for the next tag but we will try to do our best! |
Signed-off-by: Manny Wang <[email protected]>
9bee2a4
to
5d1e8de
Compare
Signed-off-by: Manny Wang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to say, I'm not too excited about this PR as it currently stands. The issues I have with it are:
- we really need DROP_[EX] events to contain the sampling ratio (getting rid of the sampling ratio obviously complicates this :))
- replacing "the" sampling ratio with an array feels wrong. With your patch, the sampling ratio now becomes an attribute of the individual syscalls, so maybe it should become an extension of the ppm_sc_of_interest concept (accept/drop becomes accept/accept-1/nth-of-the-time/drop) rather than an extension of the global sampling ratio concept?
My suggestion would be to leave the global sampling ratio alone and add a separate per-syscall sampling step afterwards, so that the higher sampling ratio (global or per-syscall) wins, although you'd realistically only use one or the other.
Also, what's the end use case for this? What would you use the extra flexibility for?
|
||
vpr_info("new sampling ratio: %d\n", new_sampling_ratio); | ||
vpr_info("new default sampling ratio: %d\n", new_sampling_ratio); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't really setting just the default, it's overwriting any previous per-syscall sampling ratios configured
@@ -1186,6 +1199,43 @@ static long ppm_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) | |||
ret = 0; | |||
goto cleanup_ioctl; | |||
} | |||
case PPM_IOCTL_SET_DROPPING_RATIO: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bikeshedding names is always fun, but I'd rather keep the SAMPLING
in the name here (e.g. PPM_IOCTL_SET_SYSCALL_SAMPLING_RATIO
?). Otherwise I'd keep wondering if sampling_ratio and dropping_ratio are the same thing and why do we need two ioctls to manage them.
Also, please consider (not forcing this in any way, just please consider :)) passing a (two-u32) struct by pointer instead of unpacking the arguments from a single (by-value) u64.
@@ -4939,7 +4939,7 @@ FILLER(sched_drop, false) | |||
/* | |||
* ratio | |||
*/ | |||
return bpf_push_u32_to_ring(data, data->settings->sampling_ratio); | |||
return bpf_push_u32_to_ring(data, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm correct, this is going to cause a lot of pain for us :( we actually do rely on getting the sampling ratio in drop events) and would require a major redesign once there's no single sampling ratio.
It feels like a way forward would be to keep the notion of a single sampling ratio and disable it only when a consumer uses the per-syscall ioctl (and we'd never do that then).
something like:
if(settings->sampling_ratio != PER_SYSCALL_SAMPLING_RATIO)
{
sampling_ratio = settings->sampling_ratio;
}
else
{
sampling_ratio = settings->sampling_ratios[syscall_id];
}
and in the ioctl that sets per syscall sampling ratios, just set settings->sampling_ratio = PER_SYSCALL_SAMPLING_RATIO
too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there are both global and per-system call ratios, which one should be reported here? Global? Or the one being sampled? In the komd driver, it seems difficult to obtain the system call being sampled based on its mechanism that may delay the insertion of drop events.
vpr_info("PPM_IOCTL_SET_DROPPING_RATIO, syscall(%u), ratio(%u), consumer %p\n", syscall_to_set, new_sampling_ratio, consumer_id); | ||
|
||
if (syscall_to_set >= SYSCALL_TABLE_SIZE) { | ||
pr_err("invalid syscall %u\n", syscall_to_set); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will appear in the kernel log without any extra context, so maybe add a few words here (like that we're in this particular ioctl for example). I'm sure there's precedent for cryptic log messages but let's try to make things better one line at a time :)
@@ -22,7 +22,7 @@ int BPF_PROG(t1_drop_e) | |||
|
|||
/*=============================== COLLECT PARAMETERS ===========================*/ | |||
|
|||
ringbuf__store_u32(&ringbuf, maps__get_sampling_ratio()); | |||
ringbuf__store_u32(&ringbuf, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as for the other engines, we can't work without a reliable sampling ratio report (as seen by the driver) :<
@@ -4525,7 +4525,7 @@ int f_sched_drop(struct event_filler_arguments *args) | |||
/* | |||
* ratio | |||
*/ | |||
res = val_to_ring(args, args->consumer->sampling_ratio, 0, false, 0); | |||
res = val_to_ring(args, 0, 0, false, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same. We need it :(
if(g_syscall_table[syscall_id].flags & (UF_NEVER_DROP | UF_ALWAYS_DROP) | ||
|| g_syscall_table[syscall_id].flags == UF_NONE | ||
|| !(g_syscall_table[syscall_id].flags & UF_USED)) | ||
{ | ||
return 1; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't check the event flags in pman_set_default_sampling_ratio, and I guess it's not critical here either (setting the sampling ratio for an unused event feels harmless?)
My end use case: The user process makes some unreasonable system calls, such as calling accpet on a non-blocking socket, which generates a large number of useless events, so we have to enable sampling, but at the same time we don't want to lose other useful system calls, such as 'sendto ', as they are important to the upper-level rules, here are two ways:
So, I would like to know: |
That would be very nice, we thought different times at filtering events kernel side using |
I think it's worth it:
|
It feels to me that sampling is the wrong approach here: you do not care about e.g. accept() calls returning -EAGAIN at all, so it would be best to filter them out completely, not (semi) randomly sample them and hope we catch the non-EAGAIN ones. IMHO what you want is UF_NEVER_DROP on the accept* syscalls (it's weird that we don't have this already...) combined with improvements to @Andreagit97 @FedeDP do you think it's realistic to expose UF_ALWAYS_DROP/UF_NEVER_DROP via ppm_sc_of_interest? Again, going from accept/drop to accept/accept-when-not-sampling/drop. There are several, unfortunately conflicting, viewpoints on syscall/event filtering:
... and one 0day later the process connects to somewhere in Lower Elbonia ;)
I'm mildly pessimistic about flexible filtering in the kernel, precisely due to the overhead for non-filtered events. We can benchmark to measure the overhead but it will always depend on the ratio of filtered to non-filtered events (is the non-filtered overhead greater than the overhead of generating events that will get immediately dropped by userspace?) Long term, I'd love to see the probe generated on demand based on whatever filters we have, but I might not live long enough to see it happen ;) |
I created another PR to implement the approach |
thank you we will take a look ASAP right now we are a little bit busy for the release |
Issues go stale after 90d of inactivity. Mark the issue as fresh with Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with Provide feedback via https://github.com/falcosecurity/community. /lifecycle stale |
/remove-lifecycle stale |
yeah, it is not clear to us how to collocate this change between the existing ones, we know this is a valuable feature but we don't want to explode the complexity with yet another logic in the drivers... I am moving ti to /TBD until we have clearer ideas, sorry for that |
Issues go stale after 90d of inactivity. Mark the issue as fresh with Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with Provide feedback via https://github.com/falcosecurity/community. /lifecycle stale |
/remove-lifecycle stale |
Issues go stale after 90d of inactivity. Mark the issue as fresh with Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with Provide feedback via https://github.com/falcosecurity/community. /lifecycle stale |
/remove-lifecycle stale |
Issues go stale after 90d of inactivity. Mark the issue as fresh with Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with Provide feedback via https://github.com/falcosecurity/community. /lifecycle stale |
/remove-lifecycle stale |
Issues go stale after 90d of inactivity. Mark the issue as fresh with Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with Provide feedback via https://github.com/falcosecurity/community. /lifecycle stale |
Stale issues rot after 30d of inactivity. Mark the issue as fresh with Rotten issues close after an additional 30d of inactivity. If this issue is safe to close now please do so with Provide feedback via https://github.com/falcosecurity/community. /lifecycle rotten |
/remove-lifecycle rotten |
What type of PR is this?
/kind feature
Any specific area of the project related to this PR?
/area driver-kmod
/area driver-bpf
/area driver-modern-bpf
/area libscap-engine-bpf
/area libscap-engine-kmod
/area libscap-engine-modern-bpf
/area libscap
/area libpman
Does this PR require a change in the driver versions?
/version driver-API-version-major
/version driver-API-version-minor
What this PR does / why we need it:
Set sampling ratio of dropping for single syscall individually,users can set different discard rates for system calls of different importance.