-
Notifications
You must be signed in to change notification settings - Fork 477
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Send SIGUSR1 to dnsmasq periodically #644
base: master
Are you sure you want to change the base?
Conversation
Skipping CI for Draft Pull Request. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: DamianSawicki The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/test all |
The failed test fails identically at HEAD (issue #646). |
/cc @aojea |
I think you have several changes in the same commit that makes it hard to review. In addition, what is the motivation to make this configurable, I don't think we should allow people to use this feature if is not for debugging, and if that is the case, you can perfectly do it with a simple bash script |
Thank you very much for a quick reaction, Antonio!
Yeah, I replace
You've mentioned running a bash script. Do I understand correctly that users can also run such a script sending SIGUSR1 to a dnsmasq instance? If so, why don't you want to allow them to use this feature?
We can hardcode some interval, but I thought it would be safer to have things configurable, even just for turning the feature on/off. The general motivation is to improve dnsmasq observability and not to require users to run custom scripts (or reach out to their cloud providers) to gain some insights about their kubedns load. Does it make sense to you? |
I can understand if you expose these metrics with a metrics/prometheus endpoints, that allows you to set alerts and integrate with common monitoring stacks ... but you still needs to evaluate the performance implications on sending a periodic signal to the process |
Does dnsmasq handle syscall interruption gracefully? e.g. on receipt of signal, the syscall is canceled with EINTR. |
Yes, dnsmasq mentions specific actions on
The statistics logged by dnsmasq on
For what you describe, the first step is still the current PR, which makes dnsmasq print statistics to logs, which we can later parse and expose with metrics/prometheus endpoints or set up log-based metrics. Do you think the current PR is only sensible if extended with these follow-up steps? |
I don't know, it really feels awkward running a process in production that you are sigkilling periodically ... but I don't know if this pattern is also used elsewhere |
For anyone reading this without much context, Antonio meant sending a signal periodically. The signal In any case, PRs to this repo are very rare, which means the resources are limited (two PRs with vulnerability fixes have been waiting for attention since July). The present PR (which actually was on my to-do list for months) is a rather low-hanging fruit to improve kubedns observability which is among priorities to help with supportability. Let's not rush and see what others have to say. |
problems on typing fast, thanks for clarifying, I meant what Damian said 😅 |
+1, Good work, Damian! |
If we can do a test and include the results in the commit message to demonstrate that sending the signal is benign:
|
This adds a feature flag --logInterval for periodic triggering dnsmasq to log its statistics by sending SIGUSR1 to it. The new motivation for adding this feature comes from the recently added dnsmasq flag --max-tcp-connections and the related statistics of TCP connection utilisation being added to the log output triggered by SIGUSR1.
38ac3af
to
b0ca20f
Compare
This adds a feature flag
--logInterval
for periodic triggering dnsmasq to log its statistics by sendingSIGUSR1
to it.The new motivation for adding this feature comes from dnsmasq changes in
v2.90
: the new flag--max-tcp-connections
and the related statistics of TCP connection utilisation added to the log output triggered bySIGUSR1
.