fix: HAProxy memory runaway on certain RPM-based distros -> Setting maxconn in haproxy config (#15319) #18283
base: master
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@ Coverage Diff @@
##            master   #18283    +/- ##
==========================================
+ Coverage    45.06%   45.07%   +0.01%
==========================================
  Files          354      354
  Lines        47963    47969      +6
==========================================
+ Hits         21614    21624     +10
+ Misses       23541    23538      -3
+ Partials      2808     2807      -1

View full report in Codecov by Sentry.
Hi @timgriffiths
I had to remove the colons from your original, otherwise the pod failed with an error:
Seems like you would need this change in your fix too?
Signed-off-by: Timothy Griffiths <[email protected]>
Thank you @jennerm, I clearly had YAML on the brain when I copied that fix from my environment. It's fixed now.
I can confirm this fixes the issue for us on AL2023 EKS nodes.
We're seeing an issue where argocd-redis-ha-haproxy gets OOMKilled when all of the redis proxies are unreachable. I verified that haproxy consumes all available memory in a matter of seconds, and then the OOM kill happens. Limiting with

This PR is only about Helm. The maxconn setting should be in all of the default manifests, including
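As a stop-gap alongside maxconn, a container memory limit at least turns the runaway into a contained restart. This is an illustrative sketch only; the key names below are hypothetical and are not taken from the actual argo-cd chart values:

```yaml
# Illustrative sketch: a memory limit caps the blast radius of the
# runaway (the container is OOM-killed early instead of exhausting
# node memory), but it does not fix the unbounded connection/fd
# behaviour that the maxconn directive in this PR addresses.
haproxy:
  resources:
    limits:
      memory: 256Mi    # hypothetical value, tune for your environment
    requests:
      memory: 128Mi
```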
@@ -701,6 +701,10 @@ data:
    stats enable
    stats uri /stats
    stats refresh 10s
    # Additional configuration
    global
        maxconn 4096
Is the value already being used by some code?
My understanding is that this value used to be derived from the maximum number of file descriptors the kernel would give you. In RHEL 7 and 8 I believe this was 4096 (hence me using this value). In RHEL 9, and I believe some other distros, that limit is removed, so haproxy now just tries to use all the file descriptors on the box until it runs out of memory, unless you set maxconn in its config.

I don't see any harm in reducing the number, but this value was chosen to cover the absolute worst-case situation while maintaining functionality.
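The derivation described above can be sketched quickly. This is a rough approximation of haproxy's behaviour, not its exact internal formula: when maxconn is unset, haproxy sizes itself from the process file-descriptor limit, and each proxied connection consumes roughly two descriptors (client side plus server side), so the implicit ceiling scales with `ulimit -n`. On a distro with a huge or unlimited fd limit, that ceiling explodes, which matches the memory runaway seen here.

```python
# Sketch: estimate the implicit connection ceiling haproxy would pick
# when no explicit maxconn is configured. The "(fd limit - reserved) / 2"
# heuristic is an approximation (two fds per proxied connection), and
# reserved_fds is an assumed allowance for listeners, checks, and logs.
import resource

def estimated_maxconn(reserved_fds: int = 100) -> int:
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return max((soft - reserved_fds) // 2, 0)

if __name__ == "__main__":
    # On RHEL 7/8-style defaults (soft limit 4096) this lands near the
    # 4096 figure discussed above; on hosts with very large fd limits
    # it grows unbounded, illustrating the failure mode.
    print(estimated_maxconn())
```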
Fixes #15319
I propose the best way to fix this issue is by setting maxconn 4000 in the global section of the haproxy config. You can also fix it by changing the max open file limit in containerd, but as this comment points out (docker-library/haproxy#194 (comment)), that only works because haproxy derives the maximum number of connections from the max open files on a system, which seems like a bit of a bug; at the very least we should set a max as part of the config.

The default is set sufficiently large that this should not be a problem. This could be backported to whichever versions users need, as it's a simple haproxy config tweak.
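Concretely, the fix amounts to a maxconn directive in the global section of the generated haproxy.cfg. A minimal sketch of the relevant fragment (the surrounding directives are illustrative context, not the chart's exact output):

```
global
    # Cap concurrent connections explicitly instead of letting haproxy
    # derive the limit from the fd ulimit (unbounded on some distros).
    maxconn 4096

defaults
    timeout connect 4s
    timeout server  6m
    timeout client  6m
```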
Checklist: