[Bug]: bind() to unix:/var/lib/nginx/nginx-status.sock failed (98: Address already in use) #6752
Comments
Hi @granescb thanks for reporting! Be sure to check out the docs and the Contributing Guidelines while you wait for a human to take a look at this 🙂 Cheers!
Hi @granescb, can you give more details about the node restarts? Are the nodes on a scheduled restart? Can you try turning readOnlyRootFilesystem to false and let us know if this changes the behaviour? In the meantime, we will do our best to reproduce the issue and get back as soon as we can.
Hello @AlexFenlon, yes, I can try readOnlyRootFilesystem, but only in the staging cluster. The main problem is that I don't know how to reproduce the issue, so I can't check whether readOnlyRootFilesystem will solve it. I will try to reproduce the problem with the current settings and then try again with readOnlyRootFilesystem.
I reproduced the same behavior by sending signal 1 (SIGHUP) from the k8s worker node: the pod restarts and goes into CrashLoopBackOff because of the busy-socket error.
I then set readOnlyRootFilesystem=false and repeated the same case with signal 1 - now the pod simply restarts and keeps working fine, so it looks like this solution will work for us.
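For reference, a minimal sketch of how the confirmed workaround could be applied, assuming the controller runs as a Deployment named nginx-ingress in the nginx-ingress namespace (both names and the container index are placeholders, not taken from the attached manifest):

```sh
# Hedged sketch of the workaround confirmed above: flip readOnlyRootFilesystem
# to false on the controller container so the pod restarts cleanly after a
# SIGHUP instead of crash-looping on the busy status socket.
# Deployment name, namespace, and container index are placeholders; the field
# is assumed to already be set (it was true in the failing setup), otherwise
# use the "add" operation instead of "replace".
kubectl -n nginx-ingress patch deployment nginx-ingress --type=json -p='[
  {"op": "replace",
   "path": "/spec/template/spec/containers/0/securityContext/readOnlyRootFilesystem",
   "value": false}
]'
```

Equivalently, the same field can be set directly in the Deployment manifest under spec.template.spec.containers[].securityContext.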
If you are happy, we will close this for now.
@AlexFenlon Also, it looks like the same problem was reported about a year ago: #4604
Hi @granescb, thanks again for bringing this to our attention; we will investigate this again and get back to you.
Version
3.6.2
What Kubernetes platforms are you running on?
Amazon EKS
Steps to reproduce
k8s EKS version: 1.31
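Based on the reproduction described in the comments above, a hedged sketch of the step, assuming the signal is SIGHUP sent to the nginx master process on the worker node that hosts the controller pod (the PID placeholder and namespace are illustrative):

```sh
# Hypothetical reproduction sketch: send signal 1 (SIGHUP) to the nginx master
# process from the worker node running the controller pod, then watch the pod.
pgrep -af "nginx: master process"     # find the nginx master PID on the node
sudo kill -1 <nginx-master-pid>       # signal 1 (SIGHUP) makes nginx reconfigure
kubectl -n nginx-ingress get pods -w  # with readOnlyRootFilesystem=true the pod
                                      # was reported to enter CrashLoopBackOff
                                      # on the busy nginx-status.sock
```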
Describe the bug:
Sometimes the nginx-ingress-controller restarts the process without cleaning up the socket files.
We first hit this problem during a mass node restart in the k8s cluster.
Since then it has happened randomly on weekends.
The problem was noticed in version 3.6.2; before that we were on app version 3.0.2 and never had this problem.
Manual Pod deletion solves the problem, but it can happen again.
Here is the deployment YAML:
Logs with the error:
Here are the logs, showing the signal 1 reconfiguration followed by a crash loop with the busy-socket error:
Explore-logs-2024-11-05 18_40_57.txt
Expected behavior
The nginx-ingress-controller pod keeps working.