-
Notifications
You must be signed in to change notification settings - Fork 183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: pool acquire allowing concurrently make new wire instead of waiting in line #761
Conversation
Hi @liuzhaohui, thank you. The change seems good to me but tests keep failing multiple times. There might be something we missed. |
It seems that the testing machine lacks of enough resources when running Integration tests concurrently. |
Signed-off-by: Rueian <[email protected]>
a55011f
to
d0adc18
Compare
I think that was the largest machine we could have on the CI. I tried to reduce the defulat pool size d0adc18 but tests were still failing. |
Hi @liuzhaohui, the tests have passed, which is cool! But can we keep another test on a relatively smaller scale for multiple 1ms timeouts? The original tests tried to ensure that the client could still work after timeouts. I still want to keep that but it can be done on a smaller scale. |
Sure, for smaller scale is not a big issue. ------- here is some explaination of the 10ms timeout chage -------- On my centos, I constrained the docker server cpu/memory resource to 5c6g and I reproduced the issue. I found that the testcase with para of 'keys*100' stucked, repeatedly creating and releasing connections which took very long time. And the reason is in that testcase the context timeout is set to 1ms and there are lots of occasions the underline processing time surpass that amount of time, causing the connection to be release on context deadline excceeded error. Both main branch and my revision has that issue, but my revision creates connection more aggressively which makes things worse under this extreme timeout setting. When I increase the timeout setting from 1ms to 10ms, the issue disappeared and it seems that the total time of the testcases decreases from 28m of main branch to 20m of my revision. |
Merged. Thank you so much @liuzhaohui! |
If there is no free wire in the pool, as len(p.list) == 0, it blocks the whole pool until make wire returns.
The process of making new wire usually take several millionseconds under my environment. But if serveral tasks acquire new wires in that short period concurrently, some tasks may need to wait tens of millionseconds which brings down performance dramatically.