
Fix: allow pool acquire to make new wires concurrently instead of waiting in line #761

Merged
merged 4 commits into redis:main
Feb 20, 2025

Conversation

liuzhaohui
Copy link
Contributor

If there is no free wire in the pool, i.e. len(p.list) == 0, acquiring blocks the whole pool until making a new wire returns.
Making a new wire usually takes several milliseconds in my environment. But if several tasks acquire new wires in that short period concurrently, some tasks may need to wait tens of milliseconds, which brings down performance dramatically.
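To make the problem concrete, here is a minimal, hypothetical sketch (not the actual rueidis pool code; the type, field, and function names are invented for illustration) contrasting an acquire that dials while holding the pool lock with one that reserves a slot and dials after releasing it:

```go
// Package poolsketch illustrates the difference between dialing a new
// wire while holding the pool lock and dialing after releasing it.
package poolsketch

import "sync"

type wire interface{ Close() error }

type pool struct {
	mu   sync.Mutex
	cond *sync.Cond
	list []wire
	size int
	cap  int
	dial func() wire // hypothetical: creating a wire takes a few milliseconds
}

func newPool(capacity int, dial func() wire) *pool {
	p := &pool{cap: capacity, dial: dial}
	p.cond = sync.NewCond(&p.mu)
	return p
}

// acquireBlocking mirrors the old behaviour: the dial happens while the
// pool lock is held, so every concurrent acquirer that finds the free
// list empty queues up behind a single multi-millisecond dial.
func (p *pool) acquireBlocking() wire {
	p.mu.Lock()
	defer p.mu.Unlock()
	for len(p.list) == 0 && p.size >= p.cap {
		p.cond.Wait()
	}
	if len(p.list) == 0 {
		p.size++
		return p.dial() // other goroutines wait here for several ms
	}
	w := p.list[len(p.list)-1]
	p.list = p.list[:len(p.list)-1]
	return w
}

// acquireConcurrent mirrors the idea of the fix: reserve a slot under the
// lock, then dial after releasing it, so concurrent acquirers can each
// create their own wire in parallel instead of waiting in line.
func (p *pool) acquireConcurrent() wire {
	p.mu.Lock()
	for len(p.list) == 0 && p.size >= p.cap {
		p.cond.Wait()
	}
	if len(p.list) == 0 {
		p.size++
		p.mu.Unlock()
		return p.dial() // dial without holding the pool lock
	}
	w := p.list[len(p.list)-1]
	p.list = p.list[:len(p.list)-1]
	p.mu.Unlock()
	return w
}
```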

@rueian
Copy link
Collaborator

rueian commented Feb 16, 2025

Hi @liuzhaohui, thank you. The change seems good to me but tests keep failing multiple times. There might be something we missed.

@liuzhaohui
Copy link
Contributor Author

liuzhaohui commented Feb 17, 2025

Hi @liuzhaohui, thank you. The change seems good to me but tests keep failing multiple times. There might be something we missed.

It seems that the testing machine lacks enough resources when running the integration tests concurrently.
On the pull request branch, I ran the failed cases above individually, and they all passed on my MacBook.
I also deployed a CentOS machine with 16C/32G, ran the test command from build.yaml, and all tests passed.

@rueian rueian force-pushed the fix_pool_concurrent_acquire branch from a55011f to d0adc18 on February 17, 2025 16:10
@rueian
Copy link
Collaborator

rueian commented Feb 17, 2025

I think that was the largest machine we could have on the CI. I tried to reduce the default pool size in d0adc18, but tests were still failing.

@rueian
Copy link
Collaborator

rueian commented Feb 20, 2025

Hi @liuzhaohui, the tests have passed, which is cool! But can we keep another test on a relatively smaller scale for multiple 1ms timeouts? The original tests tried to ensure that the client could still work after timeouts. I still want to keep that but it can be done on a smaller scale.

@liuzhaohui
Copy link
Contributor Author

Hi @liuzhaohui, the tests have passed, which is cool! But can we keep another test on a relatively smaller scale for multiple 1ms timeouts? The original tests tried to ensure that the client could still work after timeouts. I still want to keep that but it can be done on a smaller scale.

Sure, a smaller scale is not a big issue.

------- here is some explanation of the 10ms timeout change --------
Setting the BlockingPoolSize option to 10 has no effect on the test cases, as pool.size usually stays at 8, in line with the parallelism of 8.

On my CentOS machine, I constrained the Docker server's CPU/memory resources to 5C/6G and reproduced the issue. I found that the test case with the 'keys*100' parameter got stuck, repeatedly creating and releasing connections, which took a very long time. The reason is that in that test case the context timeout is set to 1ms, and on many occasions the underlying processing time surpasses that amount, causing the connection to be released on a context deadline exceeded error.

Both the main branch and my revision have that issue, but my revision creates connections more aggressively, which makes things worse under this extreme timeout setting. When I increased the timeout from 1ms to 10ms, the issue disappeared, and the total time of the test cases decreased from 28m on the main branch to 20m on my revision.
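As a hypothetical illustration (not the actual integration test code) of why a 1ms deadline churns connections: a command that exceeds the context deadline returns context.DeadlineExceeded, the wire that observed the deadline is discarded, and the next call has to dial a fresh connection, which itself takes several milliseconds.

```go
package main

import (
	"context"
	"errors"
	"time"

	"github.com/redis/rueidis"
)

func main() {
	client, err := rueidis.NewClient(rueidis.ClientOption{
		InitAddress: []string{"127.0.0.1:6379"},
	})
	if err != nil {
		panic(err)
	}
	defer client.Close()

	// With a 1ms deadline, slow server-side processing easily exceeds it;
	// raising it to 10ms (as in the revised test) avoids the churn.
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Millisecond) // was 1ms
	defer cancel()
	if err := client.Do(ctx, client.B().Get().Key("key").Build()).Error(); err != nil &&
		errors.Is(err, context.DeadlineExceeded) {
		// The wire that hit the deadline is released; a later acquire has to
		// create a new connection, so repeated deadline errors force repeated dials.
	}
}
```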

@rueian rueian merged commit 050c4d9 into redis:main Feb 20, 2025
27 checks passed
@rueian
Copy link
Collaborator

rueian commented Feb 20, 2025

Merged. Thank you so much @liuzhaohui!
