[Elixir/phoenix] Implementing suggestions from @josevalim #9302
Conversation
This is beautiful, thank you for working on it! ❤️ The only other feedback I have, which we should explore in a separate pull request anyway, is to increase the pool size. For multiple queries, we are not getting more performance from high concurrency numbers, and I assume this is either because our pool size is too small (we use 50, some of the Rust repos use 1024) or our pooling is a bottleneck (it is a single process). I am away from my computer with 10 cores, so when I am back home, I want to run some benchmarks on top of Ecto pooling and try to get some numbers. In particular, I may want to add multiple pools inside Ecto, and then we can try running 8 pools with 32 connections each or similar. But we should probably wait for this PR to be merged and see a trial run, so we can see baseline numbers. :) If you want, I can keep you posted on the Ecto efforts. Have a fantastic week!
@srcrip I just wanted to drop a note that this PR was already in flight to do this.
Correct. transaction is checkout+begin+commit, but if you are only doing reads, you don't need begin+commit. :)
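For illustration, a minimal sketch of that idea, assuming an Ecto repo module named `Repo` and the benchmark's `World` schema (both names are placeholders for whatever the app actually uses):

```elixir
# Sketch: fetch several random rows on one checked-out connection,
# skipping the BEGIN/COMMIT round trips a transaction would add.
ids = Enum.map(1..20, fn _ -> :rand.uniform(10_000) end)

worlds =
  Repo.checkout(fn ->
    Enum.map(ids, fn id -> Repo.get!(World, id) end)
  end)
```

All of the reads reuse the same connection for the duration of the function, without any transaction bookkeeping.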
I have added multi-pool support to Ecto. My suggestion is that, once this PR is merged and we measure its new baseline, we should update the pool configuration.
That is exactly 512 connections. We can also try other combinations.
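For illustration only, one combination that yields 512 total connections, assuming the multi-pool support is exposed as a `:pool_count` repo option (app and repo names below are placeholders):

```elixir
# config/runtime.exs (sketch): 8 pools x 64 connections each = 512 total.
import Config

config :my_app, MyApp.Repo,
  pool_size: 64,
  pool_count: 8
```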
Thank you @atavistock and @josevalim, yes that makes sense. I also agree that we probably need to increase the pool size. When I made some initial improvements, I got way better results on my machine than in the last runs on https://tfb-status.techempower.com/, and I'm not really sure why. If you look at the results there, it seems like it's gotten worse over the past few runs.
@NateBrady23 I think this one should be good to merge, and we'll have a follow-up based on the Ecto changes that @josevalim did.
@srcrip some numbers have come out for using transactions and it already does better, so I am sure we are on the right track: https://www.techempower.com/benchmarks/#section=test&runid=176ba510-3607-4faa-996e-74f0778b88d4&hw=ph&test=query Btw, it seems the plaintext benchmark is failing on the continuous benchmarks. Any idea why?
      returning: false
    )

    json(conn, world_updates)
  end

  def plaintext(conn, _params) do
    text(conn, "Hello, World!")
text/2 includes "charset=utf-8" in the content type, and the other benchmarks don't seem to do that (and this will matter for the plaintext numbers). So it probably makes sense for us to set the content-type explicitly instead:

    text(conn, "Hello, World!")

becomes:

    conn
    |> put_resp_header("content-type", "text/plain")
    |> send_resp(200, "Hello, World!")
WDYT?
Makes sense to me. Adding that now.
@josevalim I think there's something up with the compression headers on the plaintext benchmark. I tried to properly disable it in Bandit, as that is what is required, but I'm not sure if I made a mistake in the implementation. Edit: ahhh, I didn't see your comment about the charset thing. That makes a lot more sense.
@srcrip I double-checked your compression configs and they look right to me (the one inside ...).
Okay, I've also pulled the version of Ecto with the multi-pool support. I tested different permutations of count and size, and arrived at 24 pools with 64 connections each, which seems to give the best performance within the connection limit for the Docker container. Having fewer pools started to create contention for connections; having more pools seemed to reduce throughput. I also tested fewer overall connections (16/64 and 32/32), but 24/64 seemed to be the right balance on my MacBook M2.
How many cores do you have on your M2? I am thinking that for TechEmpower, we probably want the pool count to be double the number of cores (in their case, that would be 56), with 15 connections on each?
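A quick sketch of that sizing rule (pool count = double the core count, 15 connections per pool), again assuming a `:pool_count` option and placeholder app/repo names:

```elixir
# config/runtime.exs (sketch): derive pool count from the cores available at boot.
import Config

config :my_app, MyApp.Repo,
  pool_count: System.schedulers_online() * 2,  # e.g. 56 on a 28-core server
  pool_size: 15                                # connections per pool
```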
@NateBrady23 I've seen some links to run ids. Do those include these pull requests, and how can I find the most current one?
@atavistock This will be included in the next full run that starts on https://tfb-status.techempower.com
Awesome job @atavistock and @srcrip, Phoenix is doing much better in the benchmarks now! 🎉 In some of them we are putting up almost 10x what we got before! https://www.techempower.com/benchmarks/#section=test&runid=1300aee6-f9c9-42a2-8d17-252ba597202f&hw=ph&test=db Although it unfortunately still fails at the plaintext one... 🤔
Good progress!

So first the good news: I can reproduce the plaintext error locally. Now the bad news: the problem seems to be within Bandit.

Testing with ApacheBench and 160 concurrent workers, 10,000 connections, and zero throttling:

Cowboy (via Plug.Cowboy) starts to slow down noticeably almost immediately, and after 2,000 requests it is crawling. To its credit, it does eventually return a valid response for every request, with a p95 of 14 seconds.

In contrast, Bandit is clearly faster at lower numbers and just blazes through requests, but around 7,000 requests it completely collapses and stops accepting new connections for 30+ seconds or so.

The largest test here goes up to 256 workers with 10 threads and sends over 50,000 requests, which is much more than my little load test.

I am filing an issue with Bandit on this and am open to any ideas on how to help resolve it.
Great digging! Mat (from Bandit) is super responsive, so please open up an issue and share it with us here (or ping me there, I will be glad to contribute).
@atavistock Btw, do you think it is worth doing a run with ...?
@josevalim Still trying to figure out why the ...
Awesome, thank you for your continuous effort on this!
@josevalim / @srcrip / @mtrudel Just thought it was worth pointing out that the new public benchmarks are out: https://www.techempower.com/benchmarks/#hw=ph&test=composite&section=data-r23
Those are great improvements across the board. Fantastic work @atavistock and everyone involved!
Much better! I do wonder, @josevalim and @atavistock, if there's still something slowing down the plaintext benchmark though. One of the Rails benchmarks is just slightly outperforming the Phoenix+Bandit one, so it seems like maybe there's still something happening there?
I see that. The code feels like a bit of a cheat, as they're not really even using Rails, but routing to a proc that returns a canned Rack response. They're doing the same for the JSON test (see their routes.rb).
I dropped a comment on the other PR asking for some clarification. If it is ok to forgo the Rails request/response objects, then we can likely do something similar and use canned responses from Bandit.
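If that turns out to be acceptable, a hypothetical Elixir equivalent could be a bare plug that short-circuits the plaintext route before it ever reaches the Phoenix router (the module name and path below are made up for illustration, not what this PR implements):

```elixir
defmodule Bench.CannedPlaintext do
  @moduledoc "Sketch: answer /plaintext with a canned response, bypassing the router."
  @behaviour Plug
  import Plug.Conn

  @impl true
  def init(opts), do: opts

  @impl true
  def call(%Plug.Conn{request_path: "/plaintext"} = conn, _opts) do
    conn
    |> put_resp_header("content-type", "text/plain")
    |> send_resp(200, "Hello, World!")
    |> halt()
  end

  def call(conn, _opts), do: conn
end
```

Plugged into the endpoint ahead of the router, this would behave much like the routes.rb trick above.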
Conversation in #9198 led to some meaningful optimizations.
Specifically: ...