
Scorecard: Use Sanic, cache fully-shaped GitHub response #5242

Closed

Conversation

akotlar
Contributor

@akotlar akotlar commented Feb 3, 2019

Use Sanic + ujson instead of Flask + flask.json, cache more, and fix the missing favicon in the HTML templates. Sanic should be 3×+ faster than Flask, and ujson ~2-3× faster than the standard json lib.
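
The serialization-speed claim can be sanity-checked with a small timing sketch. ujson is treated as optional here: if it isn't installed, the snippet falls back to the stdlib json module so it still runs (the speedup obviously won't show in that case). The payload shape is made up for illustration.

```python
import json
import timeit

# ujson is optional; fall back to stdlib json if it isn't installed
try:
    import ujson as fast_json
except ImportError:
    fast_json = json

# hypothetical payload, loosely shaped like a scorecard response
payload = {"users": [{"name": f"user{i}", "open_prs": i} for i in range(100)]}

std_time = timeit.timeit(lambda: json.dumps(payload), number=1000)
fast_time = timeit.timeit(lambda: fast_json.dumps(payload), number=1000)
print(f"stdlib json: {std_time:.3f}s  alternative: {fast_time:.3f}s")
```

With ujson installed, `fast_time` typically comes out noticeably lower; exact ratios depend on payload shape and Python version.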

I also optimize away the unnecessary re-generation of user_data (via get_users()) and of its JSON equivalent for /json.
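
The caching idea can be sketched as follows: build the user data once, serialize it once, and serve the pre-serialized form on every request. The function and variable names here are hypothetical stand-ins, not the actual scorecard code.

```python
import json

_user_data_cache = None   # fully-shaped dict, built once
_user_json_cache = None   # its serialized form, built once

def build_user_data():
    # stand-in for the real get_users() aggregation over GitHub data
    return {"users": [{"name": "alice", "open_prs": 2}]}

def get_users():
    global _user_data_cache, _user_json_cache
    if _user_data_cache is None:
        _user_data_cache = build_user_data()
        _user_json_cache = json.dumps(_user_data_cache)
    return _user_data_cache

def json_endpoint():
    # the /json handler serves cached bytes; no per-request serialization
    get_users()
    return _user_json_cache
```

Every call to `json_endpoint()` after the first returns the same cached string, so the endpoint does no JSON work per request.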

This is currently deployed on scorecard.hail.is and improves performance by ~20%, even for a single connection and for that simple workload (response time from ~50ms to ~40ms).

Besides performance (https://fgimian.github.io/blog/2018/06/05/python-api-framework-benchmarks/), Sanic also builds in a production-oriented web server, so there is no need for WSGI or ASGI, Gunicorn, etc. It can easily run multiple workers if desired:

app.run(host='0.0.0.0', port=5000, workers=4)

This serves as a demonstration of migration to faster web frameworks, and in particular to uvloop-based asyncio implementations.
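
The uvloop opt-in pattern underlying these frameworks can be sketched framework-free: prefer uvloop's event loop when it is installed, and fall back to the stdlib loop otherwise. The handler name below is hypothetical, not the scorecard code.

```python
import asyncio

# Optional uvloop: replaces the default asyncio event loop policy with
# uvloop's much faster libuv-based loop; stdlib loop still works without it.
try:
    import uvloop
    uvloop.install()
except ImportError:
    pass

async def healthz():
    # trivial async handler standing in for a real route
    return {"status": "ok"}

result = asyncio.run(healthz())
print(result)
```

Sanic, Uvicorn, and Starlette all sit on top of this same uvloop substitution; the frameworks differ mainly in routing and HTTP parsing.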

… bit faster than flask, but this is mostly a dry run testing ease of conversion from flask. templates have been modified to avoid favicon.ico 404, and formatting into 80 line width for user.html
…hey aren't imported. I would rather they make those imports an explicit requirement...
@akotlar akotlar changed the title ujson, removed unused imports Scorecard: ujson, removed unused imports Feb 6, 2019
<title>Scorecard: User {{ user }}</title>
<link rel="shortcut icon" href="#" />
Contributor Author


Again, add the favicon fix. The rest is HTML formatting; not critical if you prefer the style you already had.

@akotlar akotlar changed the title Scorecard: ujson, removed unused imports Scorecard: Use Sanic, cache fully-shaped GitHub response Feb 6, 2019
…ry would run twice. don't cache timestamp diff, just timestamp; remove unnecessary globals use from flask v. add comments clarifying race conditions, shared between flask and current version. improve favicon.ico handling for chrome
@akotlar
Contributor Author

akotlar commented Feb 6, 2019

Fixed a few issues; this should be ready to go. It works great, is very fast, and I think it is evidence that we can start using this approach for our other Python web server efforts. The only small trickiness is with Kubernetes, but there are known solutions (run_thread_pool_executor or a deque).

Collaborator

@cseed cseed left a comment


Two comments, and a meta-comment:

  • I had looked over async http libraries and had preferred aiohttp over sanic because (1) aiohttp is a blessed aio library, (2) performance seemed comparable, and (3) aiohttp seemed like a simpler solution, which was attractive because the microservices are looking more and more like services and less like web servers (even more so once all the rendering moves to the front end with the web app; the legacy version of scorecard using jinja is not the representative case). Did you look at aiohttp?

  • From the code:

    Global variables that are modified ...

    I don't want to have to think about shared state and locking. I want a shared-nothing architecture in the microservices, where the only globals are true constants and threads communicate by sending immutable data through queues.

  • Finally, a meta-comment. I started reviewing this when it was just ujson; I did a bit of research about json packages to understand your choices, and when I came back the PR had expanded with all the async stuff. I would have approved the ujson stuff. The async stuff could have been a separate PR. Nobody wants to review a moving target, so the scope of a change should be roughly frozen when you assign a PR, and additional changes should be minimized and restricted to that scope. You're welcome to have an open PR with no reviewer if you're still fleshing out the scope, of course. Thanks!
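
The shared-nothing style described above can be sketched with stdlib queues: workers only see immutable tuples arriving on a queue and never touch shared mutable globals. Names below are hypothetical.

```python
import queue
import threading

def worker(inbox, outbox):
    # Worker owns no shared state; it consumes immutable tuples and
    # emits new immutable tuples, communicating only through queues.
    while True:
        item = inbox.get()
        if item is None:               # sentinel: shut down
            break
        user, count = item
        outbox.put((user, count + 1))  # new tuple out, nothing mutated

inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()
inbox.put(("alice", 1))
inbox.put(None)
t.join()
result = outbox.get()
print(result)
```

Because nothing is shared, no locks are needed beyond those already inside `queue.Queue`.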

@akotlar
Contributor Author

akotlar commented Feb 6, 2019

I also created a Starlette branch, which may be preferable, as Sanic brings with it a bit of controversy, and a bunch of errors were generated on the TechEmpower benchmarks. I took a brief look at the bench source and didn't see an immediate issue, so I worry a bit about Sanic. Starlette is a light layer on top of Uvicorn, one of the leading ASGI web servers, with an interface similar to Sanic/Flask:

https://www.techempower.com/benchmarks/#section=data-r17&hw=ph&test=fortune&l=zijzen-1

Branch here, can issue a separate pr and close this one: https://github.com/akotlar/hail/tree/scorecard-starlette

@akotlar
Contributor Author

akotlar commented Feb 6, 2019

Two comments, and a meta-comment:

  • I had looked over async http libraries and had preferred aiohttp over sanic because (1) aiohttp is a blessed aio library, (2) performance seemed comparable, and (3) aiohttp seemed like a simpler solution, which was attractive because the microservices are looking more and more like services and less like web servers (even more so once all the rendering moves to the front end with the web app; the legacy version of scorecard using jinja is not the representative case). Did you look at aiohttp?

  • From the code:

    Global variables that are modified ...

    I don't want to have to think about shared state and locking. I want a shared-nothing architecture in the microservices, where the only globals are true constants and threads communicate by sending immutable data through queues.

  • Finally, a meta-comment. I started reviewing this when it was just ujson, I did a bit of research about json packages to understand your choices and when I came back, the PR had expanded with all the async stuff. I would have approved the ujson stuff. The async stuff could have been a separate PR. Nobody wants to review a moving target, so the scope of a change should be roughly frozen when you assign a PR and additional changes should be minimized and restricted to that scope. You're welcome to have an open PR with no reviewer if you're still fleshing out the scope, of course. Thanks!

In response:

  1. aiohttp is an option, but appears to be generally considered slow on a per-response basis (published benchmarks; I haven't had a chance to try it), even potentially slower than Flask. It seems wrong to choose something slower if there are reasonable alternatives.
  2. The globals were a feature of the initial implementation (the GitHub cache). It felt outside of the scope of my PR to change that to some queue solution.

Meta-comment: OK. I didn't think it had been looked at, and I expanded what it did pretty quickly once I realized that ujson wasn't helping much. I can make a ujson-specific PR, but my goal was to test async library implementations in a simple application, since we need a long-term strategy for Python web stuff that isn't Flask (or not just Flask).

@cseed
Collaborator

cseed commented Feb 6, 2019

  1. aiohttp is an option, but appears to be generally considered slow on a per-response basis

Can you point me to the benchmarks? The only head-to-head one I found was this:

https://github.com/samuelcolvin/aiohttp-vs-sanic-vs-japronto

where sanic is faster for smaller examples but aiohttp performs MUCH better (3x latency, 10x throughput) when interacting with a db.

@cseed
Collaborator

cseed commented Feb 6, 2019

  1. It felt outside of the scope of my PR to change that to some queue solution.

OK, great, thanks for clarifying!

@akotlar
Contributor Author

akotlar commented Feb 6, 2019

  1. It felt outside of the scope of my PR to change that to some queue solution.

OK, great, thanks for clarifying!

Sure, this is something I am focusing on improving moving forward. Thanks for your feedback.

@akotlar
Contributor Author

akotlar commented Feb 7, 2019

  1. aiohttp is an option, but appears to be generally considered slow on a per-response basis

Can you point me to the benchmarks? The only head-to-head one I found was this:

https://github.com/samuelcolvin/aiohttp-vs-sanic-vs-japronto

where sanic is faster for smaller examples but aiohttp performs MUCH better (3x latency, 10x throughput) when interacting with a db.

Sanic was chosen because it's a near drop-in replacement for Flask, and is, afaik, the most popular library built around asyncio. It has 2× as many stars as aiohttp and slightly more forks.

The original motivation for considering aiohttp:
https://magic.io/blog/uvloop-blazing-fast-python-networking

In short, one of the creators of asyncio discusses uvloop performance relative to other libraries. The key is:

"However, the performance bottleneck in aiohttp turned out to be its HTTP parser, which is so slow, that it matters very little how fast the underlying I/O library is."

[screenshots: benchmark charts from the uvloop blog post]

As an aside, I've spent some time reading about this over the last ~month. Besides relatively consistent messaging about the messiness of Python's ecosystem, performance and user experience are deeply important to me, so when I read things like:

"I don’t think performance matter. I think asgi does not matter in 2018 in general. Usability and complexity matters. Python is not very good choice for high performance system in any case...For me high performance python is a fantasy, but i don’t do aiohttp/python anymore. In the end it is up to @asvetlov"

from one of the creators of aiohttp, I'm not encouraged about the long-term health of the project. aio-libs/aiohttp#2902

In the second branch related to this pull request, linked above, I chose Starlette: a thin abstraction, with nearly identical performance, over Uvicorn + httptools (httptools, like uvloop, was written by Yury Selivanov, the asyncio author I mention above).

Starlette and Uvicorn are currently the fastest options (Sanic isn't tested), by a relatively large margin, on TechEmpower's benchmarks. If there is a reference-standard benchmark of HTTP library performance, TechEmpower is it: https://www.techempower.com/benchmarks/#section=data-r17 . Starlette is at something like base Go performance (though 1/5-1/10th the performance of Go's fasthttp library for simple responses, and much closer for anything involving database calls).

Sanic also uses httptools and uvloop, but carries more machinery, so it may be a bit slower than Starlette, or not; the difference will probably be small.

Regarding the benchmark you linked: it is benchmarking the power of sleep. There is something deeply wrong with their results. Sanic has 1800 timeouts, vs. 200 for aiohttp, and 3× the connection errors. Fine, so Sanic is super slow. But look at their non-db tests: Sanic is >2× as fast, with 0 timeouts. They aren't using anything Sanic-specific to query the database, and both use the same event loop. Adding asyncio Postgres to two programs that fundamentally differ mainly in how they handle HTTP requests and responses shows the one that is faster at HTTP requests/responses (Sanic) becoming much slower, in fact reversing its relationship to aiohttp. This is strange, to say the least.

I was really curious about this, so I ran the bench. First, I upgraded Sanic to a recent version. Then I ran their test. In short, their results were not what I found: Sanic is 50% faster, and the timeouts are what you'd expect, 26 timeouts for Sanic vs. 45 for aiohttp.

Sanic Run 1:
Running 1m test @ http://localhost:8000/db
12 threads and 2000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 640.64ms 947.31ms 7.97s 85.89%
Req/Sec 385.62 137.55 2.32k 77.21%
274143 requests in 1.00m, 41.70MB read
Socket errors: connect 0, read 2072, write 0, timeout 26
Requests/sec: 4563.11
Transfer/sec: 710.67KB

Sanic Run 2:
alexkotlar:~/projects/aiohttp-vs-sanic-vs-japronto:$ wrk -d 60 -c 2000 -t 12 --timeout 8 http://localhost:8000/db
Running 1m test @ http://localhost:8000/db
12 threads and 2000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 615.91ms 878.25ms 7.86s 85.85%
Req/Sec 391.30 118.76 1.61k 72.83%
278943 requests in 1.00m, 42.46MB read
Socket errors: connect 0, read 2079, write 0, timeout 12
Requests/sec: 4642.59
Transfer/sec: 723.58KB

Sanic Run 3 (very large background task spike in last 1-2s of run):
alexkotlar:~/projects/aiohttp-vs-sanic-vs-japronto:$ wrk -d 60 -c 2000 -t 12 --timeout 8 http://localhost:8000/db
Running 1m test @ http://localhost:8000/db
12 threads and 2000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 543.65ms 839.00ms 7.93s 87.81%
Req/Sec 392.47 118.69 1.42k 73.81%
279206 requests in 1.00m, 42.54MB read
Socket errors: connect 0, read 2101, write 0, timeout 35
Requests/sec: 4646.20
Transfer/sec: 724.97KB

Aiohttp Run 1:
alexkotlar:~/projects/aiohttp-vs-sanic-vs-japronto:$ wrk -d 60 -c 2000 -t 12 --timeout 8 http://localhost:8000/db
Running 1m test @ http://localhost:8000/db
12 threads and 2000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 747.49ms 1.00s 7.88s 86.77%
Req/Sec 280.95 103.65 1.60k 79.52%
199147 requests in 1.00m, 36.47MB read
Socket errors: connect 0, read 2058, write 1, timeout 45
Requests/sec: 3313.70
Transfer/sec: 621.36KB

Aiohttp Run 2:
Running 1m test @ http://localhost:8000/db
12 threads and 2000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 696.00ms 967.04ms 7.93s 86.48%
Req/Sec 289.87 115.90 1.90k 83.92%
205188 requests in 1.00m, 37.54MB read
Socket errors: connect 0, read 2041, write 0, timeout 38
Requests/sec: 3414.95
Transfer/sec: 639.84KB

Aiohttp Run 3:
Running 1m test @ http://localhost:8000/db
12 threads and 2000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 670.88ms 898.81ms 7.89s 86.58%
Req/Sec 318.17 108.06 1.47k 74.96%
226300 requests in 1.00m, 41.34MB read
Socket errors: connect 0, read 2053, write 0, timeout 19
Requests/sec: 3765.55
Transfer/sec: 704.34KB

Runs were interleaved to reduce the chance that the programs would benefit from caching across runs. The first run for each had somewhat more background tasks open.

Starlette will be run later.

@asvetlov

asvetlov commented Feb 7, 2019

Thanks for mentioning me, very interesting reading.

You know, framework comparison is a very biased matter.
Sanic has 11,300 GitHub stars; aiohttp has only 6,900.

Monthly download counts are different: 4.7M for aiohttp vs. 60K for Sanic.
A precise download count is a very hard thing (it misses PyPI caches, installs from Linux packages and Docker images, etc.), but you see the difference anyway.

Sanic team is a champion in the library promotion, guys do their job perfectly well.

Performance comparison is even harder.
The libraries have different defaults: the Sanic worker installs uvloop by default; aiohttp doesn't, but utilizes uvloop if uvloop.install() was called.
Moreover, aiohttp performance depends on how the library was installed.
It has C optimizations with pure-Python fallbacks: if Cython/GCC was not available on the target machine at installation time, the slow pure-Python code is executed.
In fact, aiohttp in optimized mode runs the same C-written HTTP parser as Sanic.

Sanic used to run multiple workers by default; aiohttp uses only one. On a real server it doesn't matter, because the server is usually explicitly configured to run multiple web workers via gunicorn and/or the nginx config. Sanic has now switched to a single worker by default, IIRC.
Anyway, looking at outdated performance comparisons in various blog posts doesn't give any useful numbers unless you re-run and check them in your environment against the latest versions (or the versions you actually use).

aiohttp uses the standard json module by default; Sanic uses ujson. ujson is faster, but it is not 100% compatible with the standard module and can crash with a core dump, IIRC. You can configure aiohttp to run ujson, orjson, or rapidjson if needed; all of these speedups have their own drawbacks.

The very famous TechEmpower benchmark demonstrates that Sanic is faster than aiohttp on JSON serialization but cannot pass the other tests. Please decide whether that is important or not.

The last cherry: Sanic has a super-fast URL router because it caches matching results. The feature is extremely useful for getting awesome numbers with the wrk tool, but in real life URL paths are usually not constant. URLs like /users/{userid} don't fit in a cache well :)

P.S.
aiohttp has room for optimization, and we are working on it. There is no single bottleneck; the optimization requires careful improvements in many places while keeping our backward-compatibility policy.

I don't know the best criteria for choosing. My advice is: look at the user API and choose what you like.

P.P.S.
Starlette has its own benefits and compromises as well, but this post is pretty huge already.

@akotlar
Contributor Author

akotlar commented Feb 7, 2019

@asvetlov thanks for your reply, and for your work on the Sanic project! I was really curious about the TechEmpower issue. Do you know why Sanic, in past rounds, failed to complete test subsets without error? I haven't had much of a chance to look into it yet, but sanic-org/sanic#53 doesn't divulge much, and my own attempts to give Sanic problems haven't yielded anything worrisome (i.e. asyncpg works great under a 2000-simultaneous-connection load, request standard deviation is about as tight as aiohttp's, and the number of extremes/timeouts is smaller than aiohttp's). The TechEmpower benchmark was on version 0.7, if not earlier (the linked file in the TechEmpower issue is 0.7), and that version may have been affected by the issue described in sanic-org/sanic#1176, which seems to have been largely addressed.

Edit: furthermore, other recent tests showed no significant issues with Sanic: https://fgimian.github.io/blog/2018/06/05/python-api-framework-benchmarks/. Still, addressing the TechEmpower issues may help people feel more confident.

@asvetlov

asvetlov commented Feb 7, 2019

To be clear, I'm working on the aiohttp project, not Sanic.
Regarding TechEmpower -- I did not investigate.
Maybe the problem is trivial; maybe it is fixed on master.
IIRC Sanic has a partial flow-control/HTTP-pipelining implementation now, but I'm not 100% sure.
I have many places to apply my spare time, and Sanic's problems are not in my personal top 10. I hope you understand my position.

@akotlar
Contributor Author

akotlar commented Feb 7, 2019

Gotcha, thanks; that's the trouble with reading your reply on the go. I had assumed from your reply that you contributed to both.

Glad to hear you're taking aiohttp performance seriously.

Regarding flow control: yep, that PR was merged (sanic-org/sanic#1179). I haven't seen any further conversation from you or the Sanic devs on the issue.

  • It's not just on master; it's in 0.8.

Part of my concern over Sanic came from reading your post at https://www.reddit.com/r/Python/comments/876msl/sanic_python_web_server_thats_written_to_die_fast/ ; this now seems outdated, and it would be interesting to hear a Sanic contributor's reply.

Re: "The last cherry: Sanic has super fast URL router because it caches matching results. The feature is extremely useful for getting awesome numbers with wrk tool but in real life URL paths for server usually not constant. URLs like /users/{userid} don't fit in cache well :)"

  • Surely simple (typical-use, i.e. /path or /path/) pattern matching isn't a large performance constraint. I would be surprised if this were a bottleneck in either Sanic or aiohttp.
  • If aiohttp is bottlenecked by this and isn't caching matches, why not? The most common route is, say, /.

Edit: to be clear, the bench mentioned above used 1 worker each for Sanic and aiohttp, both using uvloop. Bench attached.

bench.zip

@akotlar
Contributor Author

akotlar commented Feb 7, 2019

This seems like a direction we don't want to go at the moment, closing.

@akotlar akotlar closed this Feb 7, 2019