
SBCL-specific optimization for plain blob HTTP output. #191

Closed
wants to merge 1 commit

Conversation

@phmarek (Contributor) commented Mar 31, 2021

When an easy-handler returns a string or ub8-vector
for output, the small socket buffer size hurts performance
by forcing many unnecessary context switches, i.e. giving
other threads a chance to be scheduled in between.

By just writing the prepared data out as it is, it can be
streamed as fast as the available bandwidth allows.

(Note: on Linux a reasonable TCP buffer sysctl is recommended,
for example "net.ipv4.tcp_wmem = 131072 131072 4194304").
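As a sketch of that tuning step (using the exact values from the note above; apply as root, and adjust to your hardware):

```shell
# Raise the TCP write buffer (min/default/max, in bytes) so a large
# prepared blob can be queued to the kernel in one go.
sudo sysctl -w net.ipv4.tcp_wmem="131072 131072 4194304"

# To persist across reboots, put the same setting into a sysctl.d file.
echo 'net.ipv4.tcp_wmem = 131072 131072 4194304' | \
  sudo tee /etc/sysctl.d/90-tcp-wmem.conf
```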

For small request sizes, the difference is within the noise floor:

  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    89.50us  287.62us   8.40ms   99.62%
    Req/Sec    13.24k     1.02k   15.86k    75.91%

  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    91.50us  312.95us   8.90ms   99.60%
    Req/Sec    13.21k     0.91k   15.69k    68.32%

But for larger outputs (here a 115 kB PDF) this patch decreases
latency by quite a large margin. From

  3 threads and 3 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    38.41ms   20.96ms 111.80ms   67.98%
    Req/Sec    26.16      9.20    50.00     76.33%
  Latency Distribution
     50%   22.59ms
     75%   63.52ms
     90%   65.59ms
     99%   84.39ms
  785 requests in 10.01s, 87.90MB read
  Requests/sec:     78.40

the 99% latency is nearly halved:

  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    24.86ms    5.70ms  64.90ms   85.10%
    Req/Sec    40.29      6.88    50.00     60.00%
  Latency Distribution
     50%   22.61ms
     75%   23.12ms
     90%   32.48ms
     99%   45.03ms
  1209 requests in 10.01s, 135.24MB read
  Requests/sec:    120.76

For HTTPS, chunked output, and other stream types this keeps
the old behaviour.

@stassats (Member) commented:
If you want a fast web server then hunchentoot is probably the wrong place to find it. And using SBCL internals is a no-go anyway.

@stassats closed this Mar 31, 2021
2 participants