perf: switch to json decoder and reduce heap usage #47

kruskall · 2023-07-21T21:09:55Z

This PR aims to solve the recent increase in heap usage without sacrificing some of the performance improvements we gained.

In details:

we are flattening the map and storing the bulk response item in a slice since we are not using the map keys.
reduce allocations over the reflection-based approach of the std lib
optimize correct/common case by only adding 1 to the counter of indexed request. We use the slice only for failed requests which should ideally be 0 or really low compared to the total
do not use io.ReadAll and switch back to a decoder to avoid storing temporary the whole response in memory

Benchstat output compared to the current implementation:

                                      │  before.txt  │              after.txt               │
                                      │    sec/op    │    sec/op     vs base                │
Appender/NoCompression-20               479.5n ±  5%   494.4n ±  4%        ~ (p=0.190 n=10)
Appender/NoCompressionScaling-20        868.8n ± 10%   838.3n ± 42%        ~ (p=0.912 n=10)
Appender/BestSpeed-20                   824.3n ±  4%   673.6n ±  2%  -18.29% (p=0.000 n=10)
Appender/BestSpeedScaling-20            692.9n ±  9%   673.0n ±  9%   -2.88% (p=0.011 n=10)
Appender/DefaultCompression-20          1.379µ ±  5%   1.165µ ±  4%  -15.52% (p=0.000 n=10)
Appender/DefaultCompressionScaling-20   1.414µ ±  3%   1.175µ ±  3%  -16.90% (p=0.000 n=10)
Appender/BestCompression-20             1.438µ ±  5%   1.177µ ±  4%  -18.12% (p=0.000 n=10)
Appender/BestCompressionScaling-20      1.395µ ±  3%   1.167µ ±  2%  -16.31% (p=0.000 n=10)
geomean                                 990.9n         878.0n        -11.40%

                                      │  before.txt  │               after.txt                │
                                      │     B/s      │      B/s        vs base                │
Appender/NoCompression-20               266.5Mi ± 5%    258.5Mi ±  4%        ~ (p=0.190 n=10)
Appender/NoCompressionScaling-20        147.1Mi ± 9%    152.4Mi ± 73%        ~ (p=0.912 n=10)
Appender/BestSpeed-20                   155.0Mi ± 4%    189.7Mi ±  2%  +22.39% (p=0.000 n=10)
Appender/BestSpeedScaling-20            184.4Mi ± 8%    189.9Mi ± 10%   +2.97% (p=0.011 n=10)
Appender/DefaultCompression-20          92.70Mi ± 5%   109.77Mi ±  4%  +18.41% (p=0.000 n=10)
Appender/DefaultCompressionScaling-20   90.40Mi ± 4%   108.77Mi ±  3%  +20.32% (p=0.000 n=10)
Appender/BestCompression-20             88.88Mi ± 6%   108.60Mi ±  4%  +22.18% (p=0.000 n=10)
Appender/BestCompressionScaling-20      91.65Mi ± 3%   109.51Mi ±  2%  +19.48% (p=0.000 n=10)
geomean                                 129.0Mi         145.6Mi        +12.87%

                                      │ before.txt │              after.txt              │
                                      │    B/op    │    B/op      vs base                │
Appender/NoCompression-20               333.0 ± 0%   202.0 ±  0%  -39.34% (p=0.000 n=10)
Appender/NoCompressionScaling-20        340.5 ± 2%   210.5 ±  5%  -38.18% (p=0.000 n=10)
Appender/BestSpeed-20                   708.0 ± 2%   214.0 ±  0%  -69.77% (p=0.000 n=10)
Appender/BestSpeedScaling-20            692.5 ± 2%   216.5 ±  2%  -68.74% (p=0.000 n=10)
Appender/DefaultCompression-20          676.0 ± 4%   206.0 ±  9%  -69.53% (p=0.000 n=10)
Appender/DefaultCompressionScaling-20   673.5 ± 5%   207.0 ± 11%  -69.27% (p=0.000 n=10)
Appender/BestCompression-20             676.0 ± 5%   204.5 ±  7%  -69.75% (p=0.000 n=10)
Appender/BestCompressionScaling-20      673.0 ± 5%   208.5 ±  9%  -69.02% (p=0.000 n=10)
geomean                                 572.3        208.6        -63.56%

                                      │ before.txt │              after.txt              │
                                      │ allocs/op  │ allocs/op   vs base                 │
Appender/NoCompression-20               2.000 ± 0%   4.000 ± 0%  +100.00% (p=0.000 n=10)
Appender/NoCompressionScaling-20        2.000 ± 0%   4.000 ± 0%  +100.00% (p=0.000 n=10)
Appender/BestSpeed-20                   2.000 ± 0%   4.000 ± 0%  +100.00% (p=0.000 n=10)
Appender/BestSpeedScaling-20            2.000 ± 0%   4.000 ± 0%  +100.00% (p=0.000 n=10)
Appender/DefaultCompression-20          2.000 ± 0%   4.000 ± 0%  +100.00% (p=0.000 n=10)
Appender/DefaultCompressionScaling-20   2.000 ± 0%   4.000 ± 0%  +100.00% (p=0.000 n=10)
Appender/BestCompression-20             2.000 ± 0%   4.000 ± 0%  +100.00% (p=0.000 n=10)
Appender/BestCompressionScaling-20      2.000 ± 0%   4.000 ± 0%  +100.00% (p=0.000 n=10)
geomean                                 2.000        4.000       +100.00%

we only care about the structs fields if there is an error but since this should not be the norm, we can optimize the most common case by storing a counter, significantly decreasing the slice memory usage.

kruskall added 4 commits July 21, 2023 16:18

perf: switch to a json decoder to reduce heap usage

84a379c

perf: further optimize for common/correct scenario

fb5df5f

we only care about the structs fields if there is an error but since this should not be the norm, we can optimize the most common case by storing a counter, significantly decreasing the slice memory usage.

fix: json parsing error

21f87f3

refactor: wrap decoding error

5d831ba

kruskall requested a review from a team July 21, 2023 21:11

dmathieu approved these changes Jul 24, 2023

View reviewed changes

kruskall merged commit b714d61 into elastic:main Jul 24, 2023
2 checks passed

kruskall deleted the fix/flush-memory-usage branch July 24, 2023 08:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: switch to json decoder and reduce heap usage #47

perf: switch to json decoder and reduce heap usage #47

kruskall commented Jul 21, 2023

perf: switch to json decoder and reduce heap usage #47

perf: switch to json decoder and reduce heap usage #47

Conversation

kruskall commented Jul 21, 2023