-
Notifications
You must be signed in to change notification settings - Fork 350
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
eskip: improve lexer performance #2755
Conversation
AlexanderYastrebov
commented
Nov 28, 2023
•
edited
Loading
edited
- do not use fmt.Sprintf in ParseFilters and ParsePredicates
- avoid allocations for fixed tokens
- optimize scanWhile loop
- add scanEscaped fast path
- optimize scanRegexp and scanEscaped using strings.Builder
- optimize scanWhitespace
- optimize selectScanner using switch
how much does this speedup? |
👍 |
86b4418
to
30ff107
Compare
👍 |
30ff107
to
749f6bc
Compare
Reverted to return token (benchmark results did not change). Looks like compiler can figure out that it does not need to allocate the token if its not stored anywhere, previously there was: Line 360 in 0cdb93a
|
👍 |
Thanks for checking this @AlexanderYastrebov ! |
749f6bc
to
75a0867
Compare
@AlexanderYastrebov seems like the fuzzer found an index out of bound in the code, pls check the workflow. |
here is how reproduce it: diff --git a/eskip/eskip_test.go b/eskip/eskip_test.go
index 249dced0..f9a75270 100644
--- a/eskip/eskip_test.go
+++ b/eskip/eskip_test.go
@@ -388,6 +388,9 @@ func TestPredicateParsing(t *testing.T) {
}, {
title: "star notation",
input: `*`,
+ }, {
+ title: "panics",
+ input: `///`,
}} {
t.Run(test.title, func(t *testing.T) {
p, err := ParsePredicates(test.input)
this line is causing the panic: https://github.com/zalando/skipper/pull/2755/files#diff-c7c35e59a50fd3d4f06f22410b38b31f543ff4a988b500468920cf3e07a943f4R698 |
@sepehrdaddev Nice one! Did you get the
? |
yes, I got it from the logs (the same log you posted). |
Test case from #2755 discovered by fuzzing fails. Signed-off-by: Alexander Yastrebov <[email protected]>
master does not fail properly on this payload, I've created #2849 |
for reference, the fuzzer found multiple payloads that crash (multiple iterations of |
Test case from #2755 discovered by fuzzing fails. Signed-off-by: Alexander Yastrebov <[email protected]>
Add test cases from #2755 discovered by fuzzing. Signed-off-by: Alexander Yastrebov <[email protected]>
Add test cases from #2755 discovered by fuzzing. Signed-off-by: Alexander Yastrebov <[email protected]>
Add test cases from #2755 discovered by fuzzing. Signed-off-by: Alexander Yastrebov <[email protected]>
75a0867
to
32540a3
Compare
Signed-off-by: Alexander Yastrebov <[email protected]>
* do not use fmt.Sprintf in ParseFilters and ParsePredicates * avoid allocations for fixed tokens * optimize scanWhile loop * add scanEscaped fast path * optimize scanRegexp and scanEscaped using strings.Builder * optimize scanWhitespace * optimize selectScanner using switch ``` goos: linux goarch: amd64 pkg: github.com/zalando/skipper/eskip cpu: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz │ HEAD~1 │ HEAD │ │ sec/op │ sec/op vs base │ ParsePredicates-8 29.88µ ± 3% 11.33µ ± 2% -62.09% (p=0.000 n=10) │ HEAD~1 │ HEAD │ │ B/op │ B/op vs base │ ParsePredicates-8 4.863Ki ± 0% 2.008Ki ± 0% -58.71% (p=0.000 n=10) │ HEAD~1 │ HEAD │ │ allocs/op │ allocs/op vs base │ ParsePredicates-8 198.00 ± 0% 33.00 ± 0% -83.33% (p=0.000 n=10) ``` Signed-off-by: Alexander Yastrebov <[email protected]>
32540a3
to
ace2d1a
Compare
I've fixed the code and will followup in #2866 |
I ran the fuzzer for 30 mins and it didn't find anything so I think its safe to say that its fixed. |
👍 |
1 similar comment
👍 |
* use plain ascii instead of unicode package * use loop for scanSymbol * call scan functions directly instead of selectScanner to aid inlining ``` goos: linux goarch: amd64 pkg: github.com/zalando/skipper/eskip │ HEAD~1 │ HEAD │ │ sec/op │ sec/op vs base │ ParsePredicates-8 9.637µ ± 11% 8.894µ ± 4% -7.71% (p=0.001 n=10) Parse-8 329.1m ± 4% 272.7m ± 2% -17.15% (p=0.000 n=10) geomean 1.781m 1.557m -12.56% │ HEAD~1 │ HEAD │ │ B/op │ B/op vs base │ ParsePredicates-8 2.008Ki ± 0% 2.008Ki ± 0% ~ (p=1.000 n=10) Parse-8 49.94Mi ± 0% 49.94Mi ± 0% ~ (p=0.926 n=10) geomean 320.4Ki 320.4Ki -0.00% │ HEAD~1 │ HEAD │ │ allocs/op │ allocs/op vs base │ ParsePredicates-8 33.00 ± 0% 33.00 ± 0% ~ (p=1.000 n=10) ¹ Parse-8 1.100M ± 0% 1.100M ± 0% ~ (p=0.367 n=10) geomean 6.025k 6.025k +0.00% ¹ all samples are equal ``` See previous #2755 Signed-off-by: Alexander Yastrebov <[email protected]>
* use plain ascii instead of unicode package * use loop for scanSymbol * call scan functions directly instead of selectScanner to aid inlining ``` goos: linux goarch: amd64 pkg: github.com/zalando/skipper/eskip │ HEAD~1 │ HEAD │ │ sec/op │ sec/op vs base │ ParsePredicates-8 9.637µ ± 11% 8.894µ ± 4% -7.71% (p=0.001 n=10) Parse-8 329.1m ± 4% 272.7m ± 2% -17.15% (p=0.000 n=10) geomean 1.781m 1.557m -12.56% │ HEAD~1 │ HEAD │ │ B/op │ B/op vs base │ ParsePredicates-8 2.008Ki ± 0% 2.008Ki ± 0% ~ (p=1.000 n=10) Parse-8 49.94Mi ± 0% 49.94Mi ± 0% ~ (p=0.926 n=10) geomean 320.4Ki 320.4Ki -0.00% │ HEAD~1 │ HEAD │ │ allocs/op │ allocs/op vs base │ ParsePredicates-8 33.00 ± 0% 33.00 ± 0% ~ (p=1.000 n=10) ¹ Parse-8 1.100M ± 0% 1.100M ± 0% ~ (p=0.367 n=10) geomean 6.025k 6.025k +0.00% ¹ all samples are equal ``` See previous #2755 Signed-off-by: Alexander Yastrebov <[email protected]>
* eskip: add BenchmarkParse Add a benchmark for parsing 10000 routes. Signed-off-by: Alexander Yastrebov <[email protected]> * eskip: improve lexer performance 2 * use plain ascii instead of unicode package * use loop for scanSymbol * call scan functions directly instead of selectScanner to aid inlining ``` goos: linux goarch: amd64 pkg: github.com/zalando/skipper/eskip │ HEAD~1 │ HEAD │ │ sec/op │ sec/op vs base │ ParsePredicates-8 9.637µ ± 11% 8.894µ ± 4% -7.71% (p=0.001 n=10) Parse-8 329.1m ± 4% 272.7m ± 2% -17.15% (p=0.000 n=10) geomean 1.781m 1.557m -12.56% │ HEAD~1 │ HEAD │ │ B/op │ B/op vs base │ ParsePredicates-8 2.008Ki ± 0% 2.008Ki ± 0% ~ (p=1.000 n=10) Parse-8 49.94Mi ± 0% 49.94Mi ± 0% ~ (p=0.926 n=10) geomean 320.4Ki 320.4Ki -0.00% │ HEAD~1 │ HEAD │ │ allocs/op │ allocs/op vs base │ ParsePredicates-8 33.00 ± 0% 33.00 ± 0% ~ (p=1.000 n=10) ¹ Parse-8 1.100M ± 0% 1.100M ± 0% ~ (p=0.367 n=10) geomean 6.025k 6.025k +0.00% ¹ all samples are equal ``` See previous #2755 Signed-off-by: Alexander Yastrebov <[email protected]> --------- Signed-off-by: Alexander Yastrebov <[email protected]>
Create a copy of string token value in lexer to avoid referencing input data array. The string value becomes a filter or predicate argument or an endpoint value. Using such value as a long-living cache key may create a memory leak. ``` goos: linux goarch: amd64 pkg: github.com/zalando/skipper/eskip cpu: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz │ HEAD~1 │ HEAD │ │ sec/op │ sec/op vs base │ ParsePredicates-8 7.656µ ± 6% 8.281µ ± 3% +8.16% (p=0.002 n=10) Parse-8 241.0m ± 1% 252.9m ± 6% +4.96% (p=0.000 n=10) geomean 1.358m 1.447m +6.55% │ HEAD~1 │ HEAD │ │ B/op │ B/op vs base │ ParsePredicates-8 1.961Ki ± 0% 2.352Ki ± 0% +19.92% (p=0.000 n=10) Parse-8 49.02Mi ± 0% 58.86Mi ± 0% +20.07% (p=0.000 n=10) geomean 313.7Ki 376.5Ki +20.00% │ HEAD~1 │ HEAD │ │ allocs/op │ allocs/op vs base │ ParsePredicates-8 32.00 ± 0% 52.00 ± 0% +62.50% (p=0.000 n=10) Parse-8 1.080M ± 0% 1.330M ± 0% +23.15% (p=0.000 n=10) geomean 5.879k 8.316k +41.46% │ HEAD~1 │ HEAD │ │ bytes/op │ bytes/op vs base │ Parse-8 15.99Mi ± 0% 15.99Mi ± 0% ~ (p=1.000 n=10) ¹ ¹ all samples are equal ``` Followup on #2755 Signed-off-by: Alexander Yastrebov <[email protected]>
Create a copy of string token value in lexer to avoid referencing input data array. The string value becomes a filter or predicate argument or an endpoint value. Using such value as a long-living cache key may create a memory leak. ``` goos: linux goarch: amd64 pkg: github.com/zalando/skipper/eskip cpu: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz │ HEAD~1 │ HEAD │ │ sec/op │ sec/op vs base │ ParsePredicates-8 7.656µ ± 6% 8.281µ ± 3% +8.16% (p=0.002 n=10) Parse-8 241.0m ± 1% 252.9m ± 6% +4.96% (p=0.000 n=10) geomean 1.358m 1.447m +6.55% │ HEAD~1 │ HEAD │ │ B/op │ B/op vs base │ ParsePredicates-8 1.961Ki ± 0% 2.352Ki ± 0% +19.92% (p=0.000 n=10) Parse-8 49.02Mi ± 0% 58.86Mi ± 0% +20.07% (p=0.000 n=10) geomean 313.7Ki 376.5Ki +20.00% │ HEAD~1 │ HEAD │ │ allocs/op │ allocs/op vs base │ ParsePredicates-8 32.00 ± 0% 52.00 ± 0% +62.50% (p=0.000 n=10) Parse-8 1.080M ± 0% 1.330M ± 0% +23.15% (p=0.000 n=10) geomean 5.879k 8.316k +41.46% │ HEAD~1 │ HEAD │ │ bytes/op │ bytes/op vs base │ Parse-8 15.99Mi ± 0% 15.99Mi ± 0% ~ (p=1.000 n=10) ¹ ¹ all samples are equal ``` Followup on #2755 Signed-off-by: Alexander Yastrebov <[email protected]>
Create a copy of string token value in lexer to avoid referencing input data array. The string value becomes a filter or predicate argument or an endpoint value. Using such value as a long-living cache key may create a memory leak. ``` goos: linux goarch: amd64 pkg: github.com/zalando/skipper/eskip cpu: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz │ HEAD~1 │ HEAD │ │ sec/op │ sec/op vs base │ ParsePredicates-8 7.656µ ± 6% 8.281µ ± 3% +8.16% (p=0.002 n=10) Parse-8 241.0m ± 1% 252.9m ± 6% +4.96% (p=0.000 n=10) geomean 1.358m 1.447m +6.55% │ HEAD~1 │ HEAD │ │ B/op │ B/op vs base │ ParsePredicates-8 1.961Ki ± 0% 2.352Ki ± 0% +19.92% (p=0.000 n=10) Parse-8 49.02Mi ± 0% 58.86Mi ± 0% +20.07% (p=0.000 n=10) geomean 313.7Ki 376.5Ki +20.00% │ HEAD~1 │ HEAD │ │ allocs/op │ allocs/op vs base │ ParsePredicates-8 32.00 ± 0% 52.00 ± 0% +62.50% (p=0.000 n=10) Parse-8 1.080M ± 0% 1.330M ± 0% +23.15% (p=0.000 n=10) geomean 5.879k 8.316k +41.46% │ HEAD~1 │ HEAD │ │ bytes/op │ bytes/op vs base │ Parse-8 15.99Mi ± 0% 15.99Mi ± 0% ~ (p=1.000 n=10) ¹ ¹ all samples are equal ``` Followup on #2755 Signed-off-by: Alexander Yastrebov <[email protected]>
Create a copy of symbol token value in lexer to avoid referencing input data array. The symbol value becomes a filter or predicate name or a route id value. Using such value as a long-living cache key may create a memory leak. ``` goos: linux goarch: amd64 pkg: github.com/zalando/skipper/eskip cpu: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz │ HEAD~1 │ HEAD │ │ sec/op │ sec/op vs base │ ParsePredicates-8 17.07µ ± 3% 18.80µ ± 18% +10.17% (p=0.000 n=10) Parse-8 538.9m ± 19% 545.5m ± 21% ~ (p=0.631 n=10) geomean 3.033m 3.203m +5.60% │ HEAD~1 │ HEAD │ │ B/op │ B/op vs base │ ParsePredicates-8 2.352Ki ± 0% 2.375Ki ± 0% +1.00% (p=0.000 n=10) Parse-8 58.86Mi ± 0% 61.84Mi ± 0% +5.05% (p=0.000 n=10) geomean 376.5Ki 387.8Ki +3.00% │ HEAD~1 │ HEAD │ │ allocs/op │ allocs/op vs base │ ParsePredicates-8 52.00 ± 0% 53.00 ± 0% +1.92% (p=0.000 n=10) Parse-8 1.330M ± 0% 1.510M ± 0% +13.53% (p=0.000 n=10) geomean 8.316k 8.946k +7.57% │ HEAD~1 │ HEAD │ │ bytes/op │ bytes/op vs base │ Parse-8 15.99Mi ± 0% 15.99Mi ± 0% ~ (p=1.000 n=10) ¹ ¹ all samples are equal ``` See previous #2903 Followup on #2755 Signed-off-by: Alexander Yastrebov <[email protected]>
Create a copy of symbol token value in lexer to avoid referencing input data array. The symbol value becomes a filter or predicate name or a route id value. Using such value as a long-living cache key may create a memory leak. ``` goos: linux goarch: amd64 pkg: github.com/zalando/skipper/eskip cpu: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz │ HEAD~1 │ HEAD │ │ sec/op │ sec/op vs base │ ParsePredicates-8 17.07µ ± 3% 18.80µ ± 18% +10.17% (p=0.000 n=10) Parse-8 538.9m ± 19% 545.5m ± 21% ~ (p=0.631 n=10) geomean 3.033m 3.203m +5.60% │ HEAD~1 │ HEAD │ │ B/op │ B/op vs base │ ParsePredicates-8 2.352Ki ± 0% 2.375Ki ± 0% +1.00% (p=0.000 n=10) Parse-8 58.86Mi ± 0% 61.84Mi ± 0% +5.05% (p=0.000 n=10) geomean 376.5Ki 387.8Ki +3.00% │ HEAD~1 │ HEAD │ │ allocs/op │ allocs/op vs base │ ParsePredicates-8 52.00 ± 0% 53.00 ± 0% +1.92% (p=0.000 n=10) Parse-8 1.330M ± 0% 1.510M ± 0% +13.53% (p=0.000 n=10) geomean 8.316k 8.946k +7.57% │ HEAD~1 │ HEAD │ │ bytes/op │ bytes/op vs base │ Parse-8 15.99Mi ± 0% 15.99Mi ± 0% ~ (p=1.000 n=10) ¹ ¹ all samples are equal ``` See previous #2903 Followup on #2755 Signed-off-by: Alexander Yastrebov <[email protected]>