eskip: improve lexer performance #2755

AlexanderYastrebov · 2023-11-28T12:06:36Z

do not use fmt.Sprintf in ParseFilters and ParsePredicates
avoid allocations for fixed tokens
optimize scanWhile loop
add scanEscaped fast path
optimize scanRegexp and scanEscaped using strings.Builder
optimize scanWhitespace
optimize selectScanner using switch

goos: linux
goarch: amd64
pkg: github.com/zalando/skipper/eskip
cpu: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz
                  │   HEAD~1    │                HEAD                 │
                  │   sec/op    │   sec/op     vs base                │
ParsePredicates-8   29.88µ ± 3%   11.33µ ± 2%  -62.09% (p=0.000 n=10)

                  │    HEAD~1    │                 HEAD                 │
                  │     B/op     │     B/op      vs base                │
ParsePredicates-8   4.863Ki ± 0%   2.008Ki ± 0%  -58.71% (p=0.000 n=10)

                  │   HEAD~1    │                HEAD                │
                  │  allocs/op  │ allocs/op   vs base                │
ParsePredicates-8   198.00 ± 0%   33.00 ± 0%  -83.33% (p=0.000 n=10)

szuecs · 2023-11-29T12:50:18Z

return scalars instead of token to reduce allocations

how much does this speedup?
I think the code is already ugly if you have more than 2 return values and now we have increased from 3 to 4.

szuecs · 2023-12-01T15:30:21Z

👍

MustafaSaber · 2023-12-01T15:53:48Z

👍

AlexanderYastrebov · 2023-12-01T16:20:28Z

return scalars instead of token to reduce allocations

how much does this speedup?
I think the code is already ugly if you have more than 2 return values and now we have increased from 3 to 4.

Reverted to return token (benchmark results did not change). Looks like compiler can figure out that it does not need to allocate the token if its not stored anywhere, previously there was:

skipper/eskip/lexer.go

Line 360 in 0cdb93a

l.lastToken = &t

)

szuecs · 2023-12-01T16:35:43Z

👍

szuecs · 2023-12-01T16:35:54Z

Thanks for checking this @AlexanderYastrebov !

sepehrdaddev · 2024-01-12T13:28:50Z

@AlexanderYastrebov seems like the fuzzer found an index out of bound in the code, pls check the workflow.
specifically the in the ParsePredicates() function.

sepehrdaddev · 2024-01-12T13:47:15Z

ParsePredicates() panics when given ///

here is how reproduce it:

diff --git a/eskip/eskip_test.go b/eskip/eskip_test.go
index 249dced0..f9a75270 100644
--- a/eskip/eskip_test.go
+++ b/eskip/eskip_test.go
@@ -388,6 +388,9 @@ func TestPredicateParsing(t *testing.T) {
        }, {
                title: "star notation",
                input: `*`,
+       }, {
+               title: "panics",
+               input: `///`,
        }} {
                t.Run(test.title, func(t *testing.T) {
                        p, err := ParsePredicates(test.input)

this line is causing the panic: https://github.com/zalando/skipper/pull/2755/files#diff-c7c35e59a50fd3d4f06f22410b38b31f543ff4a988b500468920cf3e07a943f4R698

AlexanderYastrebov · 2024-01-12T13:50:25Z

@sepehrdaddev Nice one! Did you get the /// payload from the logs

==72==ERROR: AddressSanitizer: ABRT on unknown address 0x000000000048 (pc 0x0000005e4481 bp 0x10c000633bd0 sp 0x10c000633bb8 T0)
SCARINESS: 10 (signal)
    #0 0x5e4481 in runtime.raise.abi0 runtime/sys_linux_amd64.s:154

DEDUP_TOKEN: runtime.raise.abi0
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: ABRT runtime/sys_linux_amd64.s:154 in runtime.raise.abi0
==72==ABORTING
MS: 1 CopyPart-; base unit: 994af86bef6520042698904387522e903ba5149f
0x2f,0x2f,0x2f,
///

?

sepehrdaddev · 2024-01-12T13:50:51Z

yes, I got it from the logs (the same log you posted).

Test case from #2755 discovered by fuzzing fails. Signed-off-by: Alexander Yastrebov <[email protected]>

AlexanderYastrebov · 2024-01-12T14:08:57Z

master does not fail properly on this payload, I've created #2849

sepehrdaddev · 2024-01-12T14:36:31Z

for reference, the fuzzer found multiple payloads that crash (multiple iterations of /), here is another payload that causes it to crash that I found a bit interesting since it was different than others:
"\x2f\x2f\x00\x00\x00\xe6\xfe\x00\x00\x2f\x00\x00\x00\x00\x00\x00\x00\xe6\xfe\x00\x00\x2f\x00\x00\x00\x00"

Test case from #2755 discovered by fuzzing fails. Signed-off-by: Alexander Yastrebov <[email protected]>

Add test cases from #2755 discovered by fuzzing. Signed-off-by: Alexander Yastrebov <[email protected]>

Signed-off-by: Alexander Yastrebov <[email protected]>

* do not use fmt.Sprintf in ParseFilters and ParsePredicates * avoid allocations for fixed tokens * optimize scanWhile loop * add scanEscaped fast path * optimize scanRegexp and scanEscaped using strings.Builder * optimize scanWhitespace * optimize selectScanner using switch ``` goos: linux goarch: amd64 pkg: github.com/zalando/skipper/eskip cpu: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz │ HEAD~1 │ HEAD │ │ sec/op │ sec/op vs base │ ParsePredicates-8 29.88µ ± 3% 11.33µ ± 2% -62.09% (p=0.000 n=10) │ HEAD~1 │ HEAD │ │ B/op │ B/op vs base │ ParsePredicates-8 4.863Ki ± 0% 2.008Ki ± 0% -58.71% (p=0.000 n=10) │ HEAD~1 │ HEAD │ │ allocs/op │ allocs/op vs base │ ParsePredicates-8 198.00 ± 0% 33.00 ± 0% -83.33% (p=0.000 n=10) ``` Signed-off-by: Alexander Yastrebov <[email protected]>

AlexanderYastrebov · 2024-01-15T18:43:47Z

I've fixed the code and will followup in #2866

sepehrdaddev · 2024-01-16T09:20:14Z

I ran the fuzzer for 30 mins and it didn't find anything so I think its safe to say that its fixed.

sepehrdaddev · 2024-01-16T09:20:27Z

👍

szuecs · 2024-01-16T10:23:50Z

👍

* use plain ascii instead of unicode package * use loop for scanSymbol * call scan functions directly instead of selectScanner to aid inlining ``` goos: linux goarch: amd64 pkg: github.com/zalando/skipper/eskip │ HEAD~1 │ HEAD │ │ sec/op │ sec/op vs base │ ParsePredicates-8 9.637µ ± 11% 8.894µ ± 4% -7.71% (p=0.001 n=10) Parse-8 329.1m ± 4% 272.7m ± 2% -17.15% (p=0.000 n=10) geomean 1.781m 1.557m -12.56% │ HEAD~1 │ HEAD │ │ B/op │ B/op vs base │ ParsePredicates-8 2.008Ki ± 0% 2.008Ki ± 0% ~ (p=1.000 n=10) Parse-8 49.94Mi ± 0% 49.94Mi ± 0% ~ (p=0.926 n=10) geomean 320.4Ki 320.4Ki -0.00% │ HEAD~1 │ HEAD │ │ allocs/op │ allocs/op vs base │ ParsePredicates-8 33.00 ± 0% 33.00 ± 0% ~ (p=1.000 n=10) ¹ Parse-8 1.100M ± 0% 1.100M ± 0% ~ (p=0.367 n=10) geomean 6.025k 6.025k +0.00% ¹ all samples are equal ``` See previous #2755 Signed-off-by: Alexander Yastrebov <[email protected]>

* eskip: add BenchmarkParse Add a benchmark for parsing 10000 routes. Signed-off-by: Alexander Yastrebov <[email protected]> * eskip: improve lexer performance 2 * use plain ascii instead of unicode package * use loop for scanSymbol * call scan functions directly instead of selectScanner to aid inlining ``` goos: linux goarch: amd64 pkg: github.com/zalando/skipper/eskip │ HEAD~1 │ HEAD │ │ sec/op │ sec/op vs base │ ParsePredicates-8 9.637µ ± 11% 8.894µ ± 4% -7.71% (p=0.001 n=10) Parse-8 329.1m ± 4% 272.7m ± 2% -17.15% (p=0.000 n=10) geomean 1.781m 1.557m -12.56% │ HEAD~1 │ HEAD │ │ B/op │ B/op vs base │ ParsePredicates-8 2.008Ki ± 0% 2.008Ki ± 0% ~ (p=1.000 n=10) Parse-8 49.94Mi ± 0% 49.94Mi ± 0% ~ (p=0.926 n=10) geomean 320.4Ki 320.4Ki -0.00% │ HEAD~1 │ HEAD │ │ allocs/op │ allocs/op vs base │ ParsePredicates-8 33.00 ± 0% 33.00 ± 0% ~ (p=1.000 n=10) ¹ Parse-8 1.100M ± 0% 1.100M ± 0% ~ (p=0.367 n=10) geomean 6.025k 6.025k +0.00% ¹ all samples are equal ``` See previous #2755 Signed-off-by: Alexander Yastrebov <[email protected]> --------- Signed-off-by: Alexander Yastrebov <[email protected]>

Create a copy of string token value in lexer to avoid referencing input data array. The string value becomes a filter or predicate argument or an endpoint value. Using such value as a long-living cache key may create a memory leak. ``` goos: linux goarch: amd64 pkg: github.com/zalando/skipper/eskip cpu: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz │ HEAD~1 │ HEAD │ │ sec/op │ sec/op vs base │ ParsePredicates-8 7.656µ ± 6% 8.281µ ± 3% +8.16% (p=0.002 n=10) Parse-8 241.0m ± 1% 252.9m ± 6% +4.96% (p=0.000 n=10) geomean 1.358m 1.447m +6.55% │ HEAD~1 │ HEAD │ │ B/op │ B/op vs base │ ParsePredicates-8 1.961Ki ± 0% 2.352Ki ± 0% +19.92% (p=0.000 n=10) Parse-8 49.02Mi ± 0% 58.86Mi ± 0% +20.07% (p=0.000 n=10) geomean 313.7Ki 376.5Ki +20.00% │ HEAD~1 │ HEAD │ │ allocs/op │ allocs/op vs base │ ParsePredicates-8 32.00 ± 0% 52.00 ± 0% +62.50% (p=0.000 n=10) Parse-8 1.080M ± 0% 1.330M ± 0% +23.15% (p=0.000 n=10) geomean 5.879k 8.316k +41.46% │ HEAD~1 │ HEAD │ │ bytes/op │ bytes/op vs base │ Parse-8 15.99Mi ± 0% 15.99Mi ± 0% ~ (p=1.000 n=10) ¹ ¹ all samples are equal ``` Followup on #2755 Signed-off-by: Alexander Yastrebov <[email protected]>

Create a copy of symbol token value in lexer to avoid referencing input data array. The symbol value becomes a filter or predicate name or a route id value. Using such value as a long-living cache key may create a memory leak. ``` goos: linux goarch: amd64 pkg: github.com/zalando/skipper/eskip cpu: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz │ HEAD~1 │ HEAD │ │ sec/op │ sec/op vs base │ ParsePredicates-8 17.07µ ± 3% 18.80µ ± 18% +10.17% (p=0.000 n=10) Parse-8 538.9m ± 19% 545.5m ± 21% ~ (p=0.631 n=10) geomean 3.033m 3.203m +5.60% │ HEAD~1 │ HEAD │ │ B/op │ B/op vs base │ ParsePredicates-8 2.352Ki ± 0% 2.375Ki ± 0% +1.00% (p=0.000 n=10) Parse-8 58.86Mi ± 0% 61.84Mi ± 0% +5.05% (p=0.000 n=10) geomean 376.5Ki 387.8Ki +3.00% │ HEAD~1 │ HEAD │ │ allocs/op │ allocs/op vs base │ ParsePredicates-8 52.00 ± 0% 53.00 ± 0% +1.92% (p=0.000 n=10) Parse-8 1.330M ± 0% 1.510M ± 0% +13.53% (p=0.000 n=10) geomean 8.316k 8.946k +7.57% │ HEAD~1 │ HEAD │ │ bytes/op │ bytes/op vs base │ Parse-8 15.99Mi ± 0% 15.99Mi ± 0% ~ (p=1.000 n=10) ¹ ¹ all samples are equal ``` See previous #2903 Followup on #2755 Signed-off-by: Alexander Yastrebov <[email protected]>

AlexanderYastrebov changed the title ~~eskip: optimize lexer performance~~ eskip: improve lexer performance Nov 28, 2023

AlexanderYastrebov added the do-not-merge label Dec 1, 2023

AlexanderYastrebov force-pushed the eskip/optimize-lexer-2 branch from 86b4418 to 30ff107 Compare December 1, 2023 15:52

AlexanderYastrebov force-pushed the eskip/optimize-lexer-2 branch from 30ff107 to 749f6bc Compare December 1, 2023 16:18

AlexanderYastrebov removed the do-not-merge label Dec 1, 2023

AlexanderYastrebov added the major moderate risk, for example new API, small filter changes that have no risk like refactoring or logs label Dec 27, 2023

AlexanderYastrebov force-pushed the eskip/optimize-lexer-2 branch from 749f6bc to 75a0867 Compare January 8, 2024 16:38

AlexanderYastrebov added a commit that referenced this pull request Jan 12, 2024

eskip: ParsePredicates does not fail on invalid input

1e4a96b

Test case from #2755 discovered by fuzzing fails. Signed-off-by: Alexander Yastrebov <[email protected]>

AlexanderYastrebov mentioned this pull request Jan 12, 2024

eskip: add comment test cases for ParsePredicates #2849

Merged

AlexanderYastrebov marked this pull request as draft January 12, 2024 14:10

AlexanderYastrebov added a commit that referenced this pull request Jan 12, 2024

eskip: ParsePredicates does not fail on invalid input

51e73f2

Test case from #2755 discovered by fuzzing fails. Signed-off-by: Alexander Yastrebov <[email protected]>

AlexanderYastrebov added a commit that referenced this pull request Jan 12, 2024

eskip: add commnet test case for ParsePredicates

fe1ead1

Add test cases from #2755 discovered by fuzzing. Signed-off-by: Alexander Yastrebov <[email protected]>

AlexanderYastrebov added a commit that referenced this pull request Jan 12, 2024

eskip: add comment test cases for ParsePredicates

3382963

Add test cases from #2755 discovered by fuzzing. Signed-off-by: Alexander Yastrebov <[email protected]>

AlexanderYastrebov added a commit that referenced this pull request Jan 12, 2024

eskip: add comment test cases for ParsePredicates (#2849)

9e57d62

Add test cases from #2755 discovered by fuzzing. Signed-off-by: Alexander Yastrebov <[email protected]>

AlexanderYastrebov force-pushed the eskip/optimize-lexer-2 branch from 75a0867 to 32540a3 Compare January 15, 2024 11:57

AlexanderYastrebov added 2 commits January 15, 2024 19:16

eskip: use larger predicate for BenchmarkParsePredicates

d7f1f12

Signed-off-by: Alexander Yastrebov <[email protected]>

AlexanderYastrebov force-pushed the eskip/optimize-lexer-2 branch from 32540a3 to ace2d1a Compare January 15, 2024 18:16

AlexanderYastrebov marked this pull request as ready for review January 15, 2024 18:18

szuecs merged commit d468a59 into master Jan 16, 2024
14 checks passed

szuecs deleted the eskip/optimize-lexer-2 branch January 16, 2024 14:37

AlexanderYastrebov mentioned this pull request Jan 16, 2024

eskip: improve lexer performance 2 #2870

Merged

AlexanderYastrebov mentioned this pull request Feb 5, 2024

eskip: copy string token value #2903

Merged

AlexanderYastrebov mentioned this pull request Feb 6, 2024

eskip: copy symbol token value #2904

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

eskip: improve lexer performance #2755

eskip: improve lexer performance #2755

AlexanderYastrebov commented Nov 28, 2023 •

edited

Loading

szuecs commented Nov 29, 2023

szuecs commented Dec 1, 2023

MustafaSaber commented Dec 1, 2023

AlexanderYastrebov commented Dec 1, 2023 •

edited

Loading

szuecs commented Dec 1, 2023

szuecs commented Dec 1, 2023

sepehrdaddev commented Jan 12, 2024

sepehrdaddev commented Jan 12, 2024 •

edited

Loading

AlexanderYastrebov commented Jan 12, 2024 •

edited

Loading

sepehrdaddev commented Jan 12, 2024 •

edited

Loading

AlexanderYastrebov commented Jan 12, 2024

sepehrdaddev commented Jan 12, 2024 •

edited

Loading

AlexanderYastrebov commented Jan 15, 2024

sepehrdaddev commented Jan 16, 2024

sepehrdaddev commented Jan 16, 2024

szuecs commented Jan 16, 2024

eskip: improve lexer performance #2755

eskip: improve lexer performance #2755

Conversation

AlexanderYastrebov commented Nov 28, 2023 • edited Loading

szuecs commented Nov 29, 2023

szuecs commented Dec 1, 2023

MustafaSaber commented Dec 1, 2023

AlexanderYastrebov commented Dec 1, 2023 • edited Loading

szuecs commented Dec 1, 2023

szuecs commented Dec 1, 2023

sepehrdaddev commented Jan 12, 2024

sepehrdaddev commented Jan 12, 2024 • edited Loading

AlexanderYastrebov commented Jan 12, 2024 • edited Loading

sepehrdaddev commented Jan 12, 2024 • edited Loading

AlexanderYastrebov commented Jan 12, 2024

sepehrdaddev commented Jan 12, 2024 • edited Loading

AlexanderYastrebov commented Jan 15, 2024

sepehrdaddev commented Jan 16, 2024

sepehrdaddev commented Jan 16, 2024

szuecs commented Jan 16, 2024

AlexanderYastrebov commented Nov 28, 2023 •

edited

Loading

AlexanderYastrebov commented Dec 1, 2023 •

edited

Loading

sepehrdaddev commented Jan 12, 2024 •

edited

Loading

AlexanderYastrebov commented Jan 12, 2024 •

edited

Loading

sepehrdaddev commented Jan 12, 2024 •

edited

Loading

sepehrdaddev commented Jan 12, 2024 •

edited

Loading