Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

regexp/syntax: character class negation is slow #69303

Open
func25 opened this issue Sep 6, 2024 · 2 comments · May be fixed by #69304
Open

regexp/syntax: character class negation is slow #69303

func25 opened this issue Sep 6, 2024 · 2 comments · May be fixed by #69304
Labels
FixPending Issues that have a fix which has not yet been reviewed or submitted. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@func25
Copy link
Contributor

func25 commented Sep 6, 2024

I've submitted a benchmark and a fix at #69304.

New benchmark:

goos: darwin
goarch: arm64
pkg: regexp/syntax
cpu: Apple M1
BenchmarkString/^(.*);$|^;(.*)-8                 4972814               232.1 ns/op            56 B/op          3 allocs/op
BenchmarkString/(foo|bar$)x*-8                   5602014               212.6 ns/op            56 B/op          3 allocs/op
BenchmarkString/[^=,]-8                         21872083                53.84 ns/op            8 B/op          1 allocs/op
BenchmarkString/([^=,]+)=([^=,]+)-8              5726475               207.3 ns/op            56 B/op          3 allocs/op
BenchmarkString/([^=,]+)=([^=,]+),.*-8           4588252               259.2 ns/op            56 B/op          3 allocs/op
PASS
ok      regexp/syntax   7.603s

Go version

1.23 darwin/arm64

What did you do?

I received a report at VictoriaMetrics/VictoriaMetrics#6911 and ran a benchmark for Regexp.String().

What did you see happen?

Some regexes are extremely slow, even when they're simple. The negate [^] causes calcFlags to run over a large character space to find a fold case.

goos: darwin
goarch: arm64
pkg: regexp/syntax
cpu: Apple M1
BenchmarkString/^(.*);$|^;(.*)-8                 4594401               253.2 ns/op            56 B/op          3 allocs/op
BenchmarkString/(foo|bar$)x*-8                   5006730               236.1 ns/op            56 B/op          3 allocs/op
BenchmarkString/[^=,]-8                              256           4227434 ns/op               8 B/op          1 allocs/op
BenchmarkString/([^=,]+)=([^=,]+)-8                  151           8032660 ns/op              56 B/op          3 allocs/op
BenchmarkString/([^=,]+)=([^=,]+),.*-8               146           8095255 ns/op              56 B/op          3 allocs/op
PASS
ok      regexp/syntax   9.011s
@gabyhelp
Copy link

gabyhelp commented Sep 6, 2024

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/611300 mentions this issue: regexp/syntax: Optimize calcFlags() for OpCharClass fold case

@seankhliao seankhliao changed the title regexp/syntax: 10000x slower in Regexp.String() regexp/syntax: character class negation is slow Sep 6, 2024
@dmitshur dmitshur added Performance NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. FixPending Issues that have a fix which has not yet been reviewed or submitted. labels Sep 6, 2024
@dmitshur dmitshur added this to the Backlog milestone Sep 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FixPending Issues that have a fix which has not yet been reviewed or submitted. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants