Performance Optimisation for String Literal Matching #32

AntonLydike · 2024-07-30T16:30:40Z

This PR adds a performance optimisation that skips compiling a regex for string literal matching.

Motivation:

Issue #25 points out that we are about 6x slower than filecheck 0.24 in the worst case, and about 3.3x slower on average.

We are also about 34x slower than LLVM's FileCheck, but we can't bring that down by much, due to Python's limitations. FileCheck is usually done before CPython has finished loading the runtime.

Approach:

After some digging in traces (thanks to viztracer), I found that we spend a lot of time compiling regexes, even when they are just for fancy string literals (most of them are of the form test\s+string\s..., which is regular enough to special-case). This time dominates everything else by a huge margin:

The regex compile takes about 135us of the 156us total time spent, so about 85%. We then spend ~0.8us on average in the actual matching logic. I wondered how a "slow" non-regex implementation would compare.
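For anyone who wants to get a feel for the compile-versus-match gap without viztracer, here is a rough micro-benchmark sketch. It is not the measurement behind the numbers above: the pattern, the input line, and the helper names (`compile_uncached`, `measure`) are made up for illustration, and `re.purge()` is called before each compile so that the `re` module's internal cache does not hide the compile cost.

```python
from __future__ import annotations

import re
import timeit

# Hypothetical pattern and input, chosen to resemble the "fancy string literal" case.
PATTERN_SRC = r"test\s+string\s+literal"
LINE = "// some prefix test   string literal and a suffix"


def compile_uncached() -> re.Pattern[str]:
    # Clear re's internal pattern cache so every call really compiles.
    re.purge()
    return re.compile(PATTERN_SRC)


def measure(number: int = 2000) -> tuple[float, float]:
    """Return (compile_us, search_us): mean microseconds per call."""
    compiled = re.compile(PATTERN_SRC)
    compile_s = timeit.timeit(compile_uncached, number=number)
    search_s = timeit.timeit(lambda: compiled.search(LINE), number=number)
    return compile_s / number * 1e6, search_s / number * 1e6


if __name__ == "__main__":
    compile_us, search_us = measure()
    print(f"compile: {compile_us:.2f}us  search: {search_us:.2f}us")
```

Absolute numbers will differ by machine and Python version, so only the ratio between the two is worth looking at.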

I added logic in the existing check compiler that detects if the check is only made up of string literals, and returns a new LiteralMatcher that duck types re.Pattern for all cases that matter for our implementation. As it turns out, that is just find and match.

Sadly we can't just replace re.search by string.find in all cases, as we need to handle white-space normalisation, which bloats the below code a bit. Otherwise it's quite readable though.

LiteralMatcher returns a special duck-typed version of re.Match called LiteralMatch that only has a single group. This is all that's needed for this little hack, and the other code can be left unmodified, thanks to the power of duck typing (and modifying some type hints).
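To make the idea concrete, here is a minimal sketch of what such a pair of duck-typed classes could look like. It is not the code in this PR. It assumes that white-space normalisation means "any run of white-space in the literal matches one or more white-space characters in the input", it names the scanning method `search` to mirror `re.Pattern.search` (that naming is an assumption), and the helper `_match_at` is hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class LiteralMatch:
    """Stand-in for re.Match that only knows about group 0."""

    string: str
    _start: int
    _end: int

    def group(self, index: int = 0) -> str:
        if index != 0:
            raise IndexError("no such group")
        return self.string[self._start:self._end]

    def start(self, index: int = 0) -> int:
        return self._start

    def end(self, index: int = 0) -> int:
        return self._end


class LiteralMatcher:
    """Stand-in for re.Pattern for checks made only of string literals."""

    def __init__(self, literal: str) -> None:
        # Splitting on white-space gives the tokens that must appear in order.
        self.tokens = literal.split()
        if not self.tokens:
            raise ValueError("literal must contain a non-white-space token")

    def _match_at(self, text: str, pos: int) -> int | None:
        """Return the end offset if all tokens match starting at pos."""
        for i, token in enumerate(self.tokens):
            if i > 0:
                # Between tokens, require at least one white-space character.
                ws_start = pos
                while pos < len(text) and text[pos].isspace():
                    pos += 1
                if pos == ws_start:
                    return None
            if not text.startswith(token, pos):
                return None
            pos += len(token)
        return pos

    def match(self, text: str, pos: int = 0) -> LiteralMatch | None:
        end = self._match_at(text, pos)
        return None if end is None else LiteralMatch(text, pos, end)

    def search(self, text: str, pos: int = 0) -> LiteralMatch | None:
        # Use str.find to jump to candidates for the first token.
        first = self.tokens[0]
        start = text.find(first, pos)
        while start != -1:
            end = self._match_at(text, start)
            if end is not None:
                return LiteralMatch(text, start, end)
            start = text.find(first, start + 1)
        return None
```

Because every token starts with a non-white-space character, skipping the whole white-space run between tokens never rules out a match that a regex like `\s+` would have found, so no backtracking inside a candidate is needed. For example, `LiteralMatcher("test string literal").search("xx test   string literal")` returns a match whose `group(0)` is the span from `test` to `literal`, including the extra spaces.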

Results:

The optimisation gets an average speedup of 1.6x in our benchmarks, making the new implementation only about 2.1x slower on average. This understates the effect though, as the optimisation cuts the runtime of the longest benchmark (4.7k lines of CHECK-NEXT statements) by more than 3x.

See the below chart for overall results:

The new trace shows us that we have indeed removed a bottleneck:

The new timing shows that compilation time is down to 5.5us on average, but the matching has grown to ~14.7us on average. Still, the average CHECK-DAG statement is now down to 21.6us, a reduction of about 7x.

AntonLydike · 2024-07-30T17:12:31Z

I will benchmark this tonight against some real-world workloads and report numbers.

Edit: Don't have time tonight :/

AntonLydike · 2024-08-02T08:36:04Z

It's hard to tell if this actually speeds anything up in practice. MLIR's benchmark suite goes from a geomean of 12.66 over 6 runs to 12.49, which is almost within error. xDSL's tests increase in duration from 8.06 to 8.25, but these runs have even bigger error bars.

Further digging needed here.

implement performance optimisation for string literal matching

a6132bf

AntonLydike added the perf Performance Issues label Jul 30, 2024

AntonLydike self-assigned this Jul 30, 2024

AntonLydike added 2 commits July 30, 2024 17:31

some simplifications

214dcd9

fix pyright

65cf06f

AntonLydike changed the title ~~Implement performance optimisation for string literal matching~~ Performance Optimisation for String Literal Matching Jul 30, 2024

comments

0bee2a0

superlopuh approved these changes Jul 30, 2024
