Parallelize the binary parsing of function bodies #7134
After a linear scan through the code section and input source map to
find the start locations corresponding to each function body, parse the
locals and instructions for each function in parallel.
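Below is a minimal, generic C++ sketch of that two-phase structure, not Binaryen's actual parser code: a sequential pass over the code section records where each body starts (the source map scan is omitted here), then the bodies are parsed in parallel. Names like `readBodySize`, `parseBody`, and `ParsedFunc` are hypothetical placeholders.

```cpp
#include <algorithm>
#include <cstdint>
#include <thread>
#include <utility>
#include <vector>

struct ParsedFunc {
  // Placeholder for one function's parsed locals and instructions.
  size_t numBytes = 0;
};

// Read the LEB128-encoded body size at `pos`, advancing `pos` past it.
static uint32_t readBodySize(const std::vector<uint8_t>& wasm, size_t& pos) {
  uint32_t result = 0;
  int shift = 0;
  while (true) {
    uint8_t byte = wasm[pos++];
    result |= uint32_t(byte & 0x7f) << shift;
    if (!(byte & 0x80)) {
      return result;
    }
    shift += 7;
  }
}

// Hypothetical per-function parse; in the real change this is where locals
// and instructions are read, with allocations going into thread-local arenas.
static ParsedFunc parseBody(const std::vector<uint8_t>& wasm,
                            size_t start,
                            size_t size) {
  return ParsedFunc{size};
}

std::vector<ParsedFunc> parseCodeSection(const std::vector<uint8_t>& wasm,
                                         size_t codeStart,
                                         size_t numFuncs) {
  // Phase 1: linear scan to find each body's start offset and size.
  std::vector<std::pair<size_t, size_t>> bodies;
  bodies.reserve(numFuncs);
  size_t pos = codeStart;
  for (size_t i = 0; i < numFuncs; ++i) {
    size_t size = readBodySize(wasm, pos);
    bodies.emplace_back(pos, size);
    pos += size;
  }

  // Phase 2: parse the bodies independently on worker threads.
  std::vector<ParsedFunc> results(numFuncs);
  size_t numWorkers = std::max(1u, std::thread::hardware_concurrency());
  std::vector<std::thread> workers;
  for (size_t w = 0; w < numWorkers; ++w) {
    workers.emplace_back([&, w] {
      // Static striping: worker w handles functions w, w + numWorkers, ...
      for (size_t i = w; i < numFuncs; i += numWorkers) {
        results[i] = parseBody(wasm, bodies[i].first, bodies[i].second);
      }
    });
  }
  for (auto& t : workers) {
    t.join();
  }
  return results;
}
```

Since each body's byte range is known after phase 1, the parallel workers never contend on the input; only the per-function results need distinct slots.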
This speeds up binary parsing with a source map by about 20% with 8 cores
on my machine, but only by about 2% with all 128 cores, so this
parallelization has potential but currently suffers from scaling overhead.[^1]
When running a full -O3 optimization pipeline, the parallel parsing
slightly reduces the number of minor page faults, presumably by better
allocating the original instructions in separate thread-local arenas. It
also slightly reduces the max RSS, but these improvements do not
translate into better overall performance. In fact, overall performance
is slightly lower with this change (at least on my machine).
[^1]: FWIW, the full -O3 pipeline also performs significantly better with
8 cores than with 128 cores on my machine.