Add a visualization of thread activity #443
Merged
How might we understand what our threads are up to? It's hard to tell from most profiles. By visualizing it ourselves, of course! First hacky draft:
Above, each row is a thread and time goes left to right. The pink area (color/label strategy TBD) shows very fine-grained FE glyph work being distributed among threads; this is what we want. Next we see a period where nothing can be done until we have the final glyph order, which is understandable but not what we want. Any speedup to glyph order production is a big win, since it is delaying everything. Next we see a lot of per-glyph BE work in light blue, then a period where fea-rs is the long pole as we push toward completion.
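To make the "one row per thread, time left to right" idea concrete, here is a minimal sketch of how such data could be collected. It is not the code in this PR: `Span`, `Timeline`, `record`, and `rows` are hypothetical names, and the thread index is passed in explicitly (with rayon it could plausibly come from `rayon::current_thread_index()`).

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// One unit of work as observed on a thread. Hypothetical type for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct Span {
    pub thread: usize,   // index of the worker that ran the job
    pub label: String,   // e.g. "fe glyph", "kern", "be fea"
    pub start: Duration, // offset from the start of the build
    pub end: Duration,   // offset from the start of the build
}

/// Collects spans from any thread; rendering is left to the caller.
pub struct Timeline {
    origin: Instant,
    spans: Mutex<Vec<Span>>,
}

impl Timeline {
    pub fn new() -> Self {
        Timeline {
            origin: Instant::now(),
            spans: Mutex::new(Vec::new()),
        }
    }

    /// Run `job`, recording when it started and finished on `thread`.
    pub fn record<T>(&self, thread: usize, label: &str, job: impl FnOnce() -> T) -> T {
        let start = self.origin.elapsed();
        let out = job();
        let end = self.origin.elapsed();
        self.spans.lock().unwrap().push(Span {
            thread,
            label: label.to_string(),
            start,
            end,
        });
        out
    }

    /// Group spans into one row per thread, each row sorted by start time.
    pub fn rows(&self) -> BTreeMap<usize, Vec<Span>> {
        let mut rows: BTreeMap<usize, Vec<Span>> = BTreeMap::new();
        for span in self.spans.lock().unwrap().iter() {
            rows.entry(span.thread).or_default().push(span.clone());
        }
        for row in rows.values_mut() {
            row.sort_by_key(|s| s.start);
        }
        rows
    }
}
```

Each row from `rows()` can then be drawn as a horizontal band of rectangles, colored by `label`, with `start` and `end` mapped to x coordinates. Gaps in a row are idle time on that thread.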
We can see that kern, which is executable as soon as glyph order is done and blocks backend fea, doesn't start for quite a while, which can work out poorly:
https://github.com/rayon-rs/rfcs/blob/master/accepted/rfc0001-scope-scheduling.md#current-behavior-per-thread-lifo suggests the order in which you submit work to rayon can matter a lot. When I test with Oswald, it seems kern is submitted with a large batch of backend glyph work and often sits in spot 600+ in the batch. If I hack it to move kern to the front, it executes almost immediately after glyph order:
^ is significantly better; we will want to land some sort of prioritization. However, this works for kern only because it becomes executable at the same time as a large set of things that block on glyph order. It doesn't work for fea, so we still end up with fea not starting until all the BE glyphs are done and then spending time as the only active task. If it had started as soon as kern finished, we'd be in better shape:
Plenty of room for refinement, but this already seems useful, so I will try to prepare a less hacky version to land.
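For reference, the "move kern to the front of the batch" hack discussed above could be generalized roughly as follows. This is only a sketch under assumptions: `Job`, its `priority` field, and `prioritize` are hypothetical and not part of this PR, and whether the front of a batch actually runs first depends on how the jobs are handed to rayon (see the scheduling RFC linked above).

```rust
use std::cmp::Reverse;

/// A job waiting to be submitted. Hypothetical type for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct Job {
    pub name: String,
    pub priority: u8, // higher means "submit earlier"
}

/// Reorder a batch so higher priority jobs come first.
/// The sort is stable, so jobs with equal priority keep their original order.
pub fn prioritize(mut batch: Vec<Job>) -> Vec<Job> {
    batch.sort_by_key(|job| Reverse(job.priority));
    batch
}
```

Reordering like this only helps jobs that become executable at the same time as the rest of the batch, which is the kern case. It does nothing for fea, which only becomes executable once kern finishes; that would need priority to be considered whenever newly unblocked work is submitted.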