perf: split lengthy fn(s) to improve concurrency #187
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
addresses #183.
obsoletes #186.
Summary: This PR gives us a knob we can twist to favor either concurrency or read+write atomicity. The intent is to give us greater flexibility as we move forward with processing "real" data... fuller blocks, longer proof times, etc.
Illustration:
There are several methods with the following general behavior:
Often the reading can be quite slow, such as fetching blocks from disk in a loop in order to examine the inputs and outputs.
This is bad for concurrency as the caller must be holding a write-lock over GlobalState in order to call an &mut self method. So all other tasks that read or write app-state are blocked.
So the above fn can be split as follows:
Now the i/o intensive read ops are performed in an &self method that only requires the caller to hold a read-lock over GlobalState. This allows all other read-only tasks to progress. Only tasks that attempt to modify state will be blocked.
The finalize method performs the state mutation as before but now it can do so much more quickly because all the data gathering and calcs are already done.
This is a win for concurrency but it comes at a cost. If the caller acquires read-lock for read method and then write-lock for the write method, we have lost atomicity over the entire read+write. That may be too high of a price to pay, depending on the operation. Still the caller now has the option to do it either way. The caller can either:
Note that for (2) the read is atomic and the write is atomic, but the read+write is not atomic.
Batching
Some of the read methods are potentially updating too much data to keep it all in RAM for the caller to pass it to the write method. In this case, batching is used and the pattern becomes:
So the caller may then call the read and write methods in a loop. If concurrency is favored then the caller would acquire read-lock for each read and write-lock for each write. Note that in this case, the reads and writes are only atomic per batch.
Backwards compatibility
wrapper methods are provided for the caller that preserve the original &mut self behavior. eg:
Smaller Fns
There is also a small win in that this refactoring breaks some of these fn into smaller units such that the logic/purpose of each becomes clearer in isolation. The best example is update_mutator_set().
Methods changed
of special note:
ScopeDurationLogger
This PR also adds a new type,
ScopeDurationLogger
that makes it a one-liner to enable logging of [slow] methods (or any scope). By default "slow" means > 0.1 seconds. If a given scope takes longer than the "slow" threshold, a warning will be emitted to the log. Otherwise nothing is logged. This makes it convenient to sprinkle the ScopeDurationLogger anywhere we think a fn might be slow eventually but it won't clutter up the log unless it actually is slow.The slow logging threshold defaults to 0.1 seconds and can be changed via:
ScopeDurationLogger::new_with_threshold()
This replaces the older duration_* macros that had to be used by the caller.
ScopeDurationLogger
could actually be made into a fn decoration macro for slightly cleaner usage. But I think that requires use syn to modify the fn and I didn't think it worth the time to learn/impl, especially as it is already a one-liner. so, maybe later.Details: