feat(set-map): introduce MutativeMap to optimize for common scenarios (lots of data, little change) #73
base: main
Hi @thomasjahoda, Thank you for your proposal and PR. I understand your idea and haven’t had a chance to thoroughly review your PR yet. However, I’m leaning towards implementing a library similar to [...] I’ll take a full look at your PR later. Thanks again!
@unadlib Thank you for already taking a look and the quick heads-up 🙏🏻 I agree that intuitively it would be more elegant and make more "sense" to implement such data-structures around and not within the core of the library. And at first I even attempted to implement it via the custom shallow copy interface, but I failed miserably.
How to maybe do it with the custom shallow copy interface
Now that I have more knowledge about the internals of mutative I probably would have more of a chance to implement
Why directly integrate it into
@thomasjahoda I appreciate the effort you have put into this. In the early stages of Mutative, I considered how to comprehensively support custom immutable types. To achieve this, in addition to the current custom shallow copies, it is necessary to provide modification hooks for the custom shallow copies to facilitate locating modification marks. Moreover, as you mentioned, we need to provide custom finalize interfaces during the finalization process, which naturally includes how to check and generate patches, etc. From a technical perspective, the aforementioned components are generally required. If refactored properly, such improvements would not introduce breaking changes. I propose providing a base class for custom types. This base class can offer some default implementations, such as shallow copying, modification hooks, finalize interfaces, etc., so that users only need to inherit from this base class and then implement their own types. What do you think of this approach?
I propose a Map-variant optimized for scenarios where a lot of entries exist and only a few are changed or accessed during mutations. (it already is much faster at just a hundred entries though)
However, it does not guarantee iteration order to be the same as the original map (i.e. it might not be insertion order during iteration).
Background/Details:
Mutative and especially Immer do not work well for scenarios where a lot of entries exist and only a fraction of the data is accessed/changed per mutation. Both have to shallow-copy the whole Map on each mutation, which has a significant performance impact compared to directly mutating a Map.
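To make the baseline concrete, here is a minimal sketch of a copy-on-write update on a plain Map. The helper name `setCopyOnWrite` is hypothetical and is not part of Mutative or Immer; it only illustrates why a single change costs O(N) when the whole Map is shallow-copied.

```typescript
// Hypothetical helper for illustration only; not an API of Mutative or Immer.
// It returns a new Map with one entry changed and leaves the source untouched.
function setCopyOnWrite<K, V>(source: ReadonlyMap<K, V>, key: K, value: V): Map<K, V> {
  // The Map constructor walks every entry of `source`, so this line is O(N)
  // even though only one entry is about to change.
  const copy = new Map<K, V>(source);
  copy.set(key, value);
  return copy;
}

const original = new Map<number, string>();
for (let i = 0; i < 50_000; i++) {
  original.set(i, `value-${i}`);
}

// One changed entry, but all 50k entries are copied.
const updated = setCopyOnWrite(original, 42, "changed");
```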
Compared to Immer, mutative is already much faster for such scenarios, but using MutativeMap reduces the map-drafting cost to a constant that does not depend on the number of entries. See the performance test in test/performance/mutative-set-map.ts.
This class saves mutative from having to copy the entire map when a single value is changed. Instead, it stores the original entries separately and only copies the changed data during drafting.
E.g. if there are 50k entries and only 1 is changed or was recently changed, this class will only copy the map with that 1 changed entry.
With N=total_count, M=changed_count, this changes the complexity from O(N) to O(M) for drafting.
Other operations will become slightly more expensive because many of them have to look up two Maps, but they have the same asymptotic complexity as on a regular Map.
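The idea described above can be sketched roughly as follows. This is not the PR's MutativeMap code: the class `OverlayMapSketch`, its fields and its `fork` method are hypothetical names chosen for illustration. The sketch assumes a shared, never-copied base Map plus a small overlay of changed and removed keys, so that starting a new version copies only the overlay.

```typescript
// Hypothetical sketch: OverlayMapSketch and all of its members are invented
// for illustration and are not the PR's MutativeMap implementation.
class OverlayMapSketch<K, V> {
  // Shared original entries; never copied and never mutated by this class.
  private readonly base: ReadonlyMap<K, V>;
  // Entries added or overwritten relative to `base`.
  private readonly changed: Map<K, V>;
  // Keys that exist in `base` but were deleted in this version.
  private readonly removed: Set<K>;

  constructor(
    base: ReadonlyMap<K, V>,
    changed: Map<K, V> = new Map<K, V>(),
    removed: Set<K> = new Set<K>(),
  ) {
    this.base = base;
    this.changed = changed;
    this.removed = removed;
  }

  has(key: K): boolean {
    if (this.changed.has(key)) return true;
    if (this.removed.has(key)) return false;
    return this.base.has(key);
  }

  // Reads check the overlay first, then fall back to the base (two lookups).
  get(key: K): V | undefined {
    if (this.changed.has(key)) return this.changed.get(key);
    if (this.removed.has(key)) return undefined;
    return this.base.get(key);
  }

  set(key: K, value: V): this {
    this.removed.delete(key);
    this.changed.set(key, value);
    return this;
  }

  delete(key: K): boolean {
    const existed = this.has(key);
    this.changed.delete(key);
    if (this.base.has(key)) this.removed.add(key);
    return existed;
  }

  get size(): number {
    let count = this.base.size - this.removed.size;
    for (const key of this.changed.keys()) {
      if (!this.base.has(key)) count++;
    }
    return count;
  }

  // Starts a new version: copies only the overlay, which is O(M), not O(N).
  fork(): OverlayMapSketch<K, V> {
    return new OverlayMapSketch(this.base, new Map(this.changed), new Set(this.removed));
  }
}

const base = new Map<string, number>([["a", 1], ["b", 2], ["c", 3]]);
const v1 = new OverlayMapSketch(base);
const v2 = v1.fork().set("a", 10);
v2.delete("b");
v2.set("d", 4);
```

In this sketch `v1` still sees the original entries while `v2` sees the changes, and `base` is never touched. A natural way to iterate such a structure is to walk the overlay and the base separately, which would not reproduce the original insertion order.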
Benchmarks
N = 'number of entries', M = 'number of changed/accessed items during mutation'.
The total cost within a mutation is MAP_DRAFTING + M * DRAFTING_ITEM.
The cost of MAP_DRAFTING is constant for MutativeMap and O(N) for Map.
DRAFTING_ITEM is slightly more expensive for MutativeMap but asymptotically has the same complexity.
So for scenarios with a single change, the cost is basically constant.
For scenarios with only a fraction of the data being changed/accessed, MAP_DRAFTING becomes more and more irrelevant.
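The cost model above can be written out as a small calculation. The unit costs below are hypothetical placeholders chosen only to make the arithmetic visible; they are not measurements from this PR's benchmarks, and the names `UnitCosts`, `plainMapCost` and `mutativeMapCost` are invented for this sketch.

```typescript
// Hypothetical unit costs in arbitrary units; NOT benchmark results.
interface UnitCosts {
  copyPerEntry: number;         // assumed cost to shallow-copy one Map entry
  fixedDrafting: number;        // assumed constant MAP_DRAFTING cost for MutativeMap
  draftItemMap: number;         // assumed DRAFTING_ITEM cost for a regular Map
  draftItemMutativeMap: number; // assumed (slightly higher) DRAFTING_ITEM cost for MutativeMap
}

// Total cost = MAP_DRAFTING + M * DRAFTING_ITEM, with MAP_DRAFTING in O(N) for Map.
function plainMapCost(n: number, m: number, c: UnitCosts): number {
  return n * c.copyPerEntry + m * c.draftItemMap;
}

// Same formula, but MAP_DRAFTING is a constant that does not depend on N.
function mutativeMapCost(m: number, c: UnitCosts): number {
  return c.fixedDrafting + m * c.draftItemMutativeMap;
}

const exampleCosts: UnitCosts = {
  copyPerEntry: 1,
  fixedDrafting: 5,
  draftItemMap: 10,
  draftItemMutativeMap: 15,
};

// N = 50k entries, M = 1 changed entry.
const plainSingle = plainMapCost(50_000, 1, exampleCosts); // 50_000 * 1 + 1 * 10 = 50_010
const mutativeSingle = mutativeMapCost(1, exampleCosts);   // 5 + 1 * 15 = 20

// N = 50k entries, M = 100 changed entries.
const plainHundred = plainMapCost(50_000, 100, exampleCosts); // 50_000 + 1_000 = 51_000
const mutativeHundred = mutativeMapCost(100, exampleCosts);   // 5 + 1_500 = 1_505
```

Under these assumed constants the O(N) copy term dominates the regular Map's cost whenever M is much smaller than N, which is the scenario the PR targets.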
Status
There are still a few TODOs for cleanup, refactoring and additional tests, but before I spend that effort I wanted to inquire whether there even is an interest in introducing such data-structures to optimize for such scenarios (which I suspect are the most common and most important scenarios though). Naturally, it takes away from the elegance of just being fast out-of-the-box, but I didn't want to extend Map and just transform every touched Map to MutativeMap by default. There are a few additional reasons in MutativeMap.ts against extending Map, but I'm not sure what the best approach is.
I also need to re-execute all benchmarks, and apparently I accidentally deleted benchmark.jpg.
I also added some hacks regarding the __DEV__ global, because I had issues when executing the benchmarks. It's still not perfect, as rollup gives warnings when generating types.
MutativeSet: A similar MutativeSet variant should be introduced to have the same asymptotic complexity as MutativeMap for the described scenarios, but I didn't bother to introduce it yet, as I don't need it in my own project. If there is an interest in this, I would be willing to implement it though.