-
-
Notifications
You must be signed in to change notification settings - Fork 638
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Experiment with Open Addressing HashMap and base it on Vector #1902
Comments
Fusing similar code and speeding up things + making them more memory efficient is alway a good idea. I will create a feature branch that we can work on. The experimental HashMap could use the best of both worlds, BMT and HAMT. I.e. using the hashing stuff of HAMT and building a structure using BMT. But caution: open-addressed impls may perform very very bad: http://stackoverflow.com/a/2557451/1110815
|
The mentioned article addresses this with a logarithmic worst-case probing |
Yes, but if HAMT is moderately slower (absolutely) for the good case and more stable in direction to the worst case I would prefer HAMT. We have to check that. Scala uses HAMT for HashMap. This might have a reason. |
@paplorinc I've created the open-addressing-hash-map branch. |
Hi guys. |
(hope I can) 😄 |
a few notes:
I will experiment with these, please share your findings also :) |
I would not change the BMT for the purpose of an Open Addressing (OA) HashMap. The past development showed that it is best to create specialized backing structures for each purpose. On the other hand we do not want to have another backing collection, we want to reduce them. Changing BMT in order to make it work for Vector and HashMap will make it too complicated (-> separation of concerns). In the end no one would know which line of code is necessary for which other collection...
Maybe this will make it too heavy-weight. We should try to stay on the lowest level possible for our backing collections: primitive types, bit-operations, ...
Yes, definitely. The quality of the hashing algorithm is the key to an outstanding hashed collection. I will try to get more involved after solving the issues I'm currently on - too many parallel tasks 😳 |
Closed because this issue is stale. |
Since
Hash Array Mapped Trie
andBit Mapped Trie
are very similar conceptually (not even sure what the differences are), we could eliminate duplicate code if we would base theMap
s onVector
, instead of recreating the array-emulating logic (where applicable, i.e. not forTreeMap
).(This is true even if we decide not to switch to Open Addressing.)
In addition to that, we could experiment with Open Addressing also - the current
HashMap
s use linking for collisions, right?Pro:
Contra:
null
values (easily), might be problematic with primitive storage (aBitSet
could help here, I think)null
, might lead to memory leaks (I'm sure though that others have thought about this also)The following resources could help us in designing it:
@ruslansennov has the expertise in designing the
Map
s, I know theVector
and @danieldietrich could mediate and contribute with his overall expertise. It could be a team-effort.We could either create an
experimental
branch/package (or both), where we could collaborate on trying to achieve the fastest persistent map for theJVM
- with automatic primitive support!Related to: #1486
The text was updated successfully, but these errors were encountered: