Pre-hash rows and add SmallHashSets #318

Open
wants to merge 7 commits into
base: master

kazimuth
Contributor

@kazimuth kazimuth commented May 16, 2025

Description of Changes

Annotate rows with their hashes. This is halfway to the old solution of annotating rows with their serialized forms; it speeds up insertion into BTreeIndexes while saving a byte array allocation per row.
(Wait, why do we hash when inserting into BTreeIndexes, you ask? It's because we use a HashSet to store the multiple rows corresponding to a non-unique key.)
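To make the idea concrete, here is a minimal sketch in Python rather than the SDK's own language. It assumes a hypothetical `PreHashedRow` wrapper that hashes a row once at construction and hands back the cached value afterwards. `CountingRow` and the two index dictionaries are illustration-only names, not SDK types, and a plain dict of sets stands in for a BTreeIndex.

```python
from collections import defaultdict


class PreHashedRow:
    """Hypothetical wrapper: a row plus its hash, computed once up front."""

    __slots__ = ("row", "_hash")

    def __init__(self, row):
        self.row = row
        self._hash = hash(row)  # the only place the row itself is hashed

    def __hash__(self):
        # Every later set/dict operation reuses the cached value.
        return self._hash

    def __eq__(self, other):
        if not isinstance(other, PreHashedRow):
            return NotImplemented
        # Cheap hash comparison first, full row comparison only if needed.
        return self._hash == other._hash and self.row == other.row


class CountingRow(tuple):
    """Illustration helper: a tuple that counts how often it is hashed."""

    hash_calls = 0

    def __hash__(self):
        CountingRow.hash_calls += 1
        return super().__hash__()


# Stand-ins for non-unique indexes: key -> set of rows sharing that key.
index_by_owner = defaultdict(set)
index_by_name = defaultdict(set)

row = PreHashedRow(CountingRow((1, "alice", 42)))
index_by_owner[row.row[2]].add(row)  # set insertion uses the cached hash
index_by_name[row.row[1]].add(row)   # and so does this one
```

With the wrapper, inserting the same row into several indexes hashes the underlying row only once, at construction time, which in this sketch is where the work could be moved off the main thread.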

This saves the main thread the work of repeatedly hashing values. Once we parallelize message processing, this work will be spread out over multiple threads too.

Also adds a data type SmallHashSet<T> for use in BTreeIndexes. This is a struct that can hold up to one element without allocating. This gives dramatic performance improvements on initial connection -- it seems that most rows in BTreeIndexes don't share their key with any other row, so skipping all allocations in that case pays off.
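The shape of such a set can be sketched as follows. This is an assumption-laden illustration in Python, not the PR's implementation: the method names and the `uses_backing_set` helper are invented here. Python allocates every object on the heap, so the sketch only shows the control flow (keep a single element inline, create the real set when a second distinct element arrives), not the allocation savings a struct would give.

```python
class SmallHashSet:
    """Sketch: holds 0 or 1 items inline, switches to a real set at 2+."""

    _EMPTY = object()  # sentinel, so that None can still be stored as a value

    def __init__(self):
        self._single = SmallHashSet._EMPTY
        self._many = None  # backing set, created lazily

    def add(self, item):
        if self._many is not None:
            self._many.add(item)
        elif self._single is SmallHashSet._EMPTY:
            self._single = item  # common case: no backing set needed
        elif self._single != item:
            # Second distinct element: promote to a real set.
            self._many = {self._single, item}
            self._single = SmallHashSet._EMPTY

    def discard(self, item):
        if self._many is not None:
            self._many.discard(item)
        elif self._single is not SmallHashSet._EMPTY and self._single == item:
            self._single = SmallHashSet._EMPTY

    def __contains__(self, item):
        if self._many is not None:
            return item in self._many
        return self._single is not SmallHashSet._EMPTY and self._single == item

    def __len__(self):
        if self._many is not None:
            return len(self._many)
        return 0 if self._single is SmallHashSet._EMPTY else 1

    def uses_backing_set(self):
        """Illustration helper: True once the real set has been created."""
        return self._many is not None
```

For simplicity this sketch never demotes back to the inline form after removals; whether the real type does so is not stated in the description.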

API

  • This is an API breaking change to the SDK

Requires SpacetimeDB PRs

Testsuite

SpacetimeDB branch name: master

Testing

I will test:

  • blackholio
  • bitcraft

@kazimuth kazimuth changed the title Pre-hash rows on the message pre-processing thread Pre-hash rows and add SmallHashSets May 16, 2025
@kazimuth
Contributor Author

This does mildly pessimize calls to Filter -- it adds a second allocation to them.
SmallHashSet is so good I'm tempted to just use it and get rid of PreHashedRow, but I'm not sure.
