Pre-hash rows and add SmallHashSets #318

kazimuth · 2025-05-16T17:05:31Z

Description of Changes

Annotate rows with their hashes. This is halfway to the old solution of annotating rows with their serialized forms: it speeds up insertion into BTreeIndexes, while saving one byte array allocation per row compared to storing the serialized form.
(Wait, why do we hash when inserting into BTreeIndexes, you ask? It's because we use a HashSet to store the multiple rows corresponding to a non-unique key.)

This saves the main thread from repeatedly hashing the same values. Once we parallelize message processing, the hashing work will also be spread out over multiple threads.
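To make the idea concrete, here is a minimal sketch in Rust (not the SDK's actual code) of a row wrapper that computes its hash once and a non-unique index that stores a HashSet of rows per key. The names `PreHashedRow`, `NonUniqueIndex`, `cached_hash`, and `filter` are hypothetical, chosen for illustration; only the general shape (hash once, reuse on every set insertion) comes from the description above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeMap, HashSet};
use std::hash::{Hash, Hasher};

/// Hypothetical wrapper: the row's hash is computed once, at construction.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PreHashedRow<R> {
    row: R,
    hash: u64,
}

impl<R: Hash> PreHashedRow<R> {
    /// Hash the full row exactly once (e.g. on a pre-processing thread).
    pub fn new(row: R) -> Self {
        let mut hasher = DefaultHasher::new();
        row.hash(&mut hasher);
        let hash = hasher.finish();
        PreHashedRow { row, hash }
    }
}

impl<R> PreHashedRow<R> {
    pub fn row(&self) -> &R {
        &self.row
    }

    pub fn cached_hash(&self) -> u64 {
        self.hash
    }
}

impl<R> Hash for PreHashedRow<R> {
    /// Feed only the cached 64-bit value to the hasher instead of
    /// walking every field of the row again.
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write_u64(self.hash);
    }
}

/// Hypothetical non-unique index: each key maps to the set of rows
/// that share it, which is why insertion needs a row hash at all.
pub struct NonUniqueIndex<K, R> {
    map: BTreeMap<K, HashSet<PreHashedRow<R>>>,
}

impl<K: Ord, R: Eq> NonUniqueIndex<K, R> {
    pub fn new() -> Self {
        NonUniqueIndex { map: BTreeMap::new() }
    }

    /// Returns true if the row was not already present under `key`.
    pub fn insert(&mut self, key: K, row: PreHashedRow<R>) -> bool {
        self.map.entry(key).or_default().insert(row)
    }

    /// Iterate over all rows stored under `key` (empty if the key is absent).
    pub fn filter<'a>(&'a self, key: &K) -> impl Iterator<Item = &'a R> {
        self.map.get(key).into_iter().flatten().map(|r| r.row())
    }
}
```

In this sketch the set's own hasher still runs over the cached `u64`, which is cheap compared to hashing a whole row; a pass-through hasher could remove even that step, but that is a further assumption beyond what this PR describes.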

Also adds a data type SmallHashSet<T> for use in BTreeIndexes. This is a struct that can store at most one element without allocating; a backing hash set is only allocated once a second element is added. This gives dramatic performance improvements on initial connection: it seems that most rows in BTreeIndexes don't share their key with any other row, so skipping all set allocations in that case gives very good performance.
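As an illustration of the single-element optimization, here is a minimal Rust sketch of a set that holds zero or one element inline and only builds a real hash set when a second distinct element arrives. It is an assumption about the general technique, not the SDK's SmallHashSet<T> implementation; the variant names and the `is_heap_allocated` helper are hypothetical.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical small-set type: no heap allocation for 0 or 1 elements.
#[derive(Debug, Clone)]
pub enum SmallHashSet<T> {
    Empty,
    One(T),
    Many(HashSet<T>),
}

impl<T: Eq + Hash> SmallHashSet<T> {
    pub fn new() -> Self {
        SmallHashSet::Empty
    }

    /// Insert a value; returns true if it was not already present.
    pub fn insert(&mut self, value: T) -> bool {
        // Temporarily take ownership of the current state so the single
        // inline element can be moved into a HashSet if needed.
        match std::mem::replace(self, SmallHashSet::Empty) {
            SmallHashSet::Empty => {
                *self = SmallHashSet::One(value);
                true
            }
            SmallHashSet::One(existing) => {
                if existing == value {
                    // Duplicate: stay in the inline representation.
                    *self = SmallHashSet::One(existing);
                    false
                } else {
                    // Second distinct element: allocate the real set now.
                    let mut set = HashSet::with_capacity(2);
                    set.insert(existing);
                    set.insert(value);
                    *self = SmallHashSet::Many(set);
                    true
                }
            }
            SmallHashSet::Many(mut set) => {
                let added = set.insert(value);
                *self = SmallHashSet::Many(set);
                added
            }
        }
    }

    pub fn contains(&self, value: &T) -> bool {
        match self {
            SmallHashSet::Empty => false,
            SmallHashSet::One(existing) => existing == value,
            SmallHashSet::Many(set) => set.contains(value),
        }
    }

    pub fn len(&self) -> usize {
        match self {
            SmallHashSet::Empty => 0,
            SmallHashSet::One(_) => 1,
            SmallHashSet::Many(set) => set.len(),
        }
    }

    /// True once the set has spilled into a heap-backed HashSet.
    pub fn is_heap_allocated(&self) -> bool {
        matches!(self, SmallHashSet::Many(_))
    }
}
```

With this shape, an index in which nearly every key maps to exactly one row never builds a hash set for those keys, and in the single-element case a membership check is a plain equality comparison with no hashing.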

API

This is an API-breaking change to the SDK.

Requires SpacetimeDB PRs

Testsuite

SpacetimeDB branch name: master

Testing

I will test:

blackholio
bitcraft

kazimuth · 2025-05-16T19:17:19Z

This does mildly pessimize calls to Filter -- it adds a second allocation to them.
SmallHashSet is so good I'm tempted to just use it and get rid of PreHashedRow, not sure.

Pre-hash rows on the message pre-processing thread

ce72510

kazimuth requested review from rekhoff and joshua-spacetime May 16, 2025 17:05

kazimuth added 4 commits May 16, 2025 13:52

Implement raison d'etre of this PR, not re-hashing when inserting into btree indexes

40796c6

Less allocations?

e9c4b4b

Allocate fewer hash sets

39f53af

Add test for SmallHashSet

b32c9ea

kazimuth changed the title ~~Pre-hash rows on the message pre-processing thread~~ Pre-hash rows and add SmallHashSets May 16, 2025

Save an allocation when calling Filter

1bad0b6

Fix regression tests & make them more aggressive

2410e12
