Consumer delta application supports re-sharding for Object type #644

Sunjeet · 2023-10-11T15:42:44Z

This is 2 of 4 planned PRs for supporting dynamic type re-sharding:

1. PR#642 ~~utilities for splitting/joining Object types; limited to Object types in this step, extend to Map/Set/List in last step~~
2. consumer delta application reshards with O(shard size) extra space (this PR)
3. producer-side numShards toggle and signaling
4. Extend resharding to Map, List, and Set types

Benchmarking with jmh

Benchmark 1: No significant regression for consumers when resharding isn't invoked; the average is up to 6% worse [Baseline.json vs. PR.json]

Benchmark 2: Read performance during delta transition with resharding vs. without resharding: the average is again up to 6% worse [PR_withDeltaTransitionsNoResharding.json vs. PR_withDeltaTransitionsWithResharding.json]

Baseline.json
PR.json
PR_withDeltaTransitionsNoResharding.json
PR_withDeltaTransitionsWithResharding.json

dkoszewnik · 2023-10-30T22:28:25Z

...java/com/netflix/hollow/core/read/engine/object/HollowObjectDeltaHistoricalStateCreator.java

-            int shardOrdinal = ordinal >> shardOrdinalShift;
-            copyRecord(historicalDataElements, nextOrdinal, stateEngineDataElements[shard], shardOrdinal, currentWriteVarLengthDataPointers);
+            int whichShard = ordinal & shardsHolder.shardNumberMask;
+            int shardOrdinal = ordinal >> shardsHolder.shards[whichShard].shardOrdinalShift;


I see this holder.shards[x].shardOrdinalShift used in a couple of places, I don't think the shard ordinal shift can actually be different between shards -- am I correct?

If so, why isn't the shardOrdinalShift a property of the holder, rather than the shard itself?

Sunjeet · 2023-11-02

shardOrdinalShift can differ between shards while we're in the middle of resharding.

Consider the case where we're resharding from 2 shards [S0, S1] to 4 shards.

In the first step of resharding, we expand the shard array but retain the original shards as-is, to get:
[S0, S1, S0, S1]

In the next step of resharding, we process one shard at a time, so processing S0 yields:
[S00, S1, S01, S1]
Here the dataElements and shardOrdinalShift for the processed shards differ from those of the original shard, but they coexist in the shards holder with shards that haven't been processed yet (S1).

Ultimately, when we've finished resharding, we arrive at
[S00, S10, S01, S11]
and the shardOrdinalShifts are once again equal for all shards.
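The walk-through above can be sketched as a small, self-contained simulation. This is only an illustration under stated assumptions, not the code from this PR: `Shard`, `ShardsHolder`, `expand`, `splitOne`, and `twoShardExample` are hypothetical stand-ins, and each shard is modeled as a plain `int[]` indexed by shard ordinal. It checks that reads stay consistent at every intermediate step, including the state where the shards in the holder have different shifts.

```java
import java.util.Arrays;

final class ReshardingSketch {

    /** Hypothetical stand-in for a shard: records indexed by shard ordinal, plus the shard's own shift. */
    static final class Shard {
        final int[] records;
        final int shardOrdinalShift;

        Shard(int[] records, int shardOrdinalShift) {
            this.records = records;
            this.shardOrdinalShift = shardOrdinalShift;
        }
    }

    /** Hypothetical stand-in for the shards holder; the array length is assumed to be a power of 2. */
    static final class ShardsHolder {
        final Shard[] shards;
        final int shardNumberMask;

        ShardsHolder(Shard[] shards) {
            this.shards = shards;
            this.shardNumberMask = shards.length - 1;
        }

        /** The mask picks the slot, then that slot's own shift picks the record within the shard. */
        int read(int ordinal) {
            Shard shard = shards[ordinal & shardNumberMask];
            return shard.records[ordinal >> shard.shardOrdinalShift];
        }
    }

    /** Two shards holding ordinals 0..7; the record for ordinal o is 100 + o. */
    static ShardsHolder twoShardExample() {
        Shard s0 = new Shard(new int[] {100, 102, 104, 106}, 1);
        Shard s1 = new Shard(new int[] {101, 103, 105, 107}, 1);
        return new ShardsHolder(new Shard[] {s0, s1});
    }

    /** Step 1: double the array and point the new slots at the existing shards: [S0, S1, S0, S1]. */
    static ShardsHolder expand(ShardsHolder holder) {
        int n = holder.shards.length;
        Shard[] expanded = Arrays.copyOf(holder.shards, 2 * n);
        System.arraycopy(holder.shards, 0, expanded, n, n);
        return new ShardsHolder(expanded);
    }

    /** Step 2: split the shard in slot k (k < original shard count) into two shards with shift + 1. */
    static ShardsHolder splitOne(ShardsHolder holder, int k) {
        int originalCount = holder.shards.length / 2;
        Shard old = holder.shards[k];
        int n = old.records.length;
        int[] even = new int[(n + 1) / 2];
        int[] odd = new int[n / 2];
        for (int i = 0; i < n; i++) {
            // The low bit of the old shard ordinal decides which half receives the record.
            if ((i & 1) == 0) {
                even[i >> 1] = old.records[i];
            } else {
                odd[i >> 1] = old.records[i];
            }
        }
        Shard[] next = holder.shards.clone();
        next[k] = new Shard(even, old.shardOrdinalShift + 1);
        next[k + originalCount] = new Shard(odd, old.shardOrdinalShift + 1);
        return new ShardsHolder(next);
    }

    static void checkAllReads(ShardsHolder holder) {
        for (int ordinal = 0; ordinal < 8; ordinal++) {
            if (holder.read(ordinal) != 100 + ordinal) {
                throw new AssertionError("inconsistent read for ordinal " + ordinal);
            }
        }
    }

    public static void main(String[] args) {
        ShardsHolder holder = twoShardExample();   // [S0, S1]
        checkAllReads(holder);
        holder = expand(holder);                   // [S0, S1, S0, S1]
        checkAllReads(holder);
        holder = splitOne(holder, 0);              // [S00, S1, S01, S1], shifts 2, 1, 2, 1
        checkAllReads(holder);
        holder = splitOne(holder, 1);              // [S00, S10, S01, S11], all shifts 2
        checkAllReads(holder);
        System.out.println("reads stayed consistent at every step");
    }
}
```

In the mixed state, ordinal 6 resolves to slot 2 (S01, shift 2, shard ordinal 1) while ordinal 5 resolves to slot 1 (S1, shift 1, shard ordinal 2), which is why the shift has to be read from the shard rather than from the holder.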

dkoszewnik · 2023-10-30T22:32:57Z

...java/com/netflix/hollow/core/read/engine/object/HollowObjectDeltaHistoricalStateCreator.java

+                historicalDataElements.bitsPerField[i] = Arrays.stream(shardsHolder.shards)
+                        .map(shard -> shard.dataElements.bitsPerField[fieldIdx])
+                        .max(Integer::compare).get();


dkoszewnik · 2023-10-30T22:48:33Z

hollow/src/main/java/com/netflix/hollow/core/read/engine/object/HollowObjectTypeReadState.java

+        final HollowObjectTypeReadStateShard[] shards = this.shardsVolatile.shards;
+        return Arrays.stream(shards)
+                .map(shard -> shard.dataElements)
+                .toArray(HollowObjectTypeDataElements[]::new);


The stream approach feels slightly harder to read, that might be my own bias.

Sunjeet · 2023-11-02

Sure, reworked it.
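The reworked code is not shown in this thread. As a rough illustration of the readability trade-off being discussed, the sketch below puts the stream form next to an equivalent indexed loop. `Shard` and `DataElements` here are hypothetical stand-ins for the real read-state classes, and the loop is only one plausible shape for the rework.

```java
import java.util.Arrays;

final class DataElementsAccessSketch {

    /** Hypothetical stand-in for the per-shard data elements. */
    static final class DataElements {
        final String label;

        DataElements(String label) {
            this.label = label;
        }
    }

    /** Hypothetical stand-in for a read-state shard. */
    static final class Shard {
        final DataElements dataElements;

        Shard(DataElements dataElements) {
            this.dataElements = dataElements;
        }
    }

    /** Stream form, mirroring the diff under review. */
    static DataElements[] viaStream(Shard[] shards) {
        return Arrays.stream(shards)
                .map(shard -> shard.dataElements)
                .toArray(DataElements[]::new);
    }

    /** Loop form: same result, no stream pipeline or lambda on the call path. */
    static DataElements[] viaLoop(Shard[] shards) {
        DataElements[] result = new DataElements[shards.length];
        for (int i = 0; i < shards.length; i++) {
            result[i] = shards[i].dataElements;
        }
        return result;
    }

    public static void main(String[] args) {
        Shard[] shards = {
            new Shard(new DataElements("S0")),
            new Shard(new DataElements("S1"))
        };
        // Both forms return the same references in the same order.
        System.out.println(Arrays.equals(viaStream(shards), viaLoop(shards)));
    }
}
```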

dkoszewnik · 2023-10-30T22:51:21Z

...src/main/java/com/netflix/hollow/core/read/engine/object/HollowObjectTypeReadStateShard.java

-        default:
-            return fixedLengthValue == currentData.nullValueForField[fieldIndex];
-        }
+    public long isNull(int ordinal, int fieldIndex) {


Feels like this method has the wrong name now -- it is now just reading the actual field data whether or not it is beyond the max bits for a single (possibly unaligned) read.

Sunjeet · 2023-11-02

Good point, renamed to readValue.
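To illustrate what "beyond the max bits for a single (possibly unaligned) read" can mean, here is a generic sketch of reading a bit-packed field from a byte buffer. It is not Hollow's implementation: the class, the `ByteBuffer` backing, the little-endian layout, and the 57-bit limit are all assumptions of this sketch. The limit follows from an 8-byte read losing up to 7 bits to the offset within the first byte.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class BitPackedReadSketch {

    // One 8-byte read can lose up to 7 bits to the intra-byte offset, so it is only
    // guaranteed to cover fields of at most 57 bits (this sketch's limit, an assumption).
    static final int MAX_BITS_SINGLE_READ = 57;

    /** Allocates a zeroed buffer with 8 bytes of padding so reads near the end stay in bounds. */
    static ByteBuffer allocate(int payloadBytes) {
        return ByteBuffer.allocate(payloadBytes + 8).order(ByteOrder.LITTLE_ENDIAN);
    }

    /** Writes the low `bits` bits of value at bitOffset, one bit at a time; assumes the target bits are zero. */
    static void writeValue(ByteBuffer data, long bitOffset, int bits, long value) {
        for (int i = 0; i < bits; i++) {
            if (((value >>> i) & 1L) != 0) {
                long pos = bitOffset + i;
                int byteIndex = (int) (pos >>> 3);
                int bit = (int) (pos & 7);
                data.put(byteIndex, (byte) (data.get(byteIndex) | (1 << bit)));
            }
        }
    }

    /** Reads a field of 1..64 bits starting at bitOffset, whether or not it fits in a single 8-byte read. */
    static long readValue(ByteBuffer data, long bitOffset, int bits) {
        int byteIndex = (int) (bitOffset >>> 3);
        int shift = (int) (bitOffset & 7);
        long mask = bits == 64 ? -1L : (1L << bits) - 1;
        long value = data.getLong(byteIndex) >>> shift;
        if (bits > MAX_BITS_SINGLE_READ && shift != 0) {
            // Wide field: the remaining high bits live in the ninth byte.
            value |= (data.get(byteIndex + 8) & 0xFFL) << (64 - shift);
        }
        return value & mask;
    }

    public static void main(String[] args) {
        ByteBuffer data = allocate(10);
        long wide = 0x0FEDCBA987654321L;   // fits in 60 bits
        writeValue(data, 0, 7, 0x55L);     // 7-bit field
        writeValue(data, 7, 60, wide);     // 60-bit field starting at intra-byte offset 7
        writeValue(data, 67, 13, 0x1ABCL); // 13-bit field
        System.out.println(readValue(data, 7, 60) == wide);
    }
}
```

A caller sees one method regardless of field width, which matches the reviewer's point that the method reads the value rather than answering a null check.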

Sunjeet requested review from dkoszewnik, workeatsleep and nayanika-u October 11, 2023 15:42

dkoszewnik reviewed Oct 30, 2023

dkoszewnik approved these changes Oct 30, 2023

Sunjeet added 6 commits November 2, 2023 11:30

Object type read state supports resharding

64d3c6a

Refactor object read state to apply consistency check on shards holder instead of data elements

1764e05

Save the extra object allocation in read path

808a194

Reuse shards when possible for optimizing the worst case read time when resharding

43ab943

Microbenchmarking per-read during delta update

492d6e1

Cleanup

f4a81d3

Sunjeet force-pushed the pr-reshard branch from bcdb800 to f4a81d3 Compare November 2, 2023 18:30

Sunjeet merged commit ec5dcb4 into master Nov 2, 2023
2 checks passed

Sunjeet mentioned this pull request Nov 18, 2023

Resharding: producer can toggle numShards for Object types in the course of a delta chain #651

Merged

4 tasks
