-
Notifications
You must be signed in to change notification settings - Fork 216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consumer delta application supports re-sharding for Object type #644
Conversation
int shardOrdinal = ordinal >> shardOrdinalShift; | ||
copyRecord(historicalDataElements, nextOrdinal, stateEngineDataElements[shard], shardOrdinal, currentWriteVarLengthDataPointers); | ||
int whichShard = ordinal & shardsHolder.shardNumberMask; | ||
int shardOrdinal = ordinal >> shardsHolder.shards[whichShard].shardOrdinalShift; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this holder.shards[x].shardOrdinalShift
used in a couple of places, I don't think the shard ordinal shift can actually be different between shards -- am I correct?
If so, why isn't the shardOrdinalShift
a property of the holder, rather than the shard itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shardOrdinalShift
can be different for different shards when we're in the middle of resharding.
Consider the case where we're resharding from 2 shards [S0, S1]
to 4 shards .
In the first step of resharding, we will expand the shard sharray but retain the original shards as is to get-
[S0, S1, S0, S1]
In the next step of resharding, we process one shard at a time, so processing S0 yields-
[S00, S1, S01, S1]
Here the dataElements
and shardOrdinalShift
for the processed shard are different than the original shard but they co exist in the shard holder with shards that havent been processed yet (S1).
Ultimately when we've finished resharding we arrive at
[S00, S10, S01, S11]
and the shardOrdinalShift
s are once again equal for all shards.
historicalDataElements.bitsPerField[i] = Arrays.stream(shardsHolder.shards) | ||
.map(shard -> shard.dataElements.bitsPerField[fieldIdx]) | ||
.max(Integer::compare).get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch
final HollowObjectTypeReadStateShard[] shards = this.shardsVolatile.shards; | ||
return Arrays.stream(shards) | ||
.map(shard -> shard.dataElements) | ||
.toArray(HollowObjectTypeDataElements[]::new); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The stream approach feels slightly harder to read, that might be my own bias.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, reworked it
default: | ||
return fixedLengthValue == currentData.nullValueForField[fieldIndex]; | ||
} | ||
public long isNull(int ordinal, int fieldIndex) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels like this method has the wrong name now -- it is now just reading the actual field data whether or not it is beyond the max bits for a single (possibly unaligned) read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, renamed to readValue
…r instead of data elements
This is 2 of 4 planned PRs for supporting dynamic type re-sharding:
utilities for splitting/joining Object types; limited to Object types in this step, extend to Map/Set/List in last stepBenchmarking with jmh
Benchmark 1: No significant regression for consumers when resharding isn't invoked, avg is upto 6% worse [
Baseline.json
vs.PR.json
]Benchmark 2: Read performance during delta transition with resharding vs. without resharding: avg is upto 6% worse again [
PR_withDeltaTransitionsNoResharding.json
vs.PR_withDeltaTransitionsWithResharding.json
]Baseline.json
PR.json
PR_withDeltaTransitionsNoResharding.json
PR_withDeltaTransitionsWithResharding.json