-
Notifications
You must be signed in to change notification settings - Fork 627
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: exclude deleted values from proof #12390
Conversation
92a7ec4
to
07267e6
Compare
@@ -1675,15 +1660,6 @@ impl Trie { | |||
for (node_hash, serialized_node) in trie_accesses.nodes { | |||
recorder.borrow_mut().record(&node_hash, serialized_node); | |||
} | |||
for (value_hash, value) in trie_accesses.values { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess by exclude deleted values
you mean the previous values being replaced with new values at the corresponding keys? is the reason this is safe because the validator will run the code that will provide the new value anyways?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, these are all deleted values. Let me explain that:
If you are validator, you trust to the root first, because it was signed by validators on the previous step. And then you trust its children, because they hash into the root and we have uniqueness, then we trust their children, etc.
And when we reach the ValueRef, we don't actually need the whole value. We need ValueRef::length
to recompute memory usage, which we trust already, and after that we just set value to None and squash necessary nodes on the way.
Intermediate nodes are still required to prove that value exists, though.
Side note - inserted values are not part of the proof, because they are result of previous state proof + chunk execution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it, thanks for explaining. lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks ok, but need other's reviews to see if I miss anything. Also thanks for the nice refactoring in the testloop utils.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #12390 +/- ##
==========================================
+ Coverage 71.60% 71.62% +0.01%
==========================================
Files 842 842
Lines 170717 170804 +87
Branches 170717 170804 +87
==========================================
+ Hits 122243 122338 +95
+ Misses 43101 43084 -17
- Partials 5373 5382 +9
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
for (value_hash, value) in trie_accesses.values { | ||
let value = match value { | ||
FlatStateValue::Ref(_) => { | ||
self.storage.retrieve_raw_bytes(&value_hash)? | ||
} | ||
FlatStateValue::Inlined(value) => value.into(), | ||
}; | ||
recorder.borrow_mut().record(&value_hash, value); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that we don't need the values for removed nodes. However here it seems like you don't record any values at all, including the ones that were just read. Are those stored elsewhere? I do see that the tests are still passing so I think believe the change is good, I'm just trying to understand it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can say that trie nodes are recorded in two phases - 1) during trie lookups on chunk processing, 2) during trie updates. This is the 2nd phase.
1st happens mainly in lookup_from_memory
for memtries or internal_retrieve_trie_node
for trie storage.
From other perspective, if value was just read but not modified, it wouldn't appear here, so this part is incomplete anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice new utilities!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
for (value_hash, value) in trie_accesses.values { | ||
let value = match value { | ||
FlatStateValue::Ref(_) => { | ||
self.storage.retrieve_raw_bytes(&value_hash)? | ||
} | ||
FlatStateValue::Inlined(value) => value.into(), | ||
}; | ||
recorder.borrow_mut().record(&value_hash, value); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, thanks!
TLDR: the change is very small, majority is the test.
Before recording resharding proof, I want to slightly improve
MemTrieUpdate::to_trie_changes
API. It returnsTrieAccesses
which includesvalues
which were read during processing state changes. But in fact, we don't need the removed values at all, to generate proof. Because we have hash uniqueness, it's enough to find that node has a value with arbitrary hash.Currently it adds weird dependency on
DBCol::State
, as memtries don't have long values, so I completely remove values from the proof.If we don't consider migration, then for correctness it is enough to maintain the same behaviour on memtrie, disk and partial storage processing. I achieve that by removing value accesses simultaneously, with removing
values.insert
andinternal_retrieve_trie_node
.Migration could be annoying - it's possible that chunk producers may have this adopted and chunk validators may not, then CV would receive insufficient proof. But I checked all
TrieUpdate::remove
calls, and, surprisingly, each remove is prepended with getting value. ForVMLogic
' storage remove API requires to get previous value, for account removal we iterate over all keys and values because that's how iterator works. So with current runtime this change is even no-op.I still find it worthy to write a test triggering
test_create_delete_account
because we don't have any integration test exercising this path. I tried to write it in intuitive way and manually checked that if runtime removes value without reading, and my change is inconsistent, then test fails.