Several caching improvements: #17930
mystenmark
commented
May 24, 2024
- Populate the cache with the latest object version when possible.
- Add negative caching for getting object by id.
- Use the latest-version cache to satisfy dynamic field reads.
LGTM with one comment about whether we are still able to read Deleted or Wrapped objects.
if latest_object.is_live() {
    return CacheResult::Hit((*latest_version, latest_object.clone()));
} else {
    assert_eq!(*latest_version, SequenceNumber::from_u64(0));
Is this assertion true? Do we not store deleted or wrapped versions?
Also, I feel there is a semantic change here. For a deleted object, it previously returned ObjectEntry::Deleted, but now it returns NegativeHit. Is this intended?
I thought NegativeHit meant that no such object version exists, including deleted/wrapped versions.
cleaned this up
// !is_fresh means we lost the race, and actual_entry holds the entry that was
// inserted by the other thread. We need to check if we have a more recent version
// than the other reader.
// This could also mean that the entry was inserted by a transaction write. This
// could occur in the following case:
I think this comment is a bit confusing. I thought is_fresh meant whether the entry is the latest entry in the cache. But after reading the doc, I realize that it basically indicates whether there was already an entry in the cache, which does not necessarily indicate a race.
I think the whole block basically does a test-and-set, doesn't it? If so, updating the doc to something like "updating the object to latest version" may be clearer.
is_fresh means that our init callback ran, so !is_fresh means that we need to check whether we have a more recent version than what is in the cache. I will try to clarify the comment.
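To illustrate the test-and-set reading discussed above, here is a minimal single-threaded sketch. It is not the code from this PR: `Versioned`, `insert_if_newer`, the `u32` ids, and the use of a plain `HashMap` are all assumptions made for illustration, and the real cache is concurrent. The return value plays the role of is_fresh: `true` means our value was inserted into an empty slot, `false` means an entry already existed and we only replaced it if ours was newer.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical stand-in for a cached (version, payload) pair.
#[derive(Debug, Clone, PartialEq)]
struct Versioned {
    version: u64,
    payload: String,
}

/// Insert `candidate` unless the cache already holds an equal or newer version.
/// Returns `true` if the slot was empty and our value was inserted (the
/// "is_fresh" case), `false` if an entry already existed.
fn insert_if_newer(cache: &mut HashMap<u32, Versioned>, id: u32, candidate: Versioned) -> bool {
    match cache.entry(id) {
        Entry::Vacant(slot) => {
            slot.insert(candidate);
            true
        }
        Entry::Occupied(mut slot) => {
            // Someone else populated the entry first; keep whichever is newer.
            if slot.get().version < candidate.version {
                slot.insert(candidate);
            }
            false
        }
    }
}
```

With this shape, a stale reader that arrives after a newer write cannot overwrite the newer version, regardless of whether the existing entry came from another reader or from a transaction write.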
//
// Thread 1 will see that v2 is already in the cache

// this point because there should have been a cache hit.)
Also, I'm not sure what this sentence means.
sorry, just an unfinished comment
@@ -1060,9 +1153,7 @@ impl ObjectCacheRead for WritebackCache {
    ObjectEntry::Deleted => (object_id, version, ObjectDigest::OBJECT_DIGEST_DELETED),
Similar to the above comment, can we still get ObjectEntry::Deleted or ObjectEntry::Wrapped here? It feels like it is going to return NegativeHit.
Cleaned this up by having a different enum type for the latest-object cache
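A rough sketch of what a separate enum for the latest-object cache could look like, so that "deleted at version N" and "no such object" stay distinguishable. This is an assumption-laden illustration, not the code merged in this PR: `LatestEntry`, `lookup_latest`, the `Miss` variant, and the `String` payload are hypothetical, and `u64` stands in for SequenceNumber.

```rust
/// Simplified stand-in for the per-version entry discussed in the review.
#[derive(Debug, Clone, PartialEq)]
enum ObjectEntry {
    Object(String), // placeholder for the real object payload
    Deleted,
    Wrapped,
}

/// Hypothetical entry type for the latest-object cache: non-existence gets its
/// own variant instead of being encoded as a sentinel version.
#[derive(Debug, Clone, PartialEq)]
enum LatestEntry {
    Object(u64, ObjectEntry),
    NonExistent,
}

/// Outcome of a cache lookup; `Miss` is an assumed third variant meaning
/// "nothing cached, fall back to the store".
#[derive(Debug, PartialEq)]
enum CacheResult<T> {
    Hit(T),
    NegativeHit,
    Miss,
}

/// Map a cached latest entry (if any) to a lookup result.
fn lookup_latest(cached: Option<&LatestEntry>) -> CacheResult<(u64, ObjectEntry)> {
    match cached {
        // Deleted and Wrapped tombstones are still hits at their own version.
        Some(LatestEntry::Object(version, entry)) => CacheResult::Hit((*version, entry.clone())),
        Some(LatestEntry::NonExistent) => CacheResult::NegativeHit,
        None => CacheResult::Miss,
    }
}
```

Under this shape, callers that need ObjectEntry::Deleted or ObjectEntry::Wrapped still receive them as hits, and NegativeHit is reserved for ids known not to exist.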
if let Some(entry) = self.cached.object_by_id_cache.get(object_id) {
    let entry = entry.lock();
    let (latest_version, latest_object) = &*entry;
    if latest_object.is_live() {
OK, reading the PR again, I think I understand what is_live means: it basically tries to accommodate cache_object_not_found. I think the criterion should be latest_version == 0 && entry == ObjectEntry::Deleted, right?
Or can we create an ObjectEntry::NotExist?
913f3f9 to 8927d9e
LGTM! Thanks for adding all the detailed comments!
I wonder if it is also worth creating some kind of cache invariant checker that runs periodically in simtest to catch issues. One invariant could be that the objects in dirty/cache/db match those in object_by_id_cache.
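As a rough idea of what such a checker might compare, here is a simplified sketch. Everything in it is hypothetical: `find_mismatches`, the `u32` ids, and the `String` payloads are illustration only, the per-version map stands in for the combined dirty/cache/db view, and negative entries are not modeled.

```rust
use std::collections::{BTreeMap, HashMap};

/// Return the ids whose latest-version cache entry disagrees with the highest
/// version recorded in the per-version store, sorted for stable reporting.
fn find_mismatches(
    versions: &HashMap<u32, BTreeMap<u64, String>>,
    latest_cache: &HashMap<u32, (u64, String)>,
) -> Vec<u32> {
    let mut bad = Vec::new();
    for (id, (cached_version, cached_value)) in latest_cache {
        // BTreeMap iterates in key order, so the last entry is the newest version.
        let newest = versions.get(id).and_then(|m| m.iter().next_back());
        let consistent = match newest {
            Some((version, value)) => version == cached_version && value == cached_value,
            // Cached as latest but absent from the store: treat as a violation.
            None => false,
        };
        if !consistent {
            bad.push(*id);
        }
    }
    bad.sort_unstable();
    bad
}
```

A periodic simtest task could call something like this and fail loudly when the returned list is non-empty.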
}

fn cache_object_not_found(&self, object_id: &ObjectID) {
    // when caching non-existence, we use version 0. Since that is less than OBJECT_START_VERSION
Remove outdated comments?
92a0426 to 71fdafc
adding a debug check
3908cb0 to dc0ab9d
LGTM!