-
-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make the cache configurable by cpp API #950
base: main
Are you sure you want to change the base?
Conversation
433e1b4
to
6c5f25e
Compare
CI seems to be broken for other reasons than this PR. |
include/zim/archive.h
Outdated
/** Set the size of the dirent lookup cache. | ||
* | ||
* Contrarly to other `set_<foo>_cache_max_size`, this method is useless once | ||
* the lookup cache is created. | ||
* The lookup cache is created at first access to a entry in the archive. | ||
* So this method must be called before any access to content (including metadata). | ||
* It is best to call this method first, just after the archive creation. | ||
*/ | ||
void set_dirent_lookup_cache_max_size(size_t new_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it better to through an exception if this method is called when it's too late?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. Another option would be to add a new argument to Archive constructor and remove this function.
6c5f25e
to
2c34dbf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's have #949 merged first. I will review the rebased&fixed-up version of this PR again after conflict resolution.
2c34dbf
to
e6fbbbf
Compare
PR is now rebased&fixed-up and conflict are resolved. |
fe9ee9d
to
4a1881a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All my feedback was restricted to the test code, which means that generally the PR is good to merge.
4a1881a
to
117868f
Compare
Last push updates with your comments. Only |
if (REF_IT->isRedirect()) { \ | ||
ASSERT_EQ(REF_IT->getRedirectEntryIndex(), test_entry.getRedirectEntryIndex()); \ | ||
} \ | ||
auto ref_item = REF_IT->getItem(true); \ | ||
auto test_item = test_entry.getItem(true); \ | ||
ASSERT_EQ(ref_item.getClusterIndex(), test_item.getClusterIndex()); \ | ||
ASSERT_EQ(ref_item.getBlobIndex(), test_item.getBlobIndex()); \ | ||
ASSERT_EQ(ref_item.getData(), test_item.getData()); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the item equivalence testing be in an else
branch of the if
statement? Shouldn't your tests fail otherwise? Why don't they fail then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's because the test archive doesn't contain redirect entries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
´getItem(true)´ resolve redirect(s) and return the final item.
We load the data even if the entry is a redirect to force loading of all cluster.
Else the test fails with withns
category at line 366 because only one cluster is present.
test/archive.cpp
Outdated
ASSERT_ARCHIVE_EQUIVALENT_IT_LIMIT(ref_it, range.end(), archive, 50) | ||
EXPECT_EQ(archive.get_dirent_cache_current_size(), 30); | ||
EXPECT_EQ(archive.get_cluster_cache_current_size(), 2); // Only 2 clusters in the file | ||
|
||
// Reduce cache size | ||
archive.set_dirent_cache_max_size(10); | ||
archive.set_cluster_cache_max_size(1); | ||
|
||
EXPECT_EQ(archive.get_dirent_cache_current_size(), 10); | ||
EXPECT_EQ(archive.get_cluster_cache_current_size(), 1); | ||
|
||
ASSERT_ARCHIVE_EQUIVALENT_IT_LIMIT(ref_it, range.end(), archive, 50) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fragment of the test scenario is somewhat obscure and may be misleading. The ASSERT_ARCHIVE_EQUIVALENT_IT_LIMIT()
macro has a side effect - it advances its iterator argument. Therefore the second invocation of that macro on the same iterator object continues the iteration (or performs no iteration at all if the end has already been reached before that) and it's not obvious if that is intentional or not. Please add comments explaining what is being tested or change the test accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is intentional, we want to change the cache size as we are iterating on it.
I will add a comment.
// Clean cache | ||
// (More than testing the value, this is needed as we want to be sure the cache is actually populated later) | ||
archive.set_dirent_cache_max_size(0); | ||
archive.set_cluster_cache_max_size(0); | ||
|
||
EXPECT_EQ(archive.get_dirent_cache_current_size(), 0); | ||
EXPECT_EQ(archive.get_cluster_cache_current_size(), 0); | ||
|
||
// Increase the cache |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With an ASSERT_ARCHIVE_EQUIVALENT()
easy/concise check at your service, doesn't it make sense to check that things still work after decreasing the cache size to 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of this test is to test than changing the cache size has an effect. Not that it doesn't change the returned value. ASSERT_ARCHIVE_EQUIVALENT
is "just" a nice to have to be sure, but not the main purpose of the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of this test is to test than changing the cache size has an effect.
That's prone to misinterpretation and I also wanted to ask if the commit's description was wrong. Yes, we want to be sure that setting the cache size actually changes the cache size, but I think that we also want to test that changing the cache size on a live archive doesn't break its operation (i.e. in that sense, it has no observable effect other than on performance).
55616af
to
f3ea99f
Compare
I have added a small comment about no reset the |
- DirentLookup becoming polymorphic - Possibility to disable lookup caching altogether - Once created, this cache size cannot be modified
Probably nobody never set this, and we don't use lzma anymore.
We don't use it anymore.
This make fail the test as it appears we create the dirent lookup cache at initialization. This will be fixed in next commits.
There is no reason this setting is done while reading the mimetype.
Let's use a temp dirent lookup (without cache) to access titleOrdered v1 and 'C' namespace.
Reading and writing in the same time to a unique_ptr is not thread safe. We introduce a atomicBool to know if the unique_ptr has been initialized. Fix #945
f3ea99f
to
cc02757
Compare
This fix #946.
Contrary to what proposed in #946, this doesn't introduce a cache config structure but use simple method:
This PR probably have some small conflict with #949, I will resolve them as soon as a PR is merged.