improve files cache #8385
@intelfx noted on IRC that this might cause quite a lot of traffic if the archive contains e.g. 5M files. True, so I guess this will need some optimization afterwards, e.g. persisting the files cache or the archive metadata stream locally.
Guess we could do it like that:
Update: Done! Usually no transfer from the repo is needed now, but it can still be done if the local files cache is lost (still better than re-reading/chunking/hashing everything). The local "files cache" filename suffix is determined automatically from the archive (series) name, or set manually by the user via the env var. Both mtime AND ctime are now stored in the files cache.
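A minimal sketch of how the filename suffix selection described above could look. This is not borg's actual code: the env var name `EXAMPLE_FILES_CACHE_SUFFIX` is a placeholder (the thread does not name the variable), SHA-256 is an assumed stand-in for the unspecified hash `H`, and `files_cache_suffix` / `files_cache_path` are hypothetical helpers.

```python
import hashlib
import os
from pathlib import Path

# Placeholder name: the thread only says "the env var", without naming it.
SUFFIX_ENV_VAR = "EXAMPLE_FILES_CACHE_SUFFIX"


def files_cache_suffix(archive_name: str, environ=os.environ) -> str:
    """Return the user-chosen suffix if set, else a hash of the archive (series) name."""
    manual = environ.get(SUFFIX_ENV_VAR)
    if manual:
        return manual
    # Assumption: H is SHA-256; the thread does not say which hash is used.
    return hashlib.sha256(archive_name.encode("utf-8")).hexdigest()


def files_cache_path(cache_dir: Path, archive_name: str, environ=os.environ) -> Path:
    """Build the per-series cache filename in the form files.<suffix>."""
    return Path(cache_dir) / f"files.{files_cache_suffix(archive_name, environ)}"
```

Because each archive series maps to its own file, two backups of different series from the same machine would not touch the same files cache in this sketch.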
- changes to locally stored files cache:
  - store as files.<H(archive_name)>
  - user can manually control suffix via env var
  - if local files cache is not found, build from previous archive.
- enable rebuilding the files cache via loading the previous archive's metadata from the repo (better than starting with empty files cache and needing to read/chunk/hash all files). previous archive == same archive name, latest timestamp in repo.
- remove AdHocCache (not needed any more, slow)
- remove BORG_CACHE_IMPL, we only have one
- remove cache lock (this was blocking parallel backups to same repo from same machine/user).

Cache entries now have ctime AND mtime.

Note: TTL and age still needed for discarding removed files. But due to the separate files caches per series, the TTL was lowered to 2 (from 20).
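The TTL/age note and the "previous archive" rule can be illustrated with a small sketch. It shows one plausible reading, not borg's implementation: `Entry`, `age_and_prune`, and `previous_archive` are hypothetical names, archives are modeled as plain dicts with assumed `name` and `ts` keys, and the exact point at which an entry is discarded is an assumption.

```python
from dataclasses import dataclass

FILES_CACHE_TTL = 2  # value from the thread (lowered from 20)


@dataclass
class Entry:
    ctime_ns: int
    mtime_ns: int
    age: int = 0  # number of backup runs since the file was last seen


def age_and_prune(cache: dict, seen_paths: set, ttl: int = FILES_CACHE_TTL) -> dict:
    """Reset age for files seen in this run, age the rest, drop entries that reached the TTL."""
    kept = {}
    for path, entry in cache.items():
        entry.age = 0 if path in seen_paths else entry.age + 1
        if entry.age < ttl:
            kept[path] = entry
    return kept


def previous_archive(archives: list, name: str):
    """Pick the archive with the same name and the latest timestamp, or None if there is none."""
    same_name = [a for a in archives if a["name"] == name]
    return max(same_name, key=lambda a: a["ts"], default=None)
```

With per-series caches, a file missing from a run has most likely really been removed from that series, which is why a small TTL is enough in this model.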
We can build the files cache by reading the "previous" archive from the repo once we have the "backup series" feature (i.e. after #8379 is merged), see:
#7930 (comment)
Pros:
borg create by default (without the BORG_CACHE_IMPL=adhoc hack).
Cons: