borg2: build_chunkindex_from_repo is slow #8397
Analysis: There are only a few borg2 commands that remove objects from
Notably, these commands do NOT delete objects from
So the set of objects in data/ only ever grows until compact/check is run (we can ignore borg debug and borg repo-delete). borg create must not assume a chunk is in the repo when in fact it no longer is: that would create a corrupt archive that references a non-existent object. OTOH, storing a chunk that already exists in the repo (without us knowing) is only a performance issue, not a correctness problem.
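To illustrate the asymmetry described above, here is a minimal sketch. It is not borg code: ToyRepo, store_chunk and dangling_references are hypothetical names invented for this example, and the "index" is just a Python set of chunk IDs. It only shows why an index that misses existing chunks is harmless, while an index that lists vanished chunks is not.

```python
class ToyRepo:
    """Hypothetical stand-in for a repository: maps chunk ID -> data."""

    def __init__(self):
        self.objects = {}

    def put(self, chunk_id, data):
        self.objects[chunk_id] = data

    def delete(self, chunk_id):
        self.objects.pop(chunk_id, None)


def store_chunk(repo, known_ids, chunk_id, data):
    """Upload the chunk unless the (possibly stale) index says it exists.

    Returns True if the chunk was uploaded, False if the upload was skipped.
    """
    if chunk_id in known_ids:
        return False  # trust the index: assume the chunk is in the repo
    repo.put(chunk_id, data)
    known_ids.add(chunk_id)
    return True


def dangling_references(repo, archive_chunk_ids):
    """Chunk IDs an archive references that are not in the repo."""
    return [cid for cid in archive_chunk_ids if cid not in repo.objects]


# Case 1: the index is missing a chunk that exists in the repo.
repo = ToyRepo()
repo.put("aa", b"data")
index = set()  # stale: does not know about "aa"
uploaded_again = store_chunk(repo, index, "aa", b"data")
case1_dangling = dangling_references(repo, ["aa"])

# Case 2: the index lists a chunk that has since been deleted.
repo2 = ToyRepo()
repo2.put("bb", b"data")
index2 = {"bb"}
repo2.delete("bb")  # e.g. removed by a compact-like operation
skipped = not store_chunk(repo2, index2, "bb", b"data")
case2_dangling = dangling_references(repo2, ["bb"])
```

In case 1 the chunk is simply uploaded a second time (wasted work, no damage). In case 2 the upload is skipped and the archive ends up referencing an object that is gone.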
Implementation idea:
The up-to-date check and lockless operation (even if multiple borg processes of the same user on the same machine use the same repository) need more thought.
Another idea:
Problem:
That function does a repository.list(), listing all the object IDs in the repo to build an in-memory chunkindex. Because all objects are stored separately in a 2-level-deep directory structure, that is (1+)256+65536 listdir() calls in the worst case. Depending on store speed, connection latency, etc., that can take quite a while.
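For reference, a rough sketch of where that number comes from. This is not borg's or the store's actual code: the function names are made up, and the demo tree uses a fan-out of 4 instead of 256 so it can be built quickly in a temp dir. It assumes one listing call per directory and that objects live only in the deepest level.

```python
import os
import tempfile


def worst_case_listdir_calls(fanout=256, depth=2):
    """One listing per directory: the root plus every dir on each level."""
    return sum(fanout ** level for level in range(depth + 1))


def list_object_ids(root, depth=2):
    """Walk a fixed-depth tree level by level, counting directory listings."""
    calls = 0
    object_ids = []
    current = [root]
    for _ in range(depth + 1):
        deeper = []
        for directory in current:
            calls += 1  # one listdir-style call per directory
            with os.scandir(directory) as entries:
                for entry in entries:
                    if entry.is_dir():
                        deeper.append(entry.path)
                    else:
                        object_ids.append(entry.name)
        current = deeper
    return object_ids, calls


def build_demo_tree(root, fanout=4):
    """Create fanout x fanout leaf dirs with one empty object file each."""
    for a in range(fanout):
        for b in range(fanout):
            leaf = os.path.join(root, f"{a:02x}", f"{b:02x}")
            os.makedirs(leaf)
            open(os.path.join(leaf, f"{a:02x}{b:02x}-object"), "w").close()


with tempfile.TemporaryDirectory() as tmp:
    build_demo_tree(tmp, fanout=4)
    demo_ids, demo_calls = list_object_ids(tmp)

full_calls = worst_case_listdir_calls()  # 1 + 256 + 65536
```

With a remote store, each of those calls is at least one round trip, so the total time scales with latency times the number of directories, not with the number of objects.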
The in-memory chunkindex is currently not persisted to local cache.