Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get rid of refcounts? #7379

Closed
ThomasWaldmann opened this issue Feb 24, 2023 · 2 comments · Fixed by #8332
Closed

get rid of refcounts? #7379

ThomasWaldmann opened this issue Feb 24, 2023 · 2 comments · Fixed by #8332
Assignees
Milestone

Comments

@ThomasWaldmann
Copy link
Member

ThomasWaldmann commented Feb 24, 2023

borg currently is usually precisely refcounting objects. this ticket is split off from #7377 to focus on refcounting here only:

  • precise refcounting 0..N (like currently done in borg)
  • is object_still_needed = bool(refcount > 0) all we really need?
@ThomasWaldmann
Copy link
Member Author

ThomasWaldmann commented Feb 24, 2023

about refcounts

They are a pain to maintain:

  • if something crashes
  • if we have concurrent operations in the repo, esp. if multiple borg processes would concurrently (or even just in an alternating order) update the repo. client1 does not know what client2 is doing and vice versa. so clientside refcounts are easily outdated for multi-client scenarios and need expensive fixing (iterating over all archives, counting all references).

What are refcounts needed for?

  • detecting refcount == 0 is needed to decide about whether an object should be deleted (equivalent: id in unreferenced_set)
  • when borg create adds a new chunk (this is when refcount changes 0 -> 1, but we can also know it because the chunk does not exist and we add it), it is considered unique for this archive and contributes to the deduplicated size of it (this is interesting when borg create just finished and before creating the next archive, but as soon as the next archive references the same chunk again, it is not in the deduplicated size any more, because now 2 archives use this chunk - so this is a bit of a problematic statistics value)

Code using refcounts

  • ChunkIndexEntry has (refcount, size)
  • cleanup when we have written file content chunks, but did not reach the code writing the file item that references these chunks: the exception handler decrefs all the chunks we have already written/incref'd.
  • tests and debug commands
  • sum_unique_chunks_metadata
  • mark_as_possibly_superseded checks for refcount 0
  • orphan_chunks_check checks for refcount 0

LocalCache

  • cache.seen_chunk returns refcount
  • cache.add_chunk decides with it whether to process/write a chunk or just increase refcount
  • cache.chunk_decref (when refcount reaches 0) deletes the chunk from the repo and also the entry in the chunks index and adjusts size stats
  • cache.chunk_incref used to increment the refcount of an already existing chunk

AdhocCache

Guess this cache already goes half the way we need to go:

  • AdhocCache works differently with refcount
    • sets it to MAX_VALUE for existing chunks in the repo (== unknown refcount, assume infinite, never delete)
    • tracks precisely for new chunks added within the current session (for this case, we can know precisely)
  • It does not implement a files cache yet (which is crucial for borg's high speed with unchanged files):
    • "would require persistence" (sure, but rather use some local space than have slow backups with lots of I/O)
    • another reason is likely that the files cache we have in LocalCache only has IDs in the chunks lists, but not chunk sizes (which we also need to create an Item we write to an archive). Also, the AdhocCache chunks index does not know about chunk plaintext sizes either for already existing chunks (they have size==0).
    • so, we need to get the chunk sizes from somewhere: we either need to add them to the files cache or persist the chunks index.

How to improve?

All we need for basic borg create is the set of chunkids of the chunks existing in the repo.

  • For remote as well as for local repos, we do not want to ask the repo about the existence of each chunk, but rather have a quick (client side) hashtable lookup to decide that.
  • if it exists, we just reference it and are done.
  • if not, we need to compress/encrypt/auth/write it to the repo (and add it to the set of chunkids in the repo).
  • borg create could just internally sum up newly written chunks (no need to refcount for that).
  • accept some orphan chunks (if we don't precisely refcount):
    • if we run into an Backup(OS)Error exception before writing a file item to the archive, the already written content chunks will be orphans (if we can't clean them up in exception handler due to lack of precise refcounting). this kind of exception is not fatal, thus borg would still continue with remaining files (and commit the transaction at the end).
    • if we would also go away from transaction rollback, we might get many orphans if some backup is aborted.

borg delete/prune would just remove the archive entry from manifest (== the root object of this archive, which is for sure not needed for anything else / not referenced by anything else), creating many orphans and leave all else to borg gc.

borg gc / borg check would kill everything not referenced / all the orphans:

  • It does not even need real refcounts, but could start with a set of existing object IDs and after encountering a reference add it to a set of referenced object IDs. orphans = existing - referenced.
  • Or start with everything in orphans set and move IDs to referenced set when a reference is encountered. At the end, we can delete everything that remains in orphans set.

@ThomasWaldmann ThomasWaldmann self-assigned this Feb 24, 2023
@ThomasWaldmann ThomasWaldmann removed their assignment Jun 8, 2023
@ThomasWaldmann ThomasWaldmann added this to the ng1 milestone Sep 13, 2023
@ThomasWaldmann ThomasWaldmann self-assigned this Aug 13, 2024
@ThomasWaldmann
Copy link
Member Author

Updates:

  • implemented AdhocCacheWithFiles a while ago
  • currently working on an experimental branch with borg compact implementing a garbage collector

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant