use borgstore and other big changes #8332
Commits on Aug 23, 2024
Repository3 / RemoteRepository3: implement a borgstore based repository
Simplify the repository a lot: no repository transactions, no log-like appending, no append-only, no segments, just using a key/value store for the individual chunks. No locking yet.
Also:
- mypy: ignore missing import. There are no library stubs for borgstore yet, so mypy errors without that option.
- pyproject.toml: install borgstore directly from github. There is no pypi release yet.
- use pip install -e . rather than python setup.py develop. The latter is deprecated and had issues installing the "borgstore from github" dependency.
Commit d30d5f4
It uses xxh64 hashes of the meta and data parts to verify their validity. On a server with borg, this can be done server-side without the borg key. The new RepoObj header has meta_size, data_size, meta_hash and data_hash.
Commit d95cacd
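To illustrate the header idea described in this commit, here is a minimal sketch that stores meta_size, data_size, meta_hash and data_hash in front of the two parts and verifies them without any key. The field order, the field widths and the use of `hashlib.blake2b` with an 8-byte digest as a stand-in for xxh64 are assumptions made to keep the example dependency-free; this is not borg's actual on-disk format.

```python
import hashlib
import struct

# Assumed layout: two 32-bit sizes followed by two 8-byte hashes (little endian).
HEADER = struct.Struct("<II8s8s")


def h64(data: bytes) -> bytes:
    # Stand-in for xxh64: any fast 64-bit hash is enough for this illustration.
    return hashlib.blake2b(data, digest_size=8).digest()


def pack_obj(meta: bytes, data: bytes) -> bytes:
    """Prefix the meta and data parts with their sizes and hashes."""
    header = HEADER.pack(len(meta), len(data), h64(meta), h64(data))
    return header + meta + data


def verify_obj(blob: bytes) -> bool:
    """Check sizes and hashes; no encryption key is needed for this."""
    if len(blob) < HEADER.size:
        return False
    meta_size, data_size, meta_hash, data_hash = HEADER.unpack_from(blob)
    if len(blob) != HEADER.size + meta_size + data_size:
        return False
    meta = blob[HEADER.size:HEADER.size + meta_size]
    data = blob[HEADER.size + meta_size:]
    return h64(meta) == meta_hash and h64(data) == data_hash
```

Because the check only needs the stored bytes, a server could run something like `verify_obj` on every object without being able to decrypt anything.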
Commits on Sep 7, 2024
transfer: fix upgrades from borg 1.x by adding a --from-borg1 option
borg transfer is primarily a general purpose archive transfer function from borg2 to related borg2 repos. but for upgrades from borg 1.x, we also need to support:
- rcreate with a borg 1.x "other repo"
- transfer with a borg 1.x "other repo"
Commit c740fd7
locking3: store-based repo locking
Features:
- exclusive and non-exclusive locks
- acquire timeout
- lock auto-expiry (after 30mins of inactivity), lock refresh
- use tz-aware datetimes (in utc timezone) in locks
Also:
- document lock acquisition rules in the src
- increased default BORG_LOCK_WAIT to 10s
- better document with-lock test
Stale locks are ignored and automatically deleted. Default: stale == 30 minutes old.
lock.refresh() can be called frequently to keep an acquired lock from becoming stale. It does not do much if the last real refresh was recent. After stale/2 time it checks and refreshes the locks in the store.
Update the repository3 code to call refresh frequently:
- get/put/list/scan
- inside check loop
Commit 72d0cae
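The locking rules listed in this commit (shared vs. exclusive, stale locks ignored and deleted, refresh only doing real work after stale/2) can be sketched as follows. This is a simplified in-memory model, not the locking3 code: the `LockSketch` class, the dict-based store and the per-owner lock record are hypothetical, and the acquire timeout is left out.

```python
from datetime import datetime, timedelta, timezone

STALE = timedelta(minutes=30)  # default staleness from the commit message


class LockSketch:
    """Hypothetical in-memory model; a real lock would live in the store."""

    def __init__(self, store: dict, owner: str):
        self.store = store  # maps owner -> {"exclusive": bool, "time": datetime}
        self.owner = owner
        self.last_refresh = None

    def _drop_stale(self, now):
        for owner, lock in list(self.store.items()):
            if now - lock["time"] > STALE:
                del self.store[owner]  # stale locks are ignored and deleted

    def acquire(self, exclusive: bool, now=None) -> bool:
        now = now or datetime.now(timezone.utc)
        self._drop_stale(now)
        others = [lock for owner, lock in self.store.items() if owner != self.owner]
        # An exclusive lock tolerates no other lock; a shared lock only shared ones.
        if others and (exclusive or any(lock["exclusive"] for lock in others)):
            return False
        self.store[self.owner] = {"exclusive": exclusive, "time": now}
        self.last_refresh = now
        return True

    def refresh(self, now=None) -> bool:
        """Cheap to call often; only touches the store after STALE / 2."""
        now = now or datetime.now(timezone.utc)
        if self.last_refresh is not None and now - self.last_refresh < STALE / 2:
            return False  # last real refresh was recent, nothing to do
        if self.owner not in self.store:
            raise RuntimeError("lock not held (never acquired or expired)")
        self.store[self.owner]["time"] = now
        self.last_refresh = now
        return True
```

Returning a boolean from `refresh` is only there to make the throttling visible; callers such as get/put/list loops would simply call it and ignore the result.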
manifest: store archives separately one-by-one into archives/*
repository:
- api/rpc support for get/put manifest
- api/rpc support to access the store
Commit 8b9c052
Commit b637542
Commit c292ee2
compact: remove "borg compact", not needed any more
All chunks are separate objects in borgstore.
Commit 8c2cbdb
compact: reimplement "borg compact" as garbage collection
It also outputs some statistics and warns about missing/reappeared chunks.
Commit 8ef5171
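Compaction as garbage collection boils down to a mark-and-sweep pass: collect every chunk id that some archive still references, then delete whatever else is in the store. The sketch below assumes a dict as the store and a dict of archive name to chunk id list; both are hypothetical simplifications, and the "reappeared chunks" warning from the commit message is not modeled.

```python
def compact_sketch(store: dict, archives: dict):
    """store maps chunk id -> bytes; archives maps name -> list of chunk ids."""
    used = set()
    for chunk_ids in archives.values():
        used.update(chunk_ids)  # mark: everything any archive references

    present = set(store)
    missing = used - present  # referenced but not in the store: worth a warning
    unused = present - used   # in the store but unreferenced: garbage

    freed = 0
    for chunk_id in unused:
        freed += len(store.pop(chunk_id))  # sweep

    stats = {"kept": len(present - unused), "deleted": len(unused), "freed_bytes": freed}
    return stats, missing
```

For example, with chunks `a`, `b`, `c` in the store and archives referencing only `a`, `c` and a lost chunk `x`, the call deletes `b` and reports `x` as missing.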
check: remove orphan chunks detection/cleanup
This is now done in borg compact, so borg check does not need to care.
Commit 17ea118
delete: just remove archive from manifest, let borg compact clean up later
much faster and easier now, similar to what borg delete --force --force used to do. considering that speed, no need for checkpointing anymore. --stats does not work that way, thus it was removed. borg compact now shows some stats.
Commit 4c052cd
Note: this is the default cache implementation in borg 1.x, it worked well, but there were some issues:
- if the local chunks cache got out of sync with the repository, it needed an expensive rebuild from the infos in all archives.
- to optimize that, a local chunks.archive.d cache was used to speed that up, but at the price of quite significant space needs.
AdhocCacheWithFiles replaced this with a non-persistent chunks cache, requesting all chunkids from the repository to initialize a simplified non-persistent chunks index, that does not do real refcounting and also initially does not have size information for pre-existing chunks.
We want to move away from precise refcounting, LocalCache needs to die.
Commit d6a70f4
Commit 7a93890
get rid of the CacheSynchronizer
Lots of low-level code written back then to optimize runtime of some functions. We'll solve this differently by doing less stats, esp. if it is expensive to compute.
Commit 0306ba9
cache: replace .stats() by a dummy
Dummy returns all-zero stats from that call. Problem was that these values can't be computed from the chunks cache anymore. No correct refcounts, often no size information. Also removed hashindex.ChunkIndex.summarize (previously used by the above mentioned .stats() call) and .stats_against (unused) for same reason.
Commit fc6d459
Commit dcde484
Commit d59306f
Commit 1231c96
Commit 3e7a4cd
Commit cb9ff3b
repository3.check: implement --repair
Tests were a bit tricky as there is validation on 2 layers now:
- repository3 does an xxh64 check, finds most corruptions already
- on the archives level, borg also does an even stronger cryptographic check
Commit bfbf3ba
debug dump-repo-objs: remove --ghost
This was used for an implementation detail of the borg 1.x repository code, dumping uncommitted objects. Not needed any more. Also remove local repository method scan_low_level, it was only used by --ghost.
Commit 1189fc3
repository/repository3: remove .scan method
This was an implementation specific "in on-disk order" list method that made sense with borg 1.x log-like segment files only. But we now store objects separately, so there is no "in on-disk order" anymore.
Commit 60edc82
remove the repository.flags call / feature
this heavily depended on having a repository index where the flags get stored. we don't have that with borgstore.
Commit 6605f58
cache: add log msg to _load_chunks_from_repo
For big repos, this might take a while, so at least have messages on debug level.
Commit 68e64ad
Commit 5c325e3
docs: update the repository filesystem docs
In the end, it will all depend on the borgstore backend that will be used, so we better point to the borgstore project for details.
Commit c2890ef
borg1 needed this due to its transactional / rollback behaviour: if there was uncommitted stuff in the repo, next repo opening automatically rolled back to last commit. thus we needed checkpoint archives to reference chunks and commit the repo. borg2 does not do that anymore, unused chunks are only removed when the user invokes borg compact. thus, if a borg create gets interrupted, the user can just run borg create again and it will find some chunks are already in the repo, making progress even if borg create gets frequently interrupted.
Commit 5e3f2c0
didn't do anything anyway in this implementation.
Commit e23231b
Commit d9f24de
Commit 2be98c7
debug: remove refcount-obj command
borg doesn't do precise refcounting anymore, so this is pretty useless.
Commit 20c180c
Commit c5023da
parseformat: remove dsize and unique_chunks placeholder
We don't have precise refcounts, thus we can't compute these.
Commit 0b85b1a
info: do not output deduplicated_size
No precise refcounting, can't compute that inexpensively.
Commit 8455c95
Commit 15e759c
Commit 84bd2b2
refactor: rename repository/locking classes/modules
Repository -> LegacyRepository
RemoteRepository -> LegacyRemoteRepository
borg.repository -> borg.legacyrepository
borg.remote -> borg.legacyremote
Repository3 -> Repository
RemoteRepository3 -> RemoteRepository
borg.repository3 -> borg.repository
borg.remote3 -> borg.remote
borg.locking -> borg.fslocking
borg.locking3 -> borg.storelocking
Commit 05739aa
Commit 7714b65
Commit ec8a127
Commit 22b68b0
Commit 3408e94
Commit 1a382a8
Commit a15cd1e
Repository.list: return [(id, stored_size), ...]
Note: LegacyRepository still returns [id, ...] and so does RemoteRepository.list, if the remote repo is a LegacyRepository. also: use LIST_SCAN_LIMIT
Commit c67cf07
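Since the two repository kinds return differently shaped list results, calling code that may talk to either one has to cope with both. A minimal sketch of such an adapter is shown below; `normalize_list_result` is a hypothetical helper, not part of borg, and accepting lists as well as tuples is an assumption about what an RPC layer might deliver.

```python
from typing import Iterable, Iterator, Optional, Tuple, Union

ListEntry = Union[bytes, Tuple[bytes, int]]


def normalize_list_result(entries: Iterable[ListEntry]) -> Iterator[Tuple[bytes, Optional[int]]]:
    """Yield (id, stored_size) pairs; stored_size is None for legacy-style results."""
    for entry in entries:
        if isinstance(entry, (tuple, list)):
            chunk_id, stored_size = entry  # new style: (id, stored_size)
        else:
            chunk_id, stored_size = entry, None  # legacy style: bare id
        yield chunk_id, stored_size
```

With this, `[(b"a", 10)]` and `[b"a"]` both iterate as pairs, the latter with an unknown size.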
- compression factor
- dedup factor
- repo size
All values are approx. values without considering overheads.
Commit ec1d89f
Commit a40978a
Commit 5726890
Commit d27b7a7
cache/hashindex: remove decref method, don't try to remove chunks on exceptions
When the AdhocCache(WithFiles) queries chunk IDs from the repo to build the chunks index, it won't know their refcount and thus all chunks in the index have their refcount at the MAX_VALUE (representing "infinite") and that would never decrease nor could that ever reach zero and get the chunk deleted from the repo.
Only completely new chunks first written in the current borg run have a valid refcount.
In some exception handlers, borg tried to clean up chunks that won't be used by an item by decref'ing them. That is either:
- pointless due to refcount being at MAX_VALUE
- inefficient, because the user might retry the backup and would need to transmit these chunks to the repo again.
We'll just rely on borg compact ONLY to clean up any unused/orphan chunks.
Commit ef47666
ArchiveChecker.verify_data: simplify / optimize
.init_chunks has just built self.chunks using repository.list(), so don't call that again, but just iterate over self.chunks. also some other changes, making the code much simpler.
Commit bafbf62
ArchiveChecker: remove unused possibly_superseded code
We don't care about unused or superseded repo objects any more here, borg compact will deal with them.
Commit 266e6ca
Commit e9c42a7
ArchiveChecker: don't do precise refcounting here
That's the job of borg compact and not needed inside borg check. check only needs to know if a chunk is present in the repo.
Commit f9d2e68
cache: renamed .chunk_incref -> .reuse_chunk, boolean .seen_chunk
reuse_chunk is the complement of add_chunk for already existing chunks. It doesn't do refcounting anymore. .seen_chunk does not return the refcount anymore, but just whether the chunk exists. If we add a new chunk, it immediately sets its refcount to MAX_VALUE, so there is no difference anymore between previously existing chunks and new chunks added. This makes the stats even more useless, but we have less complexity.
Commit ccc84c7
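The semantics described in this commit can be modeled with a tiny cache class: `seen_chunk` is a plain membership test, `add_chunk` stores new chunks with the refcount pinned to MAX_VALUE, and `reuse_chunk` handles chunks that already exist without counting anything. This is only a sketch: the class name, the dict-based index, the concrete MAX_VALUE and the idea that `reuse_chunk` records the size are assumptions, not the actual cache code.

```python
MAX_VALUE = 2**32 - 1  # assumed sentinel for an "infinite" refcount; the real value may differ


class ChunksCacheSketch:
    def __init__(self, repo_chunk_ids):
        # Pre-existing chunks: refcount pinned, size initially unknown.
        self.chunks = {cid: (MAX_VALUE, None) for cid in repo_chunk_ids}

    def seen_chunk(self, chunk_id) -> bool:
        # Only answers "does it exist?", no refcount is returned.
        return chunk_id in self.chunks

    def add_chunk(self, chunk_id, data: bytes, repo: dict):
        if self.seen_chunk(chunk_id):
            return self.reuse_chunk(chunk_id, len(data))
        repo[chunk_id] = data
        # New chunks get MAX_VALUE right away, same as pre-existing ones.
        self.chunks[chunk_id] = (MAX_VALUE, len(data))
        return chunk_id, len(data)

    def reuse_chunk(self, chunk_id, size: int):
        refcount, _ = self.chunks[chunk_id]
        self.chunks[chunk_id] = (refcount, size)  # no incref, just remember the size
        return chunk_id, size
```

After these calls every entry looks the same regardless of whether the chunk was new or already present, which is the loss of information the commit message mentions.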
Commit ddf6812
ChunkIndex: remove unused .merge method
LocalCache used this to assemble a new overall chunks index from multiple chunks.archive.d's single-archive chunks indexes.
Commit 15c7039
Commit 07ab6e0
Commit e2aa9d5
Commit 551834a
Commit 86dc673
Commit dc9fff9
with-lock: refresh repo lock while subprocess is running, fixes borgbackup#8347
otherwise the lock might become stale and could get killed by any other borg process.
note: ThreadRunner class written by PyCharm AI and only needed small enhancements. nice.
Commit 60a592d
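The general pattern behind this fix, calling a refresh function periodically in a background thread while a child process runs, can be sketched as below. This is not the ThreadRunner class mentioned in the commit message (its code is not shown here); `run_with_refresh` and its parameters are hypothetical names used for illustration.

```python
import subprocess
import sys
import threading


def run_with_refresh(cmd, refresh, interval=1.0):
    """Run cmd while calling refresh() every `interval` seconds in a background thread."""
    stop = threading.Event()

    def keep_alive():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval):
            refresh()

    thread = threading.Thread(target=keep_alive, daemon=True)
    thread.start()
    try:
        return subprocess.call(cmd)
    finally:
        stop.set()
        thread.join()


if __name__ == "__main__":
    # Usage: keep a (hypothetical) lock fresh while a child process sleeps.
    child = [sys.executable, "-c", "import time; time.sleep(0.3)"]
    exit_code = run_with_refresh(child, refresh=lambda: print("refresh"), interval=0.1)
    print("child exited with", exit_code)
```

Using an `Event` rather than `time.sleep` lets the refresher stop promptly once the subprocess exits.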
Commit 7bf0f47
Commit 1cd2f4d
Commit b14c050
Commits on Sep 8, 2024
Commit ace97fa
Commit 60e88ef
Commit b82ced2
manifest.archives: refactor api
Archives was built with a dictionary-like api, but in future we want to move away from a read-modify-write archives list.
Commit b56c81b
manifest: no read-modify-write for borgstore archives list
previously, borg always read all archives entries, modified the list in memory, and wrote it back to the repository (similar to what borg 1.x did). now borg works directly with archives/* in the borgstore.
Commit ef7dd76
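The difference to a read-modify-write list is that each operation touches exactly one key under archives/. A minimal sketch, assuming a dict as the key/value store and one entry per archive id (the class name and the key naming are hypothetical, not the borgstore API):

```python
class ArchivesSketch:
    """Each archive is its own entry under archives/ in a key/value store."""

    PREFIX = "archives/"

    def __init__(self, store: dict):
        self.store = store

    def create(self, archive_id: str, info: bytes):
        self.store[self.PREFIX + archive_id] = info  # touches one key only

    def delete(self, archive_id: str):
        del self.store[self.PREFIX + archive_id]  # no list rewrite needed

    def ids(self):
        # Listing is derived from the keys, there is no separate list object.
        return sorted(k[len(self.PREFIX):] for k in self.store if k.startswith(self.PREFIX))
```

Two clients adding different archives therefore never overwrite each other's view of the list, because there is no shared list object to write back.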
check: only write to repo if --repair is given
old borg just didn't commit the transaction and thus caused a transaction rollback if not in repair mode. we can't do that anymore, thus we must avoid modifying the repo if not in repair mode.
Commit 8412168
shared locking for many borg commands
not for check and compact, these need an exclusive lock.
to try parallel repo access on same machine, same user, one needs to use a non-locking cache implementation:
export BORG_CACHE_IMPL=adhoc
this is slow due to the missing files cache in that implementation, but unproblematic because no caches/indexes are persisted.
Commit 0e183b2
Commit a509a0c
check: do not create addtl. archives dir entries if we already have one
if the manifest file is missing, check generated *.1 *.2 ... archives although an entry for the correct name and id was already present. BUG! this is because if the manifest is lost, that does not imply anymore that the complete archives directory is also lost, as it did in borg 1.x. Also improved log messages a bit.
Commit bc1f90b
check --repair --undelete-archives: bring archives back from the dead
borg delete and borg prune do a quick and dirty archive deletion, just removing the archives directory entry for them. --undelete-archives can still find the archive metadata objects by completely scanning the repository and re-create missing archives directory entries. but only until borg compact would remove all unused data. if only the manifest is missing or corrupted, do not run that scan, it is not required for the manifest anymore.
Commit 682aedb
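The undelete idea can be sketched as a full scan that recreates missing directory entries for any object that looks like archive metadata. Everything concrete here is an assumption for illustration: the dict store, the data/ and archives/ key prefixes, the empty placeholder entry and the `is_archive_metadata` callback (real code would have to decrypt and parse objects to decide this).

```python
def undelete_archives_sketch(store: dict, is_archive_metadata) -> list:
    """Recreate missing archives/<id> entries by scanning all data objects."""
    restored = []
    for key, obj in list(store.items()):
        if not key.startswith("data/"):
            continue
        obj_id = key[len("data/"):]
        if is_archive_metadata(obj) and f"archives/{obj_id}" not in store:
            store[f"archives/{obj_id}"] = b""  # placeholder directory entry
            restored.append(obj_id)
    return sorted(restored)
```

This only works as long as the metadata objects are still in the store, which matches the caveat above that borg compact eventually removes all unused data.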
Commit 7442cbf
Commit b50ed04
Commit 3794e32