
Restore improvements #3986

Draft · wants to merge 15 commits into master
Conversation

@Michal-Leszczynski (Collaborator) commented on Aug 20, 2024

Based on #3956.

This PR introduces 3 big changes:

  1. Indexing
    The restore task no longer iterates over manifests, restoring all the files from a given manifest before proceeding to the next one.
    The problem with this approach was that it limited the pool of sstables available for creating batches, and it resulted in nodes spending more time idle between restored manifests.
    Now, SM indexes all files from all manifests at the beginning and then creates batches from all of them, without the need to finish one manifest before starting another.

  2. Batching
    The restore task no longer batches files based on the --batch-size flag.
    The problem with this approach was that it didn't take shard_cnt or sstable size into consideration. It was also difficult to guess the right value for this flag without knowing the backup statistics.
    Now, SM creates batches containing a multiple of the node's shard_cnt sstables. It keeps adding the biggest (not yet restored) sstables to the batch until the batch reaches 5% of the expected node workload (total_workload / total_shard_cnt_in_the_cluster * node_shard_cnt); a sketch of this logic follows the list below.

  3. Additional testing flags
    These flags exist just for testing purposes. Some of them will be removed and some might stay, depending on how useful we find them:

  • --table-parallel - how many tables should be restored from a single node at the same time (might be useful for many smaller tables)
  • --stream-to-all-replicas - runs load&stream without the primary_replica_only option and skips the post-restore repair (might be useful in a cluster with a small RF and a large number of replica sets)
  • --unpin-agent-cpu - unpins the agent from its CPUs for the duration of the restore (it looks like it increases download speed to some extent)
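
A minimal sketch of the batching logic from point 2 above, in Go. All names here (`sstable`, `buildBatch`, the parameters) are hypothetical illustrations of the described behavior, not the actual scylla-manager code:

```go
package batching

import "sort"

// sstable is a hypothetical representation of an indexed,
// not-yet-restored sstable.
type sstable struct {
	ID   string
	Size int64 // total size of the files belonging to this sstable
}

// buildBatch greedily picks the biggest remaining sstables for one node.
// The batch holds a multiple of the node's shard count of sstables and
// grows until it reaches ~5% of the expected node workload:
// totalWorkload / totalShardCnt * nodeShardCnt.
func buildBatch(remaining []sstable, nodeShardCnt int, totalWorkload, totalShardCnt int64) (batch, rest []sstable) {
	// Biggest first, so a batch tends to contain similarly sized sstables.
	sort.Slice(remaining, func(i, j int) bool {
		return remaining[i].Size > remaining[j].Size
	})

	target := totalWorkload / totalShardCnt * int64(nodeShardCnt) / 20 // 5%

	var size int64
	i := 0
	for i < len(remaining) && size < target {
		// Take sstables in multiples of the shard count so that
		// every shard on the node gets work.
		for s := 0; s < nodeShardCnt && i < len(remaining); s++ {
			size += remaining[i].Size
			i++
		}
	}
	return remaining[:i], remaining[i:]
}
```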

This decreases main function complexity.
Noticed by: cognitive complexity 66 of func `(*tablesWorker).restore` is high (> 50) (gocognit).
…abled and enabled

In preparation for restoring data, SM should disable tombstone_gc and compaction.
They should be re-enabled after the restore finishes.
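
A minimal sketch of what the disable/re-enable step could look like, assuming the gocql driver for the CQL part; the helper name is hypothetical, and the exact mechanism SM uses to toggle autocompaction (agent/REST call) is intentionally omitted:

```go
package restore

import (
	"fmt"

	"github.com/gocql/gocql"
)

// setTombstoneGC toggles tombstone_gc on a table via CQL: "disabled"
// before the restore, "timeout" (the Scylla default) after it finishes.
func setTombstoneGC(session *gocql.Session, keyspace, table, mode string) error {
	stmt := fmt.Sprintf(
		"ALTER TABLE %q.%q WITH tombstone_gc = {'mode': '%s'}",
		keyspace, table, mode)
	return session.Query(stmt).Exec()
}

// Autocompaction is toggled per node, e.g. via Scylla's REST API or
// nodetool enableautocompaction/disableautocompaction; not shown here.
```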
The idea is to first index all files to be restored, so that we can create better batches. The restore workload is aggregated first by table, then by remote sstable dir.
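
A minimal sketch of that aggregation as Go data structures; the type names are illustrative, not the actual scylla-manager index:

```go
package indexing

// SSTable is one indexed, not-yet-restored sstable.
type SSTable struct {
	ID   string
	Size int64
}

// RemoteSSTableDir groups the sstables found under a single remote
// directory (one backed-up node's data for the table).
type RemoteSSTableDir struct {
	Path     string
	SSTables []SSTable
	Size     int64 // sum of sstable sizes in this dir
}

// TableWorkload aggregates the restore workload of a single table
// across all manifests.
type TableWorkload struct {
	Keyspace, Table string
	Dirs            []RemoteSSTableDir
	Size            int64 // total workload of this table
}

// Workload is the full indexed restore workload, built once up front,
// so batches can be created from any table/manifest.
type Workload struct {
	Tables []TableWorkload
	Size   int64 // total restore workload
}
```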

From batching we expect:
- batch contains X*shard_cnt sstables
- batch contains similarly sized sstables
- batch is created from any manifest/table - no waiting for manifest/table restore to finish
- workload across different nodes is evenly distributed
- --batch-size is ignored; batches aim for a size equal to 5% of the expected node workload

New flag --table-parallel - it allows running multiple download and load&stream jobs from the same node in parallel (no documentation, as it might not be exposed later on).
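
A minimal sketch of this kind of bounded per-node parallelism, using golang.org/x/sync/errgroup; `restoreTable` is a hypothetical stand-in for the per-table download + load&stream job:

```go
package restore

import (
	"context"

	"golang.org/x/sync/errgroup"
)

// restoreTables runs up to tableParallel table restores from one node
// at the same time, stopping on the first error.
func restoreTables(ctx context.Context, tables []string, tableParallel int,
	restoreTable func(ctx context.Context, table string) error) error {
	g, ctx := errgroup.WithContext(ctx)
	g.SetLimit(tableParallel) // at most tableParallel jobs in flight
	for _, t := range tables {
		t := t // capture loop variable (pre-Go 1.22)
		g.Go(func() error { return restoreTable(ctx, t) })
	}
	return g.Wait()
}
```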