nodetool: add method to invoke validation compaction #263

Open
denesb opened this issue Jul 8, 2021 · 2 comments

Contributor

denesb commented Jul 8, 2021

Scylla issue tracking the Scylla-side implementation: scylladb/scylladb#7736

avikivity added a commit to scylladb/scylladb that referenced this issue Jul 13, 2021
"
Currently, when sstables are suspected to be corrupt, one has a few bad
choices on how to verify whether they are in fact corrupt:
* Obtain suspect sstable files and manually inspect them. This is
  problematic because it requires a scylla engineer to have direct access
  to data, which is not always simple or even possible due to privacy
  protection rules.
* Run sstable scrub in abort mode. This is enough to confirm whether
  there is any corruption or not, but only in a binary manner. It is not
  possible to explore the full scope of the corruption, as the scrub
  will abort on the first corruption.
* Run sstable scrub in non-abort mode. Although this allows for
  exploring the full scope of the corruption and it even gets rid of it,
  it is a very intrusive and potentially destructive process that some
  users might not be willing to even risk.

This patchset offers an alternative: validation compaction. This is a
completely non-intrusive compaction that reads all sstables in turn and
validates their contents, logging any discrepancies it finds. It does
not mutate their content; it does not even rewrite them. It is akin to
a dry-run mode for sstable scrub. The reason it was not implemented as
such is that the current compaction infrastructure assumes that input
sstables are replaced by output sstables as part of the compaction
process. Lifting this assumption seemed error-prone and risky, so
instead I snatched the unused "Validation" compaction type for this
purpose. This compaction type completely bypasses the regular compaction
infrastructure, but only at the low level. It still integrates fully
into compaction-manager.

Fixes: #7736
Refs: scylladb/scylla-tools-java#263

Tests: unit(dev)
"

* 'validation-compaction/v5' of https://github.com/denesb/scylla:
  test/boost/sstable_datafile_test: add test for validation compaction
  test/boost/sstable_datafile_test: scrub tests: extract corrupt sst writer code into function
  api: storage_service: expose validation compaction
  sstables/compaction_manager: add perform_sstable_validation()
  sstables/compaction_manager: rewrite_sstables(): resolve maintenance group FIXME
  sstables/compaction_manager: add maintenance scheduling group
  sstables/compaction_manager: drop _scheduling_group field
  sstables/compaction_manager: run_custom_job(): replace parameter name with compaction type
  sstables/compaction_manager: run_custom_job(): keep job function alive
  sstables/compaction_descriptor: compaction_options: add validation compaction type
  sstables/compaction: compaction_options::type(): add static assert for size of index_to_type
  sstables/compaction: implement validation compaction type
  sstables/compaction: extract compaction info creation into static method
  sstables/compaction: extract sstable list formatting to a class
  sstables/compaction: scrub_compaction: extract reporting code into static methods
  position_in_partition{_view}: add has_key()
  mutation_fragment_stream_validator: add schema() accessor
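
The contrast the commit message draws between abort-mode scrub and validation compaction can be illustrated with a toy model. This is only a sketch under stated assumptions, not Scylla's implementation: an "sstable" is modeled as a plain list of integer keys that should be strictly increasing, and the names `scrub_abort`, `validate` and `CorruptionError` are hypothetical.

```python
import logging
from typing import Iterable, List

log = logging.getLogger("validation_sketch")


class CorruptionError(Exception):
    """Raised by the abort-style check on the first out-of-order key."""


def scrub_abort(keys: Iterable[int]) -> None:
    """Abort-style check (toy model): stop at the first discrepancy."""
    previous = None
    for index, key in enumerate(keys):
        if previous is not None and key <= previous:
            raise CorruptionError(f"out-of-order key {key} at index {index}")
        previous = key


def validate(keys: Iterable[int]) -> List[str]:
    """Validation-style check (toy model): read everything, log and collect
    every discrepancy, and never modify the input."""
    problems: List[str] = []
    previous = None
    for index, key in enumerate(keys):
        if previous is not None and key <= previous:
            message = f"out-of-order key {key} at index {index}"
            log.warning(message)
            problems.append(message)
        previous = key
    return problems


if __name__ == "__main__":
    data = [1, 5, 3, 4, 9, 2]  # hypothetical key stream with two ordering errors
    for line in validate(data):
        print(line)
    try:
        scrub_abort(data)
    except CorruptionError as err:
        print("abort mode stopped early:", err)
```

In this model the abort-style check only answers whether corruption exists, while the validation-style pass reports every discrepancy it sees and leaves the input untouched, which mirrors the "dry-run" framing in the commit message.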

tzach commented Jul 27, 2021

Ping.
Adding the REST API without the nodetool part makes it hard to find and use.


tzach commented Jul 27, 2021

@penberg please add this to milestone 4.6

bhalevy added a commit to bhalevy/scylla-tools-java that referenced this issue Aug 18, 2021
Add support for the scrubMode option.
Includes ABORT|SKIP|SEGREGATE|VALIDATE.

Fixes scylladb#263
Fixes scylladb#268

Signed-off-by: Benny Halevy <[email protected]>