-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add integrity information to must-gather #1848
base: master
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: tnozicka The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
e681594
to
88c93a5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only had a brief look - wouldn't a merkle tree be a standard approach for this use case?
It's valid but more for cryptography. The approach here allows you to easily identify the places with changes with respect to the particular resource and couples well with linux CLI tools. Say, to verify a folder manually, you can run |
would this approach also allow you to check if a given file was originally a part of the archive?
is that actually a valuable information if you don't know which ones are missing? Since you already put the work into this I don't feel strong about using the merkle tree (altough there probable are many existing implementations), I'm struggling to understand what is the purpose of this exactly if we don't have the capability of signing it (correct me if I'm wrong here)? The PR description says |
Transiently. But anything can be messed up with, it's just the amount of work that needs to be put in. Keep in mind this is meant primarily for detecting removed files, not to remove some and add fake files. At a point where the folder doesn't match we can just ask for a new collection. We could extent this to go at the file level.
The most pressing concern is temper detection, after that point I'd usually just not proceed further.
Using a merkle tree it is not easy or possible to use standard linux tooling for verification and you have to add extra commands. It's a valid option though.
nothing, with any approach we take. they are tempering with the archive with the intent to limit the data to what they think is not related (usually wrongly), not to be intentfully malicious. |
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Are you saying that we wouldn't investigate a customer issue if they removed any file, even if we weren't sure it was necessarily related to Scylla?
I mean, I get the point, but at this point you'd have to assume they deleted a related file and assume a somewhat hostile approach. I'm not sure how that'd turn out in practice. Before we go into reviewing the implementation, @zimnx what are your thoughts on this? |
We shouldn't reject the dump if user removed anything. I think having names of these removed resources would allow us to judge - with high enough confidence I think - whether they were related. Merkle trees would be a good fit, but they come with a cost of having troubles extracting anything from them. We would need to develop tools to check integrity or dump file names. |
@tnozicka: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
The Scylla Operator project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
/lifecycle stale |
/remove-lifecycle stale |
The Scylla Operator project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
/lifecycle stale |
The Scylla Operator project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
/lifecycle stale |
Description of your changes:
This PR add integrity information to must-gather dump. This will help to identify whether files have been removed or the collection failed.
Which issue is resolved by this Pull Request:
Resolves #1847