
Investigate secrets that might need to be included in backups #663


Open
jbiers opened this issue Feb 7, 2025 · 4 comments

Comments

@jbiers
Member

jbiers commented Feb 7, 2025

This issue serves as a continuation of #607. Now that BRO has a separate resourceSet for sensitive information, we feel safe including all the secrets necessary for a backup to restore a cluster flawlessly.

This issue should include efforts to communicate with other teams (Fleet, Harvester, etc.) to map all such secrets.

Reference issues:

@jbiers
Member Author

jbiers commented Mar 13, 2025

Currently there are three big areas that have Secrets which need to be included in backups:

  • Fleet
  • Elemental
  • Provisioning

For both Fleet and Elemental, we currently provide a way for the respective teams to decide which secrets will be included in a Backup, via the labels elemental.cattle.io/managed: true and fleet.cattle.io/managed: true. This approach makes sense because the teams themselves have the domain knowledge to determine which Secrets should be included in a Backup and which should not. The guidance we provide is usually as follows: if a Secret can be seamlessly recreated by Rancher during a restore/migration, it should not be included in Backups. Otherwise, it should be included.
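The opt-in selection described above can be sketched as a simple filter. This is a minimal illustration, not the operator's actual code; the real rancher-backups resourceSet uses Kubernetes label selectors:

```python
# Labels the Fleet and Elemental teams apply to opt a Secret into backups.
MANAGED_LABELS = {
    "elemental.cattle.io/managed": "true",
    "fleet.cattle.io/managed": "true",
}

def select_for_backup(secrets):
    """Return only the Secrets whose labels opt them into the backup."""
    selected = []
    for secret in secrets:
        labels = secret.get("metadata", {}).get("labels", {}) or {}
        if any(labels.get(key) == value for key, value in MANAGED_LABELS.items()):
            selected.append(secret)
    return selected
```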

This way, any issues, including bugs during restores/migrations, can be fixed by the teams responsible for those projects without the cognitive overhead of having to understand rancher-backups or interact with the rancher/backup-restore-operator repository.

The same doesn't happen for Provisioning, but it certainly should. Currently, the responsible team needs to add Secrets manually to this line to have them included in Backups. A change in that direction would look something like this, requiring collaboration from the Hostbusters team both to label these Secrets at creation time and to label previously existing Secrets that should have the label but don't.

Alternatively, since these secrets live in the fleet-default namespace, we could even use the existing fleet label. Since these Secrets are not handled by the Fleet team, though, I'd rather a different label be used here.

@jbiers
Member Author

jbiers commented Mar 13, 2025

Both of these issues refer to Secrets created along with GitRepo fleet.cattle.io/v1alpha1 resources. The Secret name can be found in the GitRepo spec, and the Secrets live in the same namespace. Labeling them could potentially be automated via the Webhook.
rancher/dashboard#11211
rancher/rancher#45812
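Since the GitRepo spec names the Secret and both live in the same namespace, the webhook automation floated above could be a small lookup like the following. This is a hypothetical sketch: the field names (clientSecretName, helmSecretName) are assumptions based on the Fleet GitRepo spec, and the real implementation would run inside the admission webhook:

```python
def secrets_referenced_by(gitrepo):
    """Return (namespace, name) pairs for Secrets a GitRepo references,
    so the backup label could be applied to them automatically.

    Assumes clientSecretName/helmSecretName spec fields; illustrative only.
    """
    namespace = gitrepo["metadata"]["namespace"]
    spec = gitrepo.get("spec", {})
    refs = []
    for field in ("clientSecretName", "helmSecretName"):
        name = spec.get(field)
        if name:
            refs.append((namespace, name))
    return refs
```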

These two are both related to provisioning secrets not being backed up when the user adds a downstream cluster from outside the UI. If I understand correctly, these secrets result from another provisioning resource, and their labeling can likely be automated.
#574
rancher/rancher#44033

@jbiers
Member Author

jbiers commented Mar 13, 2025

These are the goals I can define ATM:

  • Define all secrets that need to be included in Backups. Determine which ones are still left out. If they can't be created by users, only labeling them in their CR loop should be enough.
  • From all secrets that need to be backed up, determine which ones can also be created by the users, either directly or indirectly. Investigate their creation process to understand if the label can be added automatically, as previously explained with the Fleet secrets.

@mallardduck
Member

One way to expand on this topic, beyond just secrets (it would apply to all k8s resources in general), would be a second annotation that defines how the resource is handled.

So the current concept has "Layer 1" being: "Add an annotation that all controllers should use to indicate a resource should be included in a backup file".

The next layer of the concept, "Layer 2", could be: "Add an annotation that defines how a resource is restored based on context; i.e. does it get restored always, only in migration, only in-place, etc." Additionally, depending on how this second layer is set up, the ability for end-users to "override" the defaults could be exposed on the Restore object.

Essentially, a side-effect of this proposed idea would be that the purge field becomes less relevant. I believe we could remove all of its relevance (as a manually required option users set, not as a concept) by also adopting the #612 changes, since we could implicitly detect "in-place" vs "migration" by checking the kube-system UID on the current cluster and in the tar file (along with checking for a running Rancher?).
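The implicit detection suggested above boils down to comparing the kube-system namespace UID recorded in the backup archive with the one on the cluster being restored to. A minimal sketch, assuming those two UIDs have already been extracted (function and argument names are illustrative, not from the operator):

```python
def restore_mode(backup_kube_system_uid, cluster_kube_system_uid):
    """Classify a restore as in-place or migration.

    The kube-system namespace UID is stable for the lifetime of a
    cluster, so a matching UID implies we are restoring onto the same
    cluster the backup was taken from.
    """
    if backup_kube_system_uid == cluster_kube_system_uid:
        return "in-place"
    return "migration"
```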

@jbiers jbiers self-assigned this Apr 17, 2025