refactoring of associated tools (#70)
* fix(tools): refactoring
bsctl authored Aug 13, 2024
1 parent 8a98610 commit 8ab1f1e
Showing 13 changed files with 592 additions and 526 deletions.
19 changes: 10 additions & 9 deletions README.md
@@ -2,17 +2,17 @@
A set of tools to deploy and operate a multi-tenant `etcd` datastore for [Kamaji](https://github.com/clastix/kamaji) control-plane.

## Background
Kamaji turns any Kubernetes cluster into an “admin cluster” to orchestrate other Kubernetes clusters called “tenant clusters”. The Control Plane of a tenant cluster is made of regular pods running in a namespace of the “admin cluster” instead of a dedicated set of Virtual Machines. This solution makes running control planes at scale cheaper and easier to deploy and operate.
Kamaji turns any Kubernetes cluster into a Management Cluster to orchestrate other Kubernetes clusters called Tenant Clusters. The Control Plane of a tenant cluster is made of regular pods running in a namespace of the Management Cluster instead of a dedicated set of Virtual Machines. This solution makes running control planes at scale cheaper and easier to deploy and operate.

As of any Kubernetes cluster, a “tenant cluster” needs a datastore where to save the state and be able to retrieve data. Kamaji provides multiple options: a multi-tenant `etcd` as well as _MySQL_, and _PostgreSQL_, thanks to the [kine](https://github.com/k3s-io/kine) integration.
As with any Kubernetes cluster, a Tenant Cluster needs a datastore in which to save its state and from which to retrieve data. Kamaji provides multiple options: a multi-tenant `etcd` as well as _MySQL_ and _PostgreSQL_, thanks to the [kine](https://github.com/k3s-io/kine) integration.

A multi-tenant deployment for `etcd` is not common practice. However, `etcd` provides simple and robust APIs for creating users and setting up role based access control (RBAC) policies to define which user have access to what key prefix.
A multi-tenant deployment for `etcd` is not common practice. However, `etcd` provides simple and robust APIs for creating users and setting up role-based access control (RBAC) policies to define which users have access to which key prefixes. In Kamaji, you can also use multiple `kamaji-etcd` instances for different tenants. The relationship between tenant clusters and datastores can be many-to-one or one-to-one, depending on preferences and use cases.

## Documentation
Refer to the [etcd documentation](https://etcd.io/docs/v3.5/op-guide). The following sections provide additional procedures for the specific setup used in the [Kamaji](https://github.com/clastix/kamaji) project.

- [Recovery from a snapshot](docs/snapshot-recovery.md)
- [Backup and Restore with Velero](docs/backup-and-restore.md)
- [Backup and restore from snapshot](docs/snapshot-recovery.md)
- [Disaster Recovery with Velero](docs/velero.md)
- [Rotate Certificates](docs/rotate-certificates.md)
- [Performance and Optimization](docs/performance-and-optimization.md)

@@ -31,16 +31,17 @@ Refer to the [etcd documentation](https://etcd.io/docs/v3.5/op-guide). Following
- [ ] Benchmarking

## Getting started
On the Kamaji's “admin cluster”, install the multi-tenant `etcd` with the provided Helm Chart:
To install the multi-tenant `kamaji-etcd` on the Kamaji Management Cluster using the provided Helm Chart, run the following commands:

```
```bash
helm repo add clastix https://clastix.github.io/charts
helm repo update
helm install kamaji-etcd clastix/kamaji-etcd -n kamaji-etcd --create-namespace
```

The certificates of `etcd`, are stored as secrets into the same namespace:
The `etcd` certificates are stored as secrets in the same namespace:

- `<release_name>-certs` contains CA, peers, and server certificates
- `<release_name>-root-client-certs` contains the user `root` certificates

Make sure the Kamaji controller can access these secrets in their namespaces.
Ensure the Kamaji controller has access to these secrets.
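
The naming pattern above can be sketched in a few lines of shell. This is an illustration only: the function names `etcd_secret_names` and `show_etcd_secrets` are hypothetical and not part of the chart, and the release name and namespace are taken from the `helm install` command shown earlier.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not shipped with the chart.

# Derive the two secret names from a Helm release name, following the
# <release_name>-certs and <release_name>-root-client-certs pattern.
etcd_secret_names() {
  local release_name="$1"
  printf '%s\n' "${release_name}-certs" "${release_name}-root-client-certs"
}

# Look up both secrets in the given namespace.
# Requires kubectl and access to the Management Cluster, so it is only
# defined here and not called.
show_etcd_secrets() {
  local release_name="$1" namespace="$2" secret
  for secret in $(etcd_secret_names "$release_name"); do
    kubectl get secret "$secret" --namespace "$namespace"
  done
}

# Print the secret names for the release installed above.
etcd_secret_names kamaji-etcd
```

With cluster access, `show_etcd_secrets kamaji-etcd kamaji-etcd` would confirm that both secrets exist in the release namespace.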
16 changes: 0 additions & 16 deletions charts/kamaji-etcd/README.md
@@ -63,22 +63,6 @@ Here the values you can override:
| alerts.rules | list | `[]` | The rules for alerts |
| autoCompactionMode | string | `"periodic"` | Interpret 'auto-compaction-retention' one of: periodic|revision. Use 'periodic' for duration based retention, 'revision' for revision number based retention. |
| autoCompactionRetention | string | `"5m"` | Auto compaction retention length. 0 means disable auto compaction. |
| backup | object | `{"all":false,"enabled":false,"s3":{"accessKey":{"value":"","valueFrom":{}},"bucket":"mybucket","image":{"pullPolicy":"IfNotPresent","repository":"minio/mc","tag":"RELEASE.2022-11-07T23-47-39Z"},"retention":"","secretKey":{"value":"","valueFrom":{}},"url":"http://mys3storage:9000"},"schedule":"20 3 * * *","snapshotDateFormat":"$(date +%Y%m%d)","snapshotNamePrefix":"mysnapshot"}` | Enable storage backup |
| backup.all | bool | `false` | Enable backup for all endpoints. When disabled, only the leader will be taken |
| backup.enabled | bool | `false` | Enable scheduling backup job |
| backup.s3 | object | `{"accessKey":{"value":"","valueFrom":{}},"bucket":"mybucket","image":{"pullPolicy":"IfNotPresent","repository":"minio/mc","tag":"RELEASE.2022-11-07T23-47-39Z"},"retention":"","secretKey":{"value":"","valueFrom":{}},"url":"http://mys3storage:9000"}` | The S3 storage config section |
| backup.s3.accessKey | object | `{"value":"","valueFrom":{}}` | The S3 storage ACCESS KEY credential. The plain value has precedence over the valueFrom that can be used to retrieve the value from a Secret. |
| backup.s3.bucket | string | `"mybucket"` | The S3 storage bucket |
| backup.s3.image | object | `{"pullPolicy":"IfNotPresent","repository":"minio/mc","tag":"RELEASE.2022-11-07T23-47-39Z"}` | The S3 client image config section |
| backup.s3.image.pullPolicy | string | `"IfNotPresent"` | Pull policy to use |
| backup.s3.image.repository | string | `"minio/mc"` | Install image from specific repo |
| backup.s3.image.tag | string | `"RELEASE.2022-11-07T23-47-39Z"` | Install image with specific tag |
| backup.s3.retention | string | `""` | The S3 storage object lifecycle management rules; N.B. enabling this option will delete previously set lifecycle rules |
| backup.s3.secretKey | object | `{"value":"","valueFrom":{}}` | The S3 storage SECRET KEY credential. The plain value has precedence over the valueFrom that can be used to retrieve the value from a Secret. |
| backup.s3.url | string | `"http://mys3storage:9000"` | The S3 storage url |
| backup.schedule | string | `"20 3 * * *"` | The job scheduled maintenance time for backup |
| backup.snapshotDateFormat | string | `"$(date +%Y%m%d)"` | The backup file date format (bash) |
| backup.snapshotNamePrefix | string | `"mysnapshot"` | The backup file name prefix |
| clientPort | int | `2379` | The client request port. |
| clusterDomain | string | `"cluster.local"` | Domain of the Kubernetes cluster. |
| datastore.enabled | bool | `false` | Create a datastore custom resource for Kamaji |
119 changes: 0 additions & 119 deletions charts/kamaji-etcd/templates/etcd_cronjob_backup.yaml

This file was deleted.

77 changes: 0 additions & 77 deletions charts/kamaji-etcd/templates/etcd_job_s3retention.yaml

This file was deleted.

43 changes: 0 additions & 43 deletions charts/kamaji-etcd/values.yaml
@@ -81,49 +81,6 @@ defragmentation:
  # -- The job scheduled maintenance time for defrag (empty to disable)
  schedule: "*/15 * * * *" # https://crontab.guru/

# -- Enable storage backup
backup:
  # -- Enable scheduling backup job
  enabled: false
  # -- Enable backup for all endpoints. When disabled, only the leader will be taken
  all: false
  # -- The job scheduled maintenance time for backup
  schedule: "20 3 * * *" # https://crontab.guru/
  # -- The backup file name prefix
  snapshotNamePrefix: mysnapshot
  # -- The backup file date format (bash)
  snapshotDateFormat: $(date +%Y%m%d)
  # -- The S3 storage config section
  s3:
    # -- The S3 storage url
    url: http://mys3storage:9000
    # -- The S3 storage bucket
    bucket: mybucket
    # -- The S3 storage object lifecycle management rules; N.B. enabling this option will delete previously set lifecycle rules
    retention: "" #"--expiry-days 7"
    # -- The S3 storage ACCESS KEY credential. The plain value has precedence over the valueFrom that can be used to retrieve the value from a Secret.
    accessKey:
      value: ""
      valueFrom: {}
      # secretKeyRef:
      #   key: access_key
      #   name: minio-key
    # -- The S3 storage SECRET KEY credential. The plain value has precedence over the valueFrom that can be used to retrieve the value from a Secret.
    secretKey:
      value: ""
      valueFrom: {}
      # secretKeyRef:
      #   key: secret_key
      #   name: minio-key
    # -- The S3 client image config section
    image:
      # -- Install image from specific repo
      repository: minio/mc
      # -- Install image with specific tag
      tag: "RELEASE.2022-11-07T23-47-39Z"
      # -- Pull policy to use
      pullPolicy: IfNotPresent

# -- Labels to add to all etcd pods
podLabels:
  application: kamaji-etcd
47 changes: 25 additions & 22 deletions docs/rotate-certificates.md
@@ -1,13 +1,15 @@
# Certificates Renewal Scripts

This script is a simple way to renew the certificates of a `kamaji-etcd` datastore.
This guide explains how to use the `certs-renew.sh` script to renew the certificates of a `kamaji-etcd` datastore.

It performs the following steps:

1. Check the expiration date and fingerprint of the old certificates
2. Generates a kubernetes job to create certificates through `cfssl`
3. Patches existing secrets with new certificates
4. Reset `etcd` pods and recreates `datastore-certs` secret
1. Checks the expiration date of the old certificates
2. Creates a temporary role and rolebinding to permit the script to access the certificates
3. Creates a Kubernetes job to generate certificates through `cfssl`
4. Patches the existing secrets with the new certificates
5. Resets the `etcd` pods and recreates the `datastore-certs` secret
6. Removes the temporary role and rolebinding

> *WARNING*: during the operation, the tenant control plane won't be reachable for a solid minute
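
The first step in the list, checking when a certificate expires, can be approximated with plain `openssl`. This is a minimal sketch and not the contents of `certs-renew.sh`; the function names and the file name `server.pem` in the usage note are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; not taken from certs-renew.sh.

# Print the notAfter (expiration) date of a PEM-encoded certificate file.
cert_end_date() {
  openssl x509 -noout -enddate -in "$1" | cut -d= -f2
}

# Exit 0 if the certificate expires within the given number of days,
# exit 1 if it stays valid longer, exit 2 if the file cannot be read.
cert_expires_within_days() {
  local cert_file="$1" days="$2"
  [ -r "$cert_file" ] || return 2
  # `-checkend N` exits 0 when the certificate is still valid N seconds
  # from now, so the result is negated here.
  ! openssl x509 -noout -checkend "$(( days * 86400 ))" -in "$cert_file" >/dev/null
}
```

For a certificate saved locally as `server.pem`, `cert_expires_within_days server.pem 30 && echo "renew soon"` would flag it when fewer than 30 days of validity remain.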
@@ -19,33 +21,34 @@ It performs the following steps:
- `openssl`
- `kubectl`

## Procedure
## Usage

Once you set proper env variables according to your specific setup
To run the script, use the following command:

```bash
# kamaji-etcd namespace
export ETCD_NAMESPACE=solar-energy-lab
# kamaji-etcd sts name
export ETCD_NAME=solar-energy-etcd
``` bash
./scripts/certs-renew.sh [-e etcd_name] [-s etcd_service] [-n etcd_namespace]
```

run:
## Parameters

- `-e etcd_name`: The name of the etcd instance (default: `kamaji-etcd`).
- `-s etcd_service`: The name of the etcd service (default: `kamaji-etcd`).
- `-n etcd_namespace`: The namespace where etcd is deployed (default: `kamaji-system`).

For example:

```bash
./scripts/certs-renew.sh
``` bash
./scripts/certs-renew.sh -e my-etcd -s my-etcd-service -n my-namespace
```
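
Flags with defaults like the ones listed under Parameters could be parsed with `getopts` along these lines. This is an illustrative sketch, not the actual source of `certs-renew.sh`: the function name `parse_args` is hypothetical, and the variable names are assumptions modeled on the `ETCD_NAME` and `ETCD_NAMESPACE` environment variables used by the previous revision (`ETCD_SERVICE` is invented by analogy).

```bash
#!/usr/bin/env bash
# Hypothetical sketch of flag parsing; the real script may differ.

parse_args() {
  # Defaults as documented in the Parameters section.
  ETCD_NAME="kamaji-etcd"
  ETCD_SERVICE="kamaji-etcd"
  ETCD_NAMESPACE="kamaji-system"

  # Reset OPTIND locally so the function can be called more than once.
  local opt OPTIND=1
  # The leading ':' enables silent error handling: unknown flags and
  # missing arguments fall through to the catch-all branch.
  while getopts ":e:s:n:" opt; do
    case "$opt" in
      e) ETCD_NAME="$OPTARG" ;;
      s) ETCD_SERVICE="$OPTARG" ;;
      n) ETCD_NAMESPACE="$OPTARG" ;;
      *) echo "usage: $0 [-e etcd_name] [-s etcd_service] [-n etcd_namespace]" >&2
         return 1 ;;
    esac
  done
}

parse_args "$@" && echo "etcd=${ETCD_NAME} service=${ETCD_SERVICE} namespace=${ETCD_NAMESPACE}"
```

Any flag that is omitted keeps its default, so passing only `-n my-namespace` changes the namespace and leaves the name and service at `kamaji-etcd`.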

finally, the script will provide the new certificates dates and fingerprint;
## Notes

> _NOTE:_ tenant control plane pods are gonna fail with `Error 3/4` but them will auto-heal in about a minute.
- Tenant Control Plane pods may fail with `Error 3/4` but will auto-heal in about a minute.
- Ensure you have the necessary permissions to create and delete roles and role bindings in the specified namespace.

## Debug mode

At the beginning of the script, the following line sets the script to run in debug mode if the environment variable `DEBUG` is set to `1`:
To run the script in debug mode, set the environment variable `DEBUG` to `1`:

``` bash
if [ "${DEBUG}" = 1 ]; then
set -x
fi
export DEBUG=1
```