Commit

Merge pull request #28 from cybozu-go/restore-design-doc
add design doc about restore processes
satoru-takeuchi authored Jul 11, 2024
2 parents 4d07d80 + c39b660 commit 1cf3c5c
Showing 1 changed file with 96 additions and 27 deletions.
123 changes: 96 additions & 27 deletions docs/design.md
@@ -8,7 +8,7 @@ We want to backup and restore RBD PVCs managed by a Rook/Ceph cluster, either by

1. Backup arbitrary RBD PVCs.
2. Restore RBD PVCs from backups.
3. Backup arbitary RBD PVCs periodically.
3. Backup arbitrary RBD PVCs periodically.
4. Copy backup data to another cluster in another data center.

Currently, goals 1 and 3 are implemented. The other goals will be achieved later.
@@ -21,39 +21,71 @@ Currently, the goal 1 and 3 are implemented. Other goals will be achieved later.
flowchart LR
style Architecture fill:#FFFFFF
USER([User])
subgraph Architecture
USER([User])
RBSC[mantle-controller]
RPB[MantleBackup]
PVC[PersistentVolumeClaim]
PV[PersistentVolume]
RI[RBD Image]
RS[RBD Snapshot]
MBC[MantleBackupConfig]
MBCCronJob[CronJob]
%% restore
MR -- point --> MB
MRR -- watch --> MR
MRR -- create/delete --> RC
MRR -- create/delete --> RES_PVC
MRR -- create/delete --> RES_PV
USER -- create/delete --> MR
RES_PVC -- consume --> RES_PV
MR -.-|related| RC
RES_PV -- point --> RC
RC -- point --> RS
%% backup config
MBCCronJob -- create/delete --> MB
MBCR -- watch --> MBC
MBC -- point --> SRC_PVC
MBCR -- create --> MBCCronJob
MBCCronJob -.-|related| MBC
%% backup
MB -.-|related| RS
USER -- create/delete --> MB
MBR -- watch --> MB
MB -- point --> SRC_PVC
SRC_PVC -- consume --> SRC_PV
USER -- create/delete --> MBC
MBR -- create/delete --> RS
SRC_PV -- point --> RI
RS -- point --> RI
subgraph Kubernetes Layer
USER -- create/delete --> RPB
RBSC -- watch --> RPB
RPB -- point --> PVC
PVC -- consume --> PV
USER -- create/delete --> MBC
RBSC -- watch --> MBC
RBSC -- create --> MBCCronJob
MBCCronJob -- create/delete --> RPB
MBCCronJob -.-|related| MBC
MBC -- point --> PVC
end
subgraph Ceph Layer
RBSC -- create/delete --> RS
PV -- point --> RI
RS -- point --> RI
RI[RBD Image]
RS[RBD Snapshot]
RC[RBD cloned Image]
end
subgraph Kubernetes Layer
SRC_PVC[source PersistentVolumeClaim]
SRC_PV[source PersistentVolume]
subgraph Mantle controller
MBCR[MantleBackupConfigReconciler]
MBR[MantleBackupReconciler]
MRR[MantleRestoreReconciler]
end
subgraph Backup related manifests
MBC[MantleBackupConfig]
MBCCronJob[CronJob]
MB[MantleBackup]
end
subgraph Restore related manifests
MR[MantleRestore]
RES_PVC[restored PersistentVolumeClaim]
RES_PV[restored PersistentVolume]
end
end
end
```

@@ -88,6 +120,7 @@ apiVersion: mantle.cybozu.io/v1
kind: MantleBackup
metadata:
  name: <MantleBackup resource name>
  namespace: <should be the same as the target PVC>
spec:
  # The name of the backup target PVC
  pvc: <target PVC name>
@@ -111,3 +144,39 @@ spec:
  expire: 2w # when the MantleBackups generated by this MantleBackupConfig should expire.
  suspend: false # whether the periodic backup is active or not.
```

### Restore flow

Precondition: the restore process does not start until the following condition is met.
- The target MantleBackup must exist and be ready to use.

1. Users create a `MantleRestore` resource.
2. The controller gets the target MantleBackup from the `MantleRestore` resource.
3. The controller stores the pool name in the `status.pool` field and the cluster ID in the `status.clusterID` field. These values are used to remove the restored PV/PVC when the `MantleRestore` resource is deleted.
4. The controller gets the name of the backup target RBD snapshot from the MantleBackup.
5. The controller creates a new RBD clone image from the RBD snapshot.
6. The controller creates a new PV/PVC using the above-mentioned RBD clone image.
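
The restore steps above can be sketched as a single idempotent reconcile pass. This is only an illustration under stated assumptions, not the controller's implementation: it is written in Python against an in-memory stand-in for the cluster, and every name in it (`FakeCluster`, `reconcile_restore`, the `snapshot_name`, `pool`, and `cluster_id` fields, and the clone naming scheme) is hypothetical. Where the pool name and cluster ID actually come from is not stated in this document; the sketch simply assumes they are known alongside the backup.

```python
from dataclasses import dataclass, field


@dataclass
class MantleBackup:
    name: str
    namespace: str
    snapshot_name: str        # hypothetical: name of the RBD snapshot
    pool: str                 # hypothetical: assumed to be known here
    cluster_id: str           # hypothetical: assumed to be known here
    ready_to_use: bool = False


@dataclass
class MantleRestore:
    name: str
    namespace: str
    backup: str                                  # corresponds to spec.backup
    status: dict = field(default_factory=dict)


class FakeCluster:
    """In-memory stand-in for the Kubernetes API and Ceph (illustration only)."""

    def __init__(self):
        self.backups = {}     # (namespace, name) -> MantleBackup
        self.clones = {}      # clone image name -> source snapshot name
        self.volumes = set()  # clone images that back a restored PV/PVC


def reconcile_restore(cluster, restore):
    """Run one pass; return True when done, False when it must be retried."""
    # Precondition and step 2: the target MantleBackup must exist and be ready.
    backup = cluster.backups.get((restore.namespace, restore.backup))
    if backup is None or not backup.ready_to_use:
        return False
    # Step 3: record what cleanup will need later.
    restore.status["pool"] = backup.pool
    restore.status["clusterID"] = backup.cluster_id
    # Steps 4-5: clone the snapshot; setdefault keeps a retry from re-cloning.
    clone = f"mantle-{restore.namespace}-{restore.name}"  # hypothetical naming
    cluster.clones.setdefault(clone, backup.snapshot_name)
    # Step 6: create the restored PV/PVC on top of the clone image.
    cluster.volumes.add(clone)
    return True
```

Returning `False` instead of raising models the precondition: the pass is simply retried until the backup becomes ready, and running it again after success changes nothing.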

### Cleanup restore flow

1. Users delete the `MantleRestore` resource.
2. The controller tries to delete the PV/PVC created by the `MantleRestore` resource and waits until the Pods consuming the PV/PVC are stopped and deleted.
3. The controller removes the RBD clone image created by the `MantleRestore` resource. However, the controller should not remove the RBD clone image if the previous step has not completed and the PV/PVC still exists.
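
The ordering constraint in the cleanup steps above (never remove the clone image while the PV/PVC still exists) can be sketched as follows. This is a minimal Python model under assumptions, not the controller's code: `RestoreState` and `cleanup_step` are hypothetical names, and the PV/PVC pair is collapsed into a single `volume_exists` flag.

```python
from dataclasses import dataclass, field


@dataclass
class RestoreState:
    volume_exists: bool = True        # the restored PV/PVC pair
    deletion_requested: bool = False  # whether deletion has been asked for
    clone_exists: bool = True         # the RBD clone image
    consuming_pods: set = field(default_factory=set)


def cleanup_step(state):
    """Run one cleanup pass; return True when cleanup is complete."""
    if state.volume_exists:
        # Step 2: request deletion, then wait while Pods still use the volume.
        state.deletion_requested = True
        if state.consuming_pods:
            return False              # retry later; the clone is left untouched
        state.volume_exists = False
    # Step 3: only reached once no PV/PVC remains.
    state.clone_exists = False
    return True
```

A caller would repeat `cleanup_step` until it returns `True`; the early return is what guarantees that the clone image outlives any PV/PVC that still points at it.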

#### The manifest to get a restored PV/PVC from a backup

```yaml
apiVersion: mantle.cybozu.io/v1
kind: MantleRestore
metadata:
  name: <MantleRestore resource name>
  namespace: <should be the same as the target MantleBackup>
spec:
  # The name of the restore target backup
  backup: <MantleBackup resource name>
status:
  conditions:
    # The corresponding restore PV/PVC is ready to use if `status` is "True"
    - type: "ReadyToUse"
      status: "True"
```
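
A client that wants to wait for the restored PV/PVC could check the `ReadyToUse` condition shown in the manifest above. The helper below is a hypothetical sketch that assumes the resource has already been loaded into a plain Python dict (for example, from parsed YAML or JSON); `is_ready_to_use` is not part of any Mantle API.

```python
def is_ready_to_use(resource):
    """Return True if the resource has a ReadyToUse condition with status "True"."""
    # `status` may be missing or empty on a freshly created resource.
    conditions = (resource.get("status") or {}).get("conditions") or []
    return any(
        cond.get("type") == "ReadyToUse" and cond.get("status") == "True"
        for cond in conditions
    )
```

Note that the condition's `status` is the string `"True"`, not a boolean, so the comparison is made against the string.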