Skip to content

Latest commit

 

History

History
78 lines (65 loc) · 5.75 KB

index.adoc

File metadata and controls

78 lines (65 loc) · 5.75 KB

Backup and restore

Control plane backup and restore operations

As a cluster administrator, you might need to stop an {product-title} cluster for a period and restart it later. Some reasons for restarting a cluster are that you need to perform maintenance on a cluster or want to reduce resource costs. In {product-title}, you can perform a graceful shutdown of a cluster so that you can easily restart the cluster later.

You must back up etcd data before shutting down a cluster; etcd is the key-value store for {product-title}, which persists the state of all resource objects. An etcd backup plays a crucial role in disaster recovery. In {product-title}, you can also replace an unhealthy etcd member.

When you want to get your cluster running again, restart the cluster gracefully.

Note

A cluster’s certificates expire one year after the installation date. You can shut down a cluster and expect it to restart gracefully while the certificates are still valid. Although the cluster automatically retrieves the expired control plane certificates, you must still approve the certificate signing requests (CSRs).

You might run into several situations where {product-title} does not work as expected, such as:

  • You have a cluster that is not functional after the restart because of unexpected conditions, such as node failure, or network connectivity issues.

  • You have deleted something critical in the cluster by mistake.

  • You have lost the majority of your control plane hosts, leading to etcd quorum loss.

You can always recover from a disaster situation by restoring your cluster to its previous state using the saved etcd snapshots.

Application backup and restore operations

As a cluster administrator, you can back up and restore applications running on {product-title} by using the OpenShift API for Data Protection (OADP).