Docs for PCR on Cloud cluster #19503
LGTM, except I think we need to go through and replace 'replication-streams' with 'physical-replication-streams', 'target' with 'standby', and 'source' with 'primary'.
looks great - thanks for all the last min changes!!!
- Physical cluster replication is supported in CockroachDB {{ site.data.products.core }} clusters on v23.2 or later. The primary cluster can be a [new]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#step-1-create-the-primary-cluster) or [existing]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#set-up-pcr-from-an-existing-cluster) cluster. The standby cluster must be a [new cluster started with the `--virtualized-empty` flag]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#step-2-create-the-standby-cluster).
- Physical cluster replication is supported in:
- CockroachDB {{ site.data.products.core }} clusters on v23.2 or later. The primary cluster can be a [new]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#step-1-create-the-primary-cluster) or [existing]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#set-up-pcr-from-an-existing-cluster) cluster. The standby cluster must be a [new cluster started with the --virtualized-empty flag]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#step-2-create-the-standby-cluster).
- [CockroachDB {{ site.data.products.advanced }} in clusters]({% link cockroachcloud/physical-cluster-replication.md %}) on v25.2 or later.
We support it in clusters 24.3 or later
CockroachDB **physical cluster replication (PCR)** continuously sends all data at the cluster level from a _primary_ cluster to an independent _standby_ cluster. Existing data and ongoing changes on the active primary cluster, which is serving application data, replicate asynchronously to the passive standby cluster.

PCR provides a **two-datacenter resiliency strategy** where clusters are limited to two regions. In a disaster recovery scenario, you can [fail over](#fail-over-to-the-standby-cluster) from the unavailable primary cluster to the standby cluster. This will stop the PCR stream, reset the standby cluster to a point in time where all ingested data is consistent, and mark the standby as ready to accept application traffic.
can we say 'to the available standby cluster'
You'll need the following:

- **Two CockroachDB {{ site.data.products.advanced }} clusters.** To set up PCR successfully, configure your clusters as per the following:
can we add a big note here that says 'you must have the pcr flag enabled when you create the cluster'
}
~~~

- `failover_at`: The requested timestamp for failover. If you used `"status":"FAILING_OVER"` to initiate the failover and omitted `failover_at`, the failover time will default to the latest consistent replicated time.
instead of 'if you used' can we phrase it like 'you can use'
or make it a 'tip'
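For reference while rewording this, here is a minimal sketch of the `failover_at` behavior the excerpt describes: the request body carries `"status": "FAILING_OVER"` and includes `failover_at` only when a specific time is requested. The helper name `build_failover_body` and the ISO 8601 UTC timestamp format are assumptions for illustration; the Cloud API reference is the authority on the actual schema.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def build_failover_body(failover_at: Optional[datetime] = None) -> str:
    """Build a JSON body that requests failover (hypothetical helper)."""
    body = {"status": "FAILING_OVER"}
    if failover_at is not None:
        if failover_at.tzinfo is None:
            raise ValueError("failover_at must be timezone-aware")
        # Assumption: the API accepts an ISO 8601 timestamp in UTC.
        body["failover_at"] = failover_at.astimezone(timezone.utc).isoformat()
    # When failover_at is omitted, the excerpt says the failover time
    # defaults to the latest consistent replicated time.
    return json.dumps(body)
```

Leaving the key out entirely, rather than sending a null value, mirrors the "omitted `failover_at`" wording in the excerpt.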
### Fail back to the primary cluster

To fail back from the standby to the primary cluster, start another PCR stream with the standby cluster as the `primary_cluster_id` and the original primary cluster as the `standby_cluster_id`. You can only fail back to the original primary cluster if the cluster was created with the `"support_physical_cluster_replication"` set to `true`.
can we remove the final sentence on the 'support pcr' set to true
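The role swap in the first sentence of that excerpt can be sketched as below. Only the `primary_cluster_id` and `standby_cluster_id` field names come from the excerpt; the function name and the sample IDs in the usage note are hypothetical.

```python
from typing import Dict


def build_failback_body(original_primary_id: str, original_standby_id: str) -> Dict[str, str]:
    """Return a stream-creation body with the cluster roles swapped (hypothetical helper)."""
    if original_primary_id == original_standby_id:
        raise ValueError("primary and standby must be different clusters")
    return {
        # The former standby now acts as the primary of the new stream...
        "primary_cluster_id": original_standby_id,
        # ...and the original primary becomes its standby.
        "standby_cluster_id": original_primary_id,
    }
```

For example, `build_failback_body("cluster-a", "cluster-b")` yields a body whose `primary_cluster_id` is `"cluster-b"`.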
LGTM, non-blocking comments
@@ -1,3 +1,6 @@
- Physical cluster replication is supported in CockroachDB {{ site.data.products.core }} clusters on v23.2 or later. The primary cluster can be a [new]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#step-1-create-the-primary-cluster) or [existing]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#set-up-pcr-from-an-existing-cluster) cluster. The standby cluster must be a [new cluster started with the `--virtualized-empty` flag]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#step-2-create-the-standby-cluster).
- Physical cluster replication is supported in:
suggest code formatting for the `--virtualized-empty` flag unless maybe it broke something?
- Physical cluster replication is supported in:
- CockroachDB {{ site.data.products.core }} clusters on v23.2 or later. The primary cluster can be a [new]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#step-1-create-the-primary-cluster) or [existing]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#set-up-pcr-from-an-existing-cluster) cluster. The standby cluster must be a [new cluster started with the --virtualized-empty flag]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#step-2-create-the-standby-cluster).
- [CockroachDB {{ site.data.products.advanced }} in clusters]({% link cockroachcloud/physical-cluster-replication.md %}) on v24.3 or later.
- The primary and standby clusters must have the same [zone configurations]({% link {{ page.version.version }}/configure-replication-zones.md %}) in CockroachDB self-hosted.
i think "CockroachDB self-hosted" can be expressed as `CockroachDB {{ site.data.products.core }}`
sorry to annoy, i spent a lot of time searching for these free-form usages a while ago :-D
CockroachDB **physical cluster replication (PCR)** continuously sends all data at the cluster level from a _primary_ cluster to an independent _standby_ cluster. Existing data and ongoing changes on the active primary cluster, which is serving application data, replicate asynchronously to the passive standby cluster.

PCR provides a **two-datacenter resiliency strategy** where clusters are limited to two regions. In a disaster recovery scenario, you can [fail over](#fail-over-to-the-standby-cluster) from the unavailable primary cluster to the available standby cluster. This will stop the PCR stream, reset the standby cluster to a point in time where all ingested data is consistent, and mark the standby as ready to accept application traffic.
this could be followed by `(2DC)` in parens or something, since I have also seen it referred to as that
In this guide, you'll use the [{{ site.data.products.cloud }} API]({% link cockroachcloud/cloud-api.md %}) to set up PCR from a primary cluster to a standby cluster, monitor the PCR stream, and fail over from the primary to the standby cluster.

{{site.data.alerts.callout_info}}
PCR is supported on CockroachDB {{ site.data.products.advanced }} and CockroachDB self-hosted clusters. For a guide to setting up PCR on CockroachDB self-hosted, refer to the [Set Up Physical Cluster Replication]({% link {{ site.current_cloud_version }}/set-up-physical-cluster-replication.md %}) tutorial.
as mentioned above i think "self-hosted" can be expressed as {{ site.data.products.core }}
- Clusters must be in the same cloud (AWS, GCP, or Azure).
- Clusters must be single [region]({% link cockroachcloud/regions.md %}) (multiple availability zones per cluster is supported).
- The primary and standby cluster in AWS and Azure must be in different regions.
- The primary and standby cluster in GCP can be in the same region, but must not have overlapping CIDR ranges.
is there a CIDR range setup thing we can link to?
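To make the four constraints in that excerpt concrete, here is a minimal pre-flight check, assuming the caller already knows each cluster's cloud, regions, and CIDR range. `ClusterSpec` and `pcr_config_problems` are hypothetical names, not part of any CockroachDB API, and the CIDR check is applied to every GCP pair because the excerpt does not say whether it is limited to same-region pairs.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical description of one cluster; not an API type."""
    cloud: str                # "AWS", "GCP", or "AZURE"
    regions: Tuple[str, ...]  # region names the cluster spans
    cidr: str                 # the cluster's network range, e.g. "10.0.0.0/16"


def pcr_config_problems(primary: ClusterSpec, standby: ClusterSpec) -> List[str]:
    """Return human-readable violations of the listed PCR constraints."""
    problems = []
    if primary.cloud != standby.cloud:
        problems.append("clusters must be in the same cloud")
    for role, cluster in (("primary", primary), ("standby", standby)):
        if len(cluster.regions) != 1:
            problems.append(f"{role} cluster must be single-region")
    shared_regions = set(primary.regions) & set(standby.regions)
    if primary.cloud == standby.cloud:
        if primary.cloud in ("AWS", "AZURE") and shared_regions:
            problems.append("AWS and Azure clusters must be in different regions")
        if primary.cloud == "GCP":
            # ip_network.overlaps is True when either range contains
            # any address of the other.
            a = ipaddress.ip_network(primary.cidr)
            b = ipaddress.ip_network(standby.cidr)
            if a.overlaps(b):
                problems.append("GCP clusters must not have overlapping CIDR ranges")
    return problems
```

An empty list means none of the listed constraints is violated; it does not confirm that the Cloud API will accept the pair.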
- `"primary_cluster_id"`, `"standby_cluster_id"`: The cluster IDs of the primary and standby clusters.
- `"created_at"`: The time at which the PCR stream was created.

To start PCR between clusters, CockroachDB {{ site.data.products.cloud }} sets up VPC peering between clusters and validates the connectivity. As a result, it may take around 5 minutes to initialize the PCR job during which the status will be `STARTING`.
do we have cloud docs on setting up VPC peering to link to?
Replace:

- `api_secret_key` with your API secret key.
- `job_id` with the PCR job's ID. You can find this in the response from when you created the PCR stream.
same as above re: "job" linking to some kind of job docs?
- `"id"`: The ID of the PCR stream.
- `"status"`: The status of the PCR stream. For descriptions, refer to [Status](#status).
- `"primary_cluster_id"`, `"standby_cluster_id"`: The cluster IDs of the primary and standby clusters.
- `"created_at"`: The time at which the PCR stream was created.
same q as above re: what exactly are these times, are they SQL timestamps, etc.
They're date-time, I've linked to the timestamp docs, also the schema of the response is listed in the Cloud API docs.
So, I've also added a link to the API ref docs in the prereq section for schema details on the responses.
Status | Description
-------+------------
`STARTING` | Setting up VPC peering between clusters and validating the connectivity.
link to VPC peering docs? if not pls ignore
-------+------------
`STARTING` | Setting up VPC peering between clusters and validating the connectivity.
`REPLICATING` | Completing an initial scan and then continuing ongoing replication between the primary and standby clusters.
`FAILING_OVER` | Initiating the failover from the primary to the standby cluster.
could "fail over" link to the failover section?
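Since the excerpts say a new stream can sit in `STARTING` for around 5 minutes, a client will likely poll until the status changes. A minimal sketch follows, assuming the caller supplies a `fetch_status` callable that returns the stream's `"status"` string; the callable, the function name, and the default timings are all hypothetical.

```python
import time
from typing import Callable


def wait_until_started(
    fetch_status: Callable[[], str],
    timeout_s: float = 600.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the stream leaves STARTING and return the new status."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status != "STARTING":
            # Typically REPLICATING; the caller decides how to treat others.
            return status
        if clock() >= deadline:
            raise TimeoutError("PCR stream is still STARTING after the timeout")
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting.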
TFTRs!
DOC-10050
Added docs for running PCR on Cloud Advanced clusters with the API. This adds the page under the Cloud Deployments section of the docs.