Docs for PCR on Cloud cluster #19503

Merged
merged 6 commits into from
Jun 27, 2025

Contributor

@katmayb katmayb commented Apr 7, 2025

DOC-10050

Added docs for running PCR on Cloud Advanced clusters with the API. This adds the page under the Cloud Deployments section of the docs.

@katmayb katmayb changed the title Docs for PCR on Cloud clyster v25.2 Docs for PCR on Cloud cluster v25.2 Apr 7, 2025

github-actions bot commented Apr 7, 2025

Files changed:

netlify bot commented Apr 7, 2025

Deploy Preview for cockroachdb-interactivetutorials-docs canceled.

Name Link
🔨 Latest commit a196009
🔍 Latest deploy log https://app.netlify.com/projects/cockroachdb-interactivetutorials-docs/deploys/685e94f19cf9d70008d522e1

netlify bot commented Apr 7, 2025

Deploy Preview for cockroachdb-api-docs canceled.

Name Link
🔨 Latest commit a196009
🔍 Latest deploy log https://app.netlify.com/projects/cockroachdb-api-docs/deploys/685e94f103ae740008ce5075

netlify bot commented Apr 7, 2025

Netlify Preview

Name Link
🔨 Latest commit a196009
🔍 Latest deploy log https://app.netlify.com/projects/cockroachdb-docs/deploys/685e94f1b9c8a3000867cd4b
😎 Deploy Preview https://deploy-preview-19503--cockroachdb-docs.netlify.app

@katmayb katmayb marked this pull request as ready for review May 2, 2025 15:47
@katmayb katmayb requested review from alicia-l2 and davidwding May 2, 2025 17:39
@katmayb katmayb changed the title Docs for PCR on Cloud cluster v25.2 Docs for PCR on Cloud cluster Jun 6, 2025
@alicia-l2 alicia-l2 left a comment

LGTM except I think we need to go through and replace 'replication-streams' with 'physical-replication-streams' and target with standby, source with primary

@katmayb katmayb requested a review from alicia-l2 June 25, 2025 18:26
@alicia-l2 alicia-l2 left a comment

looks great - thanks for all the last min changes!!!

- Physical cluster replication is supported in CockroachDB {{ site.data.products.core }} clusters on v23.2 or later. The primary cluster can be a [new]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#step-1-create-the-primary-cluster) or [existing]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#set-up-pcr-from-an-existing-cluster) cluster. The standby cluster must be a [new cluster started with the `--virtualized-empty` flag]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#step-2-create-the-standby-cluster).
- Physical cluster replication is supported in:
- CockroachDB {{ site.data.products.core }} clusters on v23.2 or later. The primary cluster can be a [new]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#step-1-create-the-primary-cluster) or [existing]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#set-up-pcr-from-an-existing-cluster) cluster. The standby cluster must be a [new cluster started with the --virtualized-empty flag]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#step-2-create-the-standby-cluster).
- [CockroachDB {{ site.data.products.advanced }} in clusters]({% link cockroachcloud/physical-cluster-replication.md %}) on v25.2 or later.

We support it in clusters 24.3 or later


CockroachDB **physical cluster replication (PCR)** continuously sends all data at the cluster level from a _primary_ cluster to an independent _standby_ cluster. Existing data and ongoing changes on the active primary cluster, which is serving application data, replicate asynchronously to the passive standby cluster.

PCR provides a **two-datacenter resiliency strategy** where clusters are limited to two regions. In a disaster recovery scenario, you can [fail over](#fail-over-to-the-standby-cluster) from the unavailable primary cluster to the standby cluster. This will stop the PCR stream, reset the standby cluster to a point in time where all ingested data is consistent, and mark the standby as ready to accept application traffic.

can we say 'to the available standby cluster'


You'll need the following:

- **Two CockroachDB {{ site.data.products.advanced }} clusters.** To set up PCR successfully, configure your clusters as per the following:

can we add a big note here that says 'you must have the pcr flag enabled when you create the cluster'

}
~~~

- `failover_at`: The requested timestamp for failover. If you used `"status":"FAILING_OVER"` to initiate the failover and omitted `failover_at`, the failover time will default to the latest consistent replicated time.
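
The `failover_at` behavior described above can be sketched as a small helper that builds the request body. This is a minimal illustration, not the documented client: the function name `build_failover_body` is hypothetical, and the RFC 3339 UTC string format is an assumption based on the thread describing these fields as date-time values. Only the `status`, `FAILING_OVER`, and `failover_at` names come from the quoted docs.

```python
from datetime import datetime, timezone
from typing import Optional


def build_failover_body(failover_at: Optional[datetime] = None) -> dict:
    """Build a JSON-serializable body that requests a failover.

    Hypothetical helper: when failover_at is omitted, the key is left out,
    which per the quoted docs means the failover time defaults to the
    latest consistent replicated time.
    """
    body = {"status": "FAILING_OVER"}
    if failover_at is not None:
        if failover_at.tzinfo is None:
            raise ValueError("failover_at must be timezone-aware")
        # Assumption: the API accepts an RFC 3339 timestamp in UTC.
        utc = failover_at.astimezone(timezone.utc)
        body["failover_at"] = utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    return body
```

Leaving the key out entirely, rather than sending a null value, mirrors the "omitted `failover_at`" wording in the bullet.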

instead of 'if you used' can we phrase it like 'you can use'

or make it a 'tip'


### Fail back to the primary cluster

To fail back from the standby to the primary cluster, start another PCR stream with the standby cluster as the `primary_cluster_id` and the original primary cluster as the `standby_cluster_id`. You can only fail back to the original primary cluster if the cluster was created with the `"support_physical_cluster_replication"` set to `true`.
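
The role swap described above can be sketched as follows. This is only an illustration under stated assumptions: `build_failback_body` is a hypothetical helper, the cluster IDs in the usage example are placeholders, and the sketch assumes the body for a new PCR stream needs only the two cluster ID fields named in the quoted docs.

```python
def build_failback_body(stream: dict) -> dict:
    """Return a body for a new PCR stream with the cluster roles reversed.

    `stream` is the description of the stream that was failed over; only
    its "primary_cluster_id" and "standby_cluster_id" fields are read.
    """
    return {
        # The former standby now serves traffic, so it becomes the primary.
        "primary_cluster_id": stream["standby_cluster_id"],
        # The original primary becomes the standby for the new stream.
        "standby_cluster_id": stream["primary_cluster_id"],
    }


# Usage with placeholder IDs (not real clusters).
original = {"primary_cluster_id": "cluster-a", "standby_cluster_id": "cluster-b"}
failback = build_failback_body(original)
```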

can we remove the final sentence on the 'support pcr' set to true

@katmayb katmayb requested a review from rmloveland June 26, 2025 15:31
Contributor

@rmloveland rmloveland left a comment

LGTM, non-blocking comments

@@ -1,3 +1,6 @@
- Physical cluster replication is supported in CockroachDB {{ site.data.products.core }} clusters on v23.2 or later. The primary cluster can be a [new]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#step-1-create-the-primary-cluster) or [existing]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#set-up-pcr-from-an-existing-cluster) cluster. The standby cluster must be a [new cluster started with the `--virtualized-empty` flag]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#step-2-create-the-standby-cluster).
- Physical cluster replication is supported in:
Contributor

suggest code formatting for the --virtualized-empty flag unless maybe it broke something?

- Physical cluster replication is supported in:
- CockroachDB {{ site.data.products.core }} clusters on v23.2 or later. The primary cluster can be a [new]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#step-1-create-the-primary-cluster) or [existing]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#set-up-pcr-from-an-existing-cluster) cluster. The standby cluster must be a [new cluster started with the --virtualized-empty flag]({% link {{ page.version.version }}/set-up-physical-cluster-replication.md %}#step-2-create-the-standby-cluster).
- [CockroachDB {{ site.data.products.advanced }} in clusters]({% link cockroachcloud/physical-cluster-replication.md %}) on v24.3 or later.
- The primary and standby clusters must have the same [zone configurations]({% link {{ page.version.version }}/configure-replication-zones.md %}) in CockroachDB self-hosted.
Contributor

i think "CockroachDB self-hosted" can be expressed as

CockroachDB {{ site.data.products.core }}

sorry to annoy, i spent a lot of time searching for these free-form usages a while ago :-D


CockroachDB **physical cluster replication (PCR)** continuously sends all data at the cluster level from a _primary_ cluster to an independent _standby_ cluster. Existing data and ongoing changes on the active primary cluster, which is serving application data, replicate asynchronously to the passive standby cluster.

PCR provides a **two-datacenter resiliency strategy** where clusters are limited to two regions. In a disaster recovery scenario, you can [fail over](#fail-over-to-the-standby-cluster) from the unavailable primary cluster to the available standby cluster. This will stop the PCR stream, reset the standby cluster to a point in time where all ingested data is consistent, and mark the standby as ready to accept application traffic.
Contributor

this could be followed by (2DC) in parens or something since I have also seen it referred to as that

In this guide, you'll use the [{{ site.data.products.cloud }} API]({% link cockroachcloud/cloud-api.md %}) to set up PCR from a primary cluster to a standby cluster, monitor the PCR stream, and fail over from the primary to the standby cluster.

{{site.data.alerts.callout_info}}
PCR is supported on CockroachDB {{ site.data.products.advanced }} and CockroachDB self-hosted clusters. For a guide to setting up PCR on CockroachDB self-hosted, refer to the [Set Up Physical Cluster Replication]({% link {{ site.current_cloud_version }}/set-up-physical-cluster-replication.md %}) tutorial.
Contributor

as mentioned above i think "self-hosted" can be expressed as {{ site.data.products.core }}

- Clusters must be in the same cloud (AWS, GCP, or Azure).
- Each cluster must be in a single [region]({% link cockroachcloud/regions.md %}) (multiple availability zones per cluster are supported).
- The primary and standby clusters in AWS and Azure must be in different regions.
- The primary and standby clusters in GCP can be in the same region, but must not have overlapping CIDR ranges.
Contributor

is there a CIDR range setup thing we can link to?

- `"primary_cluster_id"`, `"standby_cluster_id"`: The cluster IDs of the primary and standby clusters.
- `"created_at"`: The time at which the PCR stream was created.

To start PCR between clusters, CockroachDB {{ site.data.products.cloud }} sets up VPC peering between clusters and validates the connectivity. As a result, it may take around 5 minutes to initialize the PCR job during which the status will be `STARTING`.
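
Because the stream can stay in `STARTING` for several minutes, a caller would typically poll until the status changes. The following is a minimal sketch under stated assumptions: `wait_for_stream_start` and its parameters are hypothetical, and `fetch_status` stands in for whatever call retrieves the stream's `"status"` field, since the request itself is not shown in this thread.

```python
import time
from typing import Callable


def wait_for_stream_start(
    fetch_status: Callable[[], str],
    timeout_s: float = 600.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the PCR stream status is no longer STARTING.

    Returns the first status other than STARTING. Raises TimeoutError if
    the stream is still STARTING once timeout_s has elapsed.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status != "STARTING":
            return status
        if clock() >= deadline:
            raise TimeoutError("PCR stream is still STARTING")
        sleep(interval_s)
```

The default timeout of 10 minutes is an arbitrary margin over the "around 5 minutes" mentioned above, not a documented limit.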
Contributor

do we have cloud docs on setting up VPC peering to link to?

Replace:

- `api_secret_key` with your API secret key.
- `job_id` with the PCR job's ID. You can find this in the response from when you created the PCR stream.
Contributor

same as above re: "job" linking to some kind of job docs?

- `"id"`: The ID of the PCR stream.
- `"status"`: The status of the PCR stream. For descriptions, refer to [Status](#status).
- `"primary_cluster_id"`, `"standby_cluster_id"`: The cluster IDs of the primary and standby clusters.
- `"created_at"`: The time at which the PCR stream was created.
Contributor

same q as above re: what exactly are these times, are they SQL timestamps, etc.

Contributor Author

They're date-time, I've linked to the timestamp docs, also the schema of the response is listed in the Cloud API docs.

Contributor Author

So, I've also added a link to the API ref docs in the prereq section for schema details on the responses.


Status | Description
-------+------------
`STARTING` | Setting up VPC peering between clusters and validating the connectivity.
Contributor

link to VPC peering docs? if not pls ignore

-------+------------
`STARTING` | Setting up VPC peering between clusters and validating the connectivity.
`REPLICATING` | Completing an initial scan and then continuing ongoing replication between the primary and standby clusters.
`FAILING_OVER` | Initiating the failover from the primary to the standby cluster.
Contributor

could "fail over" link to the failover section?

Contributor Author

katmayb commented Jun 27, 2025

TFTRs!
@rmloveland You are the best at links always, thank you for the reminders!

@katmayb katmayb merged commit f20d6eb into main Jun 27, 2025
6 checks passed
@katmayb katmayb deleted the pcr-cloud-25.2 branch June 27, 2025 13:06