This is an automated cherry-pick of pingcap#14958
Signed-off-by: ti-chi-bot <[email protected]>
Oreoxmt authored and ti-chi-bot committed Oct 13, 2023
1 parent 6aa8e13 commit b679768
Showing 5 changed files with 29 additions and 2 deletions.
Binary file modified media/ticdc/cdc-architecture.png
10 changes: 10 additions & 0 deletions migration-tools.md
@@ -42,12 +42,22 @@ This document introduces the user scenarios, supported upstreams and downstreams

## [TiCDC](/ticdc/ticdc-overview.md)

<<<<<<< HEAD
| User scenario | <span style="font-weight:normal">This tool is implemented by pulling TiKV change logs. It can restore cluster data to a consistent state with any upstream TSO, and support other systems to subscribe to data changes.</span> |
|---|---|
| **Upstream** | TiDB |
| **Downstream** | TiDB, MySQL, Kafka, Confluent |
| **Advantages** | Provide TiCDC Open Protocol |
| **Limitation** | TiCDC only replicates tables that have at least one valid index. The following scenarios are not supported:<ul><li>The TiKV cluster that uses RawKV alone.</li><li>The DDL operation `CREATE SEQUENCE` and the `SEQUENCE` function in TiDB.</li></ul> |
=======
- **User scenario**: This tool is implemented by pulling TiKV change logs. It can restore cluster data to a consistent state with any upstream TSO, and support other systems to subscribe to data changes.
- **Upstream**: TiDB
- **Downstream**: TiDB, MySQL, Kafka, MQ, Confluent, storage services such as Amazon S3, GCS, Azure Blob Storage, and NFS.
- **Advantages**: Provide TiCDC Open Protocol
- **Limitation**: TiCDC only replicates tables that have at least one valid index. The following scenarios are not supported:
- The TiKV cluster that uses RawKV alone.
- The DDL operation `CREATE SEQUENCE` and the `SEQUENCE` function in TiDB.
>>>>>>> b17f60c637 (ticdc: add the Storage Sink feature to ticdc-overview.md (#14958))
## [Backup & Restore (BR)](/br/backup-and-restore-overview.md)

2 changes: 1 addition & 1 deletion production-deployment-using-tiup.md
@@ -275,7 +275,7 @@ The following examples cover seven common scenarios. You need to modify the conf
| :-- | :-- | :-- | :-- |
| OLTP | [Deploy minimal topology](/minimal-deployment-topology.md) | [Simple minimal configuration template](https://github.com/pingcap/docs/blob/master/config-templates/simple-mini.yaml) <br/> [Full minimal configuration template](https://github.com/pingcap/docs/blob/master/config-templates/complex-mini.yaml) | This is the basic cluster topology, including tidb-server, tikv-server, and pd-server. |
| HTAP | [Deploy the TiFlash topology](/tiflash-deployment-topology.md) | [Simple TiFlash configuration template](https://github.com/pingcap/docs/blob/master/config-templates/simple-tiflash.yaml) <br/> [Full TiFlash configuration template](https://github.com/pingcap/docs/blob/master/config-templates/complex-tiflash.yaml) | This is to deploy TiFlash along with the minimal cluster topology. TiFlash is a columnar storage engine, and gradually becomes a standard cluster topology. |
-| Replicate incremental data using [TiCDC](/ticdc/ticdc-overview.md) | [Deploy the TiCDC topology](/ticdc-deployment-topology.md) | [Simple TiCDC configuration template](https://github.com/pingcap/docs/blob/master/config-templates/simple-cdc.yaml) <br/> [Full TiCDC configuration template](https://github.com/pingcap/docs/blob/master/config-templates/complex-cdc.yaml) | This is to deploy TiCDC along with the minimal cluster topology. TiCDC supports multiple downstream platforms, such as TiDB, MySQL, and MQ. |
+| Replicate incremental data using [TiCDC](/ticdc/ticdc-overview.md) | [Deploy the TiCDC topology](/ticdc-deployment-topology.md) | [Simple TiCDC configuration template](https://github.com/pingcap/docs/blob/master/config-templates/simple-cdc.yaml) <br/> [Full TiCDC configuration template](https://github.com/pingcap/docs/blob/master/config-templates/complex-cdc.yaml) | This is to deploy TiCDC along with the minimal cluster topology. TiCDC supports multiple downstream platforms, such as TiDB, MySQL, Kafka, MQ, and storage services. |
| Replicate incremental data using [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) | [Deploy the TiDB Binlog topology](/tidb-binlog-deployment-topology.md) | [Simple TiDB Binlog configuration template (MySQL as downstream)](https://github.com/pingcap/docs/blob/master/config-templates/simple-tidb-binlog.yaml) <br/> [Simple TiDB Binlog configuration template (Files as downstream)](https://github.com/pingcap/docs/blob/master/config-templates/simple-file-binlog.yaml) <br/> [Full TiDB Binlog configuration template](https://github.com/pingcap/docs/blob/master/config-templates/complex-tidb-binlog.yaml) | This is to deploy TiDB Binlog along with the minimal cluster topology. |
| Use OLAP on Spark | [Deploy the TiSpark topology](/tispark-deployment-topology.md) | [Simple TiSpark configuration template](https://github.com/pingcap/docs/blob/master/config-templates/simple-tispark.yaml) <br/> [Full TiSpark configuration template](https://github.com/pingcap/docs/blob/master/config-templates/complex-tispark.yaml) | This is to deploy TiSpark along with the minimal cluster topology. TiSpark is a component built for running Apache Spark on top of TiDB/TiKV to answer the OLAP queries. Currently, TiUP cluster's support for TiSpark is still **experimental**. |
| Deploy multiple instances on a single machine | [Deploy a hybrid topology](/hybrid-deployment-topology.md) | [Simple configuration template for hybrid deployment](https://github.com/pingcap/docs/blob/master/config-templates/simple-multi-instance.yaml) <br/> [Full configuration template for hybrid deployment](https://github.com/pingcap/docs/blob/master/config-templates/complex-multi-instance.yaml) | The deployment topologies also apply when you need to add extra configurations for the directory, port, resource ratio, and label. |
2 changes: 1 addition & 1 deletion ticdc-deployment-topology.md
@@ -11,7 +11,7 @@ summary: Learn the deployment topology of TiCDC based on the minimal TiDB topolo
This document describes the deployment topology of [TiCDC](/ticdc/ticdc-overview.md) based on the minimal cluster topology.

-TiCDC is a tool for replicating the incremental data of TiDB, introduced in TiDB 4.0. It supports multiple downstream platforms, such as TiDB, MySQL, and MQ. Compared with TiDB Binlog, TiCDC has lower latency and native high availability.
+TiCDC is a tool for replicating the incremental data of TiDB, introduced in TiDB 4.0. It supports multiple downstream platforms, such as TiDB, MySQL, Kafka, MQ, and storage services. Compared with TiDB Binlog, TiCDC has lower latency and native high availability.

## Topology information

17 changes: 17 additions & 0 deletions ticdc/ticdc-overview.md
@@ -16,13 +16,26 @@ summary: Learn what TiCDC is, what features TiCDC provides, and how to install a

### Key capabilities

<<<<<<< HEAD
- Replicate incremental data from one TiDB cluster to another TiDB cluster with second-level RPO and minute-level RTO.
- Replicate data bidirectionally between TiDB clusters, based on which you can create a multi-active TiDB solution using TiCDC.
- Replicate incremental data from a TiDB cluster to a MySQL database (or other MySQL-compatible databases) with low latency.
- Replicate incremental data from a TiDB cluster to a Kafka cluster. The recommended data format includes [Canal-JSON](/ticdc/ticdc-canal-json.md) and [Avro](/ticdc/ticdc-avro-protocol.md).
- Replicate tables with the ability to filter databases, tables, DMLs, and DDLs.
- Be highly available with no single point of failure. Supports dynamically adding and deleting TiCDC nodes.
- Support cluster management through [Open API](/ticdc/ticdc-open-api.md), including querying task status, dynamically modifying task configuration, and creating or deleting tasks.
=======
TiCDC has the following key capabilities:

- Replicating incremental data between TiDB clusters with second-level RPO and minute-level RTO.
- Bidirectional replication between TiDB clusters, allowing the creation of a multi-active TiDB solution using TiCDC.
- Replicating incremental data from a TiDB cluster to a MySQL database or other MySQL-compatible databases with low latency.
- Replicating incremental data from a TiDB cluster to a Kafka cluster. The recommended data format includes [Canal-JSON](/ticdc/ticdc-canal-json.md) and [Avro](/ticdc/ticdc-avro-protocol.md).
- Replicating incremental data from a TiDB cluster to storage services, such as Amazon S3, GCS, Azure Blob Storage, and NFS.
- Replicating tables with the ability to filter databases, tables, DMLs, and DDLs.
- High availability with no single point of failure, supporting dynamically adding and deleting TiCDC nodes.
- Cluster management through [Open API](/ticdc/ticdc-open-api-v2.md), including querying task status, dynamically modifying task configuration, and creating or deleting tasks.
>>>>>>> b17f60c637 (ticdc: add the Storage Sink feature to ticdc-overview.md (#14958))
### Replication order

@@ -65,7 +78,11 @@ The components in the preceding architecture diagram are described as follows:
- TiCDC: TiCDC nodes where the TiCDC processes run. Each node runs a TiCDC process. Each process pulls data changes from one or more tables in TiKV nodes, and replicates the changes to the downstream system through the sink component.
- PD: The scheduling module in a TiDB cluster. This module is in charge of scheduling cluster data and usually consists of three PD nodes. PD provides high availability through the etcd cluster. In the etcd cluster, TiCDC stores its metadata, such as node status information and changefeed configurations.

<<<<<<< HEAD
As shown in the preceding architecture diagram, TiCDC supports replicating data to TiDB, MySQL, and Kafka databases.
=======
As shown in the architecture diagram, TiCDC supports replicating data to TiDB, MySQL, Kafka, and storage services.
>>>>>>> b17f60c637 (ticdc: add the Storage Sink feature to ticdc-overview.md (#14958))
## Best practices
