+
+
+Private Connect leverages **Private Link** or **Private Service Connect** technologies from cloud providers to enable resources in your VPC to connect to services in other VPCs using private IP addresses, as if those services were hosted directly within your VPC.
+
+Currently, Private Connect in TiDB Cloud supports only generic Kafka. It does not provide special integration with MSK, Confluent Kafka, or other managed Kafka services.
+
+- If your Apache Kafka service is hosted in AWS, follow [Set Up Self-Hosted Kafka Private Link Service in AWS](/tidb-cloud/setup-self-hosted-kafka-private-link-service.md) to ensure that the network connection is properly configured. After setup, provide the following information in the TiDB Cloud console to create the changefeed:
+
+ - The ID in Kafka Advertised Listener Pattern
+ - The Endpoint Service Name
+ - The Bootstrap Ports
+
+- If your Apache Kafka service is hosted in Google Cloud, follow [Set Up Self-Hosted Kafka Private Service Connect in Google Cloud](/tidb-cloud/setup-self-hosted-kafka-private-service-connect.md) to ensure that the network connection is properly configured. After setup, provide the following information in the TiDB Cloud console to create the changefeed:
+
+ - The ID in Kafka Advertised Listener Pattern
+ - The Service Attachment
+ - The Bootstrap Ports
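+
+For illustration only, the following sketch shows the typical shape of these values. All of them are hypothetical placeholders; take the real values from your own Private Link or Private Service Connect setup.
+
+```shell
+# Hypothetical placeholder values -- replace them with the values from your own setup.
+# The unique ID that you chose in the Kafka Advertised Listener Pattern:
+UNIQUE_ID="abc"
+# AWS only: the Endpoint Service Name of your Private Link service:
+ENDPOINT_SERVICE_NAME="com.amazonaws.vpce.us-west-2.vpce-svc-0123456789abcdef0"
+# Google Cloud only: the Service Attachment of your Private Service Connect service:
+SERVICE_ATTACHMENT="projects/my-project/regions/us-west1/serviceAttachments/my-kafka-attachment"
+# The Bootstrap Ports, comma-separated:
+BOOTSTRAP_PORTS="9092,9093,9094"
+```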
+
+
+
If your Apache Kafka service is in an AWS VPC that has no internet access, take the following steps:
@@ -39,7 +76,7 @@ If your Apache Kafka service is in an AWS VPC that has no internet access, take
3. If the Apache Kafka URL contains hostnames, you need to allow TiDB Cloud to resolve the DNS hostnames of the Apache Kafka brokers (a CLI sketch follows these sub-steps).
- 1. Follow the steps in [Enable DNS resolution for a VPC peering connection](https://docs.aws.amazon.com/vpc/latest/peering/modify-peering-connections.html#vpc-peering-dns).
+ 1. Follow the steps in [Enable DNS resolution for a VPC peering connection](https://docs.aws.amazon.com/vpc/latest/peering/vpc-peering-dns.html).
2. Enable the **Accepter DNS resolution** option.
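+
+    If you prefer the AWS CLI to the console, the following is a minimal sketch of the same change. It assumes that your Kafka VPC is the accepter side of the peering connection; `pcx-0123456789abcdef0` is a placeholder connection ID.
+
+    ```shell
+    # Enable DNS resolution on the accepter side of the VPC peering connection.
+    # Replace the placeholder ID with your own peering connection ID.
+    aws ec2 modify-vpc-peering-connection-options \
+      --vpc-peering-connection-id pcx-0123456789abcdef0 \
+      --accepter-peering-connection-options AllowDnsResolutionFromRemoteVpc=true
+    ```
+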
If your Apache Kafka service is in a Google Cloud VPC that has no internet access, take the following steps:
@@ -49,6 +86,16 @@ If your Apache Kafka service is in a Google Cloud VPC that has no internet acces
You must add the CIDR of the region where your TiDB Cloud cluster is located to the ingress firewall rules. You can find the CIDR on the **VPC Peering** page. Doing so allows traffic to flow from your TiDB cluster to the Kafka brokers.
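+
+As a sketch, the following gcloud command creates such an ingress rule. The network name, broker port, and CIDR below are placeholder assumptions; use your Kafka VPC network, your broker ports, and the CIDR shown on the **VPC Peering** page.
+
+```shell
+# Allow ingress from the TiDB Cloud region CIDR to the Kafka broker port.
+# Placeholders: my-kafka-vpc, 9092, and 10.10.0.0/18.
+gcloud compute firewall-rules create allow-tidb-cloud-to-kafka \
+  --network=my-kafka-vpc \
+  --direction=INGRESS \
+  --action=ALLOW \
+  --rules=tcp:9092 \
+  --source-ranges=10.10.0.0/18
+```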
+
+
+
+If you want to provide public IP access to your Apache Kafka service, assign public IP addresses to all your Kafka brokers.
+
+Using public IP access in a production environment is **NOT** recommended.
+
+
+
+
### Kafka ACL authorization
To allow TiDB Cloud changefeeds to stream data to Apache Kafka and create Kafka topics automatically, ensure that the following permissions are added in Kafka:
@@ -60,21 +107,71 @@ For example, if your Kafka cluster is in Confluent Cloud, you can see [Resources
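+
+If you run a self-hosted Kafka cluster instead, you can grant such permissions with the `kafka-acls.sh` tool that ships with Kafka. The following minimal sketch assumes a placeholder SASL user `ticdc` and grants it the **Create** and **Write** operations on all topics; adjust the principal, resources, and operations to match your own authorization setup.
+
+```shell
+# Placeholder broker address and principal; client.properties holds admin credentials.
+# --topic '*' applies the ACLs to all topics.
+kafka-acls.sh --bootstrap-server kafka-broker:9092 \
+  --command-config client.properties \
+  --add --allow-principal User:ticdc \
+  --operation Create --operation Write \
+  --topic '*'
+```
+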
## Step 1. Open the changefeed page for Apache Kafka
-1. In the [TiDB Cloud console](https://tidbcloud.com), navigate to the cluster overview page of the target TiDB cluster, and then click **Changefeed** in the left navigation pane.
-2. Click **Create Changefeed**, and select **Kafka** as **Target Type**.
+1. Log in to the [TiDB Cloud console](https://tidbcloud.com).
+2. Navigate to the cluster overview page of the target TiDB cluster, and then click **Changefeed** in the left navigation pane.
+3. Click **Create Changefeed**, and select **Kafka** as **Target Type**.
## Step 2. Configure the changefeed target
-1. Under **Brokers Configuration**, fill in your Kafka brokers endpoints. You can use commas `,` to separate multiple endpoints.
-2. Select an authentication option according to your Kafka authentication configuration.
+The steps vary depending on the connectivity method you select.
+
+
+
+
+1. In **Connectivity Method**, select **VPC Peering** or **Public IP**, and then fill in your Kafka broker endpoints. You can use commas `,` to separate multiple endpoints, as shown in the example after these steps.
+2. Select an **Authentication** option according to your Kafka authentication configuration.
- If your Kafka does not require authentication, keep the default option **Disable**.
- - If your Kafka requires authentication, select the corresponding authentication type, and then fill in the user name and password of your Kafka account for authentication.
+    - If your Kafka requires authentication, select the corresponding authentication type, and then fill in the **user name** and **password** of your Kafka account.
-3. Select your Kafka version. If you do not know that, use Kafka V2.
-4. Select a desired compression type for the data in this changefeed.
+3. Select your **Kafka Version**. If you do not know which one to use, use **Kafka v2**.
+4. Select a **Compression** type for the data in this changefeed.
5. Enable the **TLS Encryption** option if TLS encryption is enabled on your Kafka cluster and you want to use it for the Kafka connection.
-6. Click **Next** to check the configurations you set and go to the next page.
+6. Click **Next** to test the network connection. If the test succeeds, you will be directed to the next page.
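+
+For reference, step 1 expects a comma-separated list of broker endpoints. The addresses below are placeholders; use your brokers' private IP addresses for **VPC Peering** or their public addresses for **Public IP**.
+
+```shell
+# Comma-separated Kafka broker endpoints (placeholder addresses).
+BROKER_ENDPOINTS="10.0.1.10:9092,10.0.2.10:9092,10.0.3.10:9092"
+```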
+
+
+
+
+1. In **Connectivity Method**, select **Private Link**.
+2. Authorize the [AWS Principal](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_elements_principal.html#principal-accounts) of TiDB Cloud to create an endpoint for your endpoint service. The AWS Principal is provided in the tip on the web page.
+3. In the **Network** section, select the same **Number of AZs** and **AZ IDs of Kafka Deployment**, and fill in the same unique ID in **Kafka Advertised Listener Pattern**, as the ones you used in [Set Up Self-Hosted Kafka Private Link Service in AWS](/tidb-cloud/setup-self-hosted-kafka-private-link-service.md).
+4. Fill in the **Endpoint Service Name** that you configured in [Set Up Self-Hosted Kafka Private Link Service in AWS](/tidb-cloud/setup-self-hosted-kafka-private-link-service.md).
+5. Fill in the **Bootstrap Ports**. It is recommended that you set at least one port per AZ. You can use commas `,` to separate multiple ports.
+6. Select an **Authentication** option according to your Kafka authentication configuration.
+
+ - If your Kafka does not require authentication, keep the default option **Disable**.
+    - If your Kafka requires authentication, select the corresponding authentication type, and then fill in the **user name** and **password** of your Kafka account.
+
+7. Select your **Kafka Version**. If you do not know which one to use, use **Kafka v2**.
+8. Select a **Compression** type for the data in this changefeed.
+9. Enable the **TLS Encryption** option if TLS encryption is enabled on your Kafka cluster and you want to use it for the Kafka connection.
+10. Click **Next** to test the network connection. If the test succeeds, you will be directed to the next page.
+11. Wait for TiDB Cloud to create the endpoint for **Private Link**, which might take several minutes.
+12. After the endpoint is created, log in to your cloud provider console and accept the connection request. You can also accept it from the AWS CLI, as in the sketch after these steps.
+13. Return to the [TiDB Cloud console](https://tidbcloud.com) to confirm that you have accepted the connection request. TiDB Cloud will test the connection and proceed to the next page if the test succeeds.
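+
+If you prefer the AWS CLI to the console for step 12, the following sketch lists and accepts the pending connection. All resource IDs are placeholders; find the real ones under your endpoint service in the AWS console.
+
+```shell
+# List pending connection requests for your endpoint service (placeholder service ID).
+aws ec2 describe-vpc-endpoint-connections \
+  --filters Name=service-id,Values=vpce-svc-0123456789abcdef0
+
+# Accept the endpoint connection created by TiDB Cloud (placeholder IDs).
+aws ec2 accept-vpc-endpoint-connections \
+  --service-id vpce-svc-0123456789abcdef0 \
+  --vpc-endpoint-ids vpce-0123456789abcdef0
+```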
+
+
+
+
+1. In **Connectivity Method**, select **Private Service Connect**.
+2. In the **Network** section, fill in the same unique ID in **Kafka Advertised Listener Pattern** as the one you used in [Set Up Self-Hosted Kafka Private Service Connect in Google Cloud](/tidb-cloud/setup-self-hosted-kafka-private-service-connect.md).
+3. Fill in the **Service Attachment** that you configured in [Set Up Self-Hosted Kafka Private Service Connect in Google Cloud](/tidb-cloud/setup-self-hosted-kafka-private-service-connect.md).
+4. Fill in the **Bootstrap Ports**. It is recommended that you provide more than one port. You can use commas `,` to separate multiple ports.
+5. Select an **Authentication** option according to your Kafka authentication configuration.
+
+ - If your Kafka does not require authentication, keep the default option **Disable**.
+    - If your Kafka requires authentication, select the corresponding authentication type, and then fill in the **user name** and **password** of your Kafka account.
+
+6. Select your **Kafka Version**. If you do not know which one to use, use **Kafka v2**.
+7. Select a **Compression** type for the data in this changefeed.
+8. Enable the **TLS Encryption** option if TLS encryption is enabled on your Kafka cluster and you want to use it for the Kafka connection.
+9. Click **Next** to test the network connection. If the test succeeds, you will be directed to the next page.
+10. Wait for TiDB Cloud to create the endpoint for **Private Service Connect**, which might take several minutes.
+11. After the endpoint is created, log in to your cloud provider console and accept the connection request. You can also accept it from the gcloud CLI, as in the sketch after these steps.
+12. Return to the [TiDB Cloud console](https://tidbcloud.com) to confirm that you have accepted the connection request. TiDB Cloud will test the connection and proceed to the next page if the test succeeds.
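+
+If you prefer the gcloud CLI to the console for step 11, the following sketch accepts the pending connection on your service attachment. All names are placeholders; `tidb-cloud-project` stands for the consumer project shown in the connection request, and `10` is an arbitrary connection limit.
+
+```shell
+# Accept the pending consumer project (placeholder names, region, and limit).
+gcloud compute service-attachments update my-kafka-attachment \
+  --region=us-west1 \
+  --consumer-accept-list=tidb-cloud-project=10
+```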
+
+
+
## Step 3. Set the changefeed
@@ -91,8 +188,10 @@ For example, if your Kafka cluster is in Confluent Cloud, you can see [Resources
3. In the **Data Format** area, select your desired format of Kafka messages.
- - Avro is a compact, fast, and binary data format with rich data structures, which is widely used in various flow systems. For more information, see [Avro data format](https://docs.pingcap.com/tidb/stable/ticdc-avro-protocol).
- - Canal-JSON is a plain JSON text format, which is easy to parse. For more information, see [Canal-JSON data format](https://docs.pingcap.com/tidb/stable/ticdc-canal-json).
+    - Avro is a compact, fast, and binary data format with rich data structures, which is widely used in various streaming systems. For more information, see [Avro data format](https://docs.pingcap.com/tidb/stable/ticdc-avro-protocol).
+ - Canal-JSON is a plain JSON text format, which is easy to parse. For more information, see [Canal-JSON data format](https://docs.pingcap.com/tidb/stable/ticdc-canal-json).
+ - Open Protocol is a row-level data change notification protocol that provides data sources for monitoring, caching, full-text indexing, analysis engines, and primary-secondary replication between different databases. For more information, see [Open Protocol data format](https://docs.pingcap.com/tidb/stable/ticdc-open-protocol).
+ - Debezium is a tool for capturing database changes. It converts each captured database change into a message called an "event" and sends these events to Kafka. For more information, see [Debezium data format](https://docs.pingcap.com/tidb/stable/ticdc-debezium).
4. Enable the **TiDB Extension** option if you want to add TiDB-extension fields to the Kafka message body (see the sample message after this step).
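+
+For illustration, the following is an abridged sketch of a Canal-JSON `INSERT` event with made-up field values. The `_tidb` field is present only when the **TiDB Extension** option is enabled; see the linked documents above for the exact field sets of each format.
+
+```json
+{
+  "id": 0,
+  "database": "test",
+  "table": "orders",
+  "pkNames": ["id"],
+  "isDdl": false,
+  "type": "INSERT",
+  "es": 1639633141221,
+  "ts": 1639633142960,
+  "sql": "",
+  "data": [{"id": "1", "amount": "20"}],
+  "old": null,
+  "_tidb": {"commitTs": 429918007904436226}
+}
+```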
@@ -109,32 +208,40 @@ For example, if your Kafka cluster is in Confluent Cloud, you can see [Resources
The distribution mode controls how the changefeed creates Kafka topics: by table, by database, or one topic for all changelogs.
- - **Distribute changelogs by table to Kafka Topics**
+ - **Distribute changelogs by table to Kafka Topics**
If you want the changefeed to create a dedicated Kafka topic for each table, choose this mode. Then, all Kafka messages of a table are sent to a dedicated Kafka topic. You can customize topic names for tables by setting a topic prefix, a separator between a database name and table name, and a suffix. For example, if you set the separator as `_`, the topic names are in the format of `