Update bdr mode && add description related bdr sql #15925

Merged
merged 16 commits into from
Jan 18, 2024
1 change: 1 addition & 0 deletions TOC.md
@@ -710,6 +710,7 @@
- [`ADMIN PAUSE DDL`](/sql-statements/sql-statement-admin-pause-ddl.md)
- [`ADMIN RECOVER INDEX`](/sql-statements/sql-statement-admin-recover.md)
- [`ADMIN RESUME DDL`](/sql-statements/sql-statement-admin-resume-ddl.md)
- [`ADMIN [SET|SHOW|UNSET] BDR ROLE`](/sql-statements/sql-statement-admin-bdr-role.md)
- [`ADMIN SHOW DDL [JOBS|JOB QUERIES]`](/sql-statements/sql-statement-admin-show-ddl.md)
- [`ADMIN SHOW TELEMETRY`](/sql-statements/sql-statement-admin-show-telemetry.md)
- [`ALTER DATABASE`](/sql-statements/sql-statement-alter-database.md)
4 changes: 4 additions & 0 deletions error-codes.md
@@ -526,6 +526,10 @@ TiDB is compatible with the error codes in MySQL, and in most cases returns the

DDL is paused by `ADMIN PAUSE` and cannot be paused again.

* Error Number: 8263

This DDL cannot be executed under the current BDR role. Make sure that the cluster is in [bidirectional replication](/ticdc/ticdc-bidirectional-replication.md). If the cluster is not in bidirectional replication, you can use `ADMIN UNSET BDR ROLE;` to restore the DDL for normal use.

* Error Number: 9001

The complete error message: `ERROR 9001 (HY000): PD Server Timeout`
2 changes: 1 addition & 1 deletion keywords.md
@@ -393,7 +393,6 @@ The following list shows the keywords in TiDB. Reserved keywords are marked with
- LIST
- LOAD (R)
- LOCAL
- LOCAL_ONLY
- LOCALTIME (R)
- LOCALTIMESTAMP (R)
- LOCATION
@@ -732,6 +731,7 @@ The following list shows the keywords in TiDB. Reserved keywords are marked with
- UNIQUE (R)
- UNKNOWN
- UNLOCK (R)
- UNSET
- UNSIGNED (R)
- UNTIL (R)
- UPDATE (R)
88 changes: 88 additions & 0 deletions sql-statements/sql-statement-admin-bdr-role.md
@@ -0,0 +1,88 @@
---
title: ADMIN [SET|SHOW|UNSET] BDR ROLE
summary: An overview of the usage of ADMIN [SET|SHOW|UNSET] BDR ROLE for the TiDB database.
---

# ADMIN [SET|SHOW|UNSET] BDR ROLE

- Use `ADMIN SET BDR ROLE` to set the BDR role of the cluster. Currently, you can set the following BDR roles for a TiDB cluster: `PRIMARY` and `SECONDARY`. For more information about BDR roles, see [DDL Synchronization in TiCDC Bidirectional Replication](/ticdc/ticdc-bidirectional-replication.md#ddl-replication).
- Use `ADMIN SHOW BDR ROLE` to show the BDR role of the cluster.
- Use `ADMIN UNSET BDR ROLE` to cancel the BDR role of the cluster.

> **Warning:**
>
> This feature is experimental. It is not recommended that you use it in the production environment. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub.

## Synopsis

```ebnf+diagram
AdminShowBDRRoleStmt ::=
'ADMIN' 'SHOW' 'BDR' 'ROLE'

AdminSetBDRRoleStmt ::=
'ADMIN' 'SET' 'BDR' 'ROLE' ('PRIMARY' | 'SECONDARY')

AdminUnsetBDRRoleStmt ::=
'ADMIN' 'UNSET' 'BDR' 'ROLE'
```

## Examples

By default, a TiDB cluster has no BDR role. Run the following command to show the BDR role of the cluster.

```sql
ADMIN SHOW BDR ROLE;
```

```sql
+------------+
| BDR_ROLE   |
+------------+
|            |
+------------+
1 row in set (0.01 sec)
```

Run the following command to set the BDR role to `PRIMARY`.

```sql
ADMIN SET BDR ROLE PRIMARY;
```

```sql
Query OK, 0 rows affected (0.01 sec)
```

```sql
ADMIN SHOW BDR ROLE;
+----------+
| BDR_ROLE |
+----------+
| primary  |
+----------+
1 row in set (0.00 sec)
```

Run the following command to cancel the BDR role of the cluster.

```sql
ADMIN UNSET BDR ROLE;
```

```sql
Query OK, 0 rows affected (0.01 sec)
```

```sql
ADMIN SHOW BDR ROLE;
+----------+
| BDR_ROLE |
+----------+
|          |
+----------+
1 row in set (0.01 sec)
```

## MySQL compatibility

This statement is a TiDB extension to MySQL syntax.
138 changes: 104 additions & 34 deletions ticdc/ticdc-bidirectional-replication.md
@@ -5,7 +5,7 @@ summary: Learn how to use bidirectional replication of TiCDC.

# Bidirectional Replication

Starting from v6.5.0, TiCDC supports bi-directional replication among two TiDB clusters. Based on this feature, you can create a multi-active TiDB solution using TiCDC.
Starting from v6.5.0, TiCDC supports bi-directional replication (BDR) between two TiDB clusters. Based on this feature, you can create a multi-active TiDB solution using TiCDC.

This section describes how to use bi-directional replication taking two TiDB clusters as an example.

@@ -34,47 +34,117 @@ TiCDC only replicates incremental data changes that occur after a specified time

After the configuration takes effect, the clusters can perform bi-directional replication.

## Execute DDL

After the bidirectional replication is enabled, TiCDC does not replicate any DDL statements. You need to execute DDL statements in the upstream and downstream clusters respectively.

Note that some DDL statements might cause table structure changes or data change time sequence problems, which might lead to data inconsistency after the replication. Therefore, after enabling bidirectional replication, only the DDL statements in the following table can be executed without stopping the write operations of the application.

| Event | Does it cause changefeed errors | Note |
|---|---|---|
| create database | Yes | After you manually execute the DDL statements in the upstream and downstream clusters, the errors can be automatically recovered. |
| drop database | Yes | You need to manually restart the changefeed and specify `--overwrite-checkpoint-ts` as the `commitTs` of the DDL statement to recover the errors. |
| create table | Yes | After you manually execute the DDL statements in the upstream and downstream clusters, the errors can be automatically recovered. |
| drop table | Yes | You need to manually restart the changefeed and specify `--overwrite-checkpoint-ts` as the `commitTs` of the DDL statement to recover the errors. |
| alter table comment | No | |
| rename index | No | |
| alter table index visibility | No | |
| add partition | Yes | After you manually execute the DDL statements in the upstream and downstream clusters, the errors can be automatically recovered. |
| drop partition | No | |
| create view | No | |
| drop view | No | |
| alter column default value | No | |
| reorganize partition | Yes | After you manually execute the DDL statements in the upstream and downstream clusters, the errors can be automatically recovered. |
| alter table ttl | No | |
| alter table remove ttl | No | |
| add **not unique** index | No | |
| drop **not unique** index | No | |

If you need to execute DDL statements that are not in the preceding table, take the following steps:

1. Pause the write operations in the tables that need to execute DDL in all clusters.
2. After the write operations of the corresponding tables in all clusters have been replicated to other clusters, manually execute all DDL statements in each TiDB cluster.
3. After the DDL statements are executed, resume the write operations.
## DDL types

Starting from v7.6.0, to support DDL replication as much as possible in bi-directional replication, TiDB divides the [DDLs that TiCDC originally supports](/ticdc/ticdc-ddl.md) into two types: replicable DDLs and non-replicable DDLs, according to the impact of DDLs on the business.

### Replicable DDLs

Replicable DDLs are the DDLs that can be directly executed and replicated to other TiDB clusters in bi-directional replication.

Replicable DDLs include:

- `CREATE DATABASE`
- `CREATE TABLE`
- `ADD COLUMN`: the column can be `null`, or has both `not null` and a `default value`
- `ADD NON-UNIQUE INDEX`
- `DROP INDEX`
- `MODIFY COLUMN`: can only modify the `default value` and `comment` of the column
- `ALTER COLUMN DEFAULT VALUE`
- `MODIFY TABLE COMMENT`
- `RENAME INDEX`
- `ADD TABLE PARTITION`
- `DROP PRIMARY KEY`
- `ALTER TABLE INDEX VISIBILITY`
- `ALTER TABLE TTL`
- `ALTER TABLE REMOVE TTL`
- `CREATE VIEW`
- `DROP VIEW`

### Non-replicable DDLs

Non-replicable DDLs are the DDLs that have a great impact on the business, and might cause data inconsistency between clusters. Non-replicable DDLs cannot be directly replicated to other TiDB clusters in bi-directional replication through TiCDC. Non-replicable DDLs must be executed through specific operations.

Non-replicable DDLs include:

- `DROP DATABASE`
- `DROP TABLE`
- `ADD COLUMN`: the column is `not null` and does not have a `default value`
- `DROP COLUMN`
- `ADD UNIQUE INDEX`
- `TRUNCATE TABLE`
- `MODIFY COLUMN`: modify the attributes of the column except `default value` and `comment`
- `RENAME TABLE`
- `DROP PARTITION`
- `TRUNCATE PARTITION`
- `ALTER TABLE CHARACTER SET`
- `ALTER DATABASE CHARACTER SET`
- `RECOVER TABLE`
- `ADD PRIMARY KEY`
- `REBASE AUTO ID`
- `EXCHANGE PARTITION`
- `REORGANIZE PARTITION`

## DDL replication

To control which clusters can execute replicable DDLs and non-replicable DDLs, TiDB introduces the following BDR roles:

- `PRIMARY`: you can execute replicable DDLs, but cannot execute non-replicable DDLs. Replicable DDLs will be replicated to the downstream by TiCDC.
- `SECONDARY`: you cannot execute replicable DDLs or non-replicable DDLs, but can execute the DDLs replicated by TiCDC.

When no BDR role is set, you can execute any DDL. However, after you set `bdr_mode=true` on TiCDC, TiCDC does not replicate the DDLs that you execute.
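
The role rules above can be condensed into a small decision function. The following Python sketch is only a model of the documented behavior, not TiDB code: the names `BDRRole`, `check_user_ddl`, and `ERR_BDR_ROLE` are hypothetical, and it assumes that every rejected DDL surfaces as Error 8263. DDLs applied by TiCDC itself are outside its scope.

```python
from enum import Enum
from typing import Optional


class BDRRole(Enum):
    """The two BDR roles that can be set on a cluster."""
    PRIMARY = "primary"
    SECONDARY = "secondary"


# Error number that this document associates with a DDL rejected by the BDR role.
ERR_BDR_ROLE = 8263


def check_user_ddl(role: Optional[BDRRole], replicable: bool) -> Optional[int]:
    """Return None if a user-issued DDL is allowed, else the error number.

    role=None models a cluster on which no BDR role is set.
    """
    if role is None:
        # No BDR role: any DDL can be executed.
        return None
    if role is BDRRole.PRIMARY and replicable:
        # PRIMARY accepts replicable DDLs only.
        return None
    # PRIMARY with a non-replicable DDL, or SECONDARY with any user-issued DDL.
    return ERR_BDR_ROLE
```

For example, `check_user_ddl(BDRRole.SECONDARY, True)` returns `8263`, while `check_user_ddl(None, False)` returns `None`.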

> **Warning:**
>
> This feature is experimental. It is not recommended that you use it in the production environment. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub.

### Replication scenarios of replicable DDLs

1. Choose a TiDB cluster and execute `ADMIN SET BDR ROLE PRIMARY` to set it as the primary cluster.
2. On other TiDB clusters, execute `ADMIN SET BDR ROLE SECONDARY` to set them as the secondary clusters.
3. Execute **replicable DDLs** on the primary cluster. The successfully executed DDLs will be replicated to the secondary clusters by TiCDC.

> **Note:**
>
> To prevent misoperations:
>
> - If you try to execute **non-replicable DDLs** on the primary cluster, you will get [Error 8263](/error-codes.md).
> - Whether you try to execute **replicable DDLs** or **non-replicable DDLs** on the secondary clusters, you will get [Error 8263](/error-codes.md).

### Replication scenarios of non-replicable DDLs

1. Execute `ADMIN UNSET BDR ROLE` on all TiDB clusters to cancel the BDR role.
2. Stop writing data to the tables that need to execute DDLs in all clusters.
3. Wait until all writes to the corresponding tables in all clusters are replicated to other clusters, and then manually execute all DDLs on each TiDB cluster.
4. Wait until the DDLs are completed, and then resume writing data.
5. Follow the steps in [Replication scenarios of replicable DDLs](#replication-scenarios-of-replicable-ddls) to switch back to the replication scenario of replicable DDLs.

> **Warning:**
>
> Do not execute any other DDLs during this time.
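
The procedure above can be outlined as a script. This is a minimal sketch under stated assumptions: `execute(cluster, sql)` is a hypothetical helper that runs one statement synchronously on one cluster, the three callbacks stand for application-specific logic, and picking the first cluster as the new `PRIMARY` is an arbitrary choice for illustration.

```python
from typing import Callable, Sequence

# Hypothetical helper type: execute(cluster_name, sql) runs one statement
# on one cluster and returns only after the statement has finished.
Execute = Callable[[str, str], None]


def run_non_replicable_ddl(
    clusters: Sequence[str],
    ddls: Sequence[str],
    execute: Execute,
    pause_writes: Callable[[], None],
    wait_until_replicated: Callable[[], None],
    resume_writes: Callable[[], None],
) -> None:
    """Walk through the five documented steps in order."""
    # Step 1: cancel the BDR role on all clusters.
    for cluster in clusters:
        execute(cluster, "ADMIN UNSET BDR ROLE")
    # Step 2: stop writing to the affected tables in all clusters.
    pause_writes()
    # Step 3: wait for in-flight writes to replicate, then run every DDL
    # manually on each cluster.
    wait_until_replicated()
    for cluster in clusters:
        for ddl in ddls:
            execute(cluster, ddl)
    # Step 4: the DDLs have completed (execute is synchronous), so resume writes.
    resume_writes()
    # Step 5: switch back to the replicable-DDL scenario.
    # Assumption: the first cluster in the list becomes the primary.
    execute(clusters[0], "ADMIN SET BDR ROLE PRIMARY")
    for cluster in clusters[1:]:
        execute(cluster, "ADMIN SET BDR ROLE SECONDARY")
```

Passing the side effects in as callables keeps the ordering of the steps visible and lets you dry-run the sequence with recording stubs before pointing it at real clusters.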

## Stop bi-directional replication

After the application has stopped writing data, you can insert a special record into each cluster. By checking the two special records, you can make sure that data in two clusters are consistent.

After the check is completed, you can stop the changefeed to stop bi-directional replication.
After the check is completed, you can stop the changefeed to stop bi-directional replication, and execute `ADMIN UNSET BDR ROLE` on all TiDB clusters.

## Limitations

- For the limitations of DDL, see [Execute DDL](#execute-ddl).
- Use BDR role only in the following scenarios:

- 1 `PRIMARY` cluster and n `SECONDARY` clusters (replication scenarios of replicable DDLs)
- n clusters that have no BDR roles (replication scenarios in which you can manually execute non-replicable DDLs on each cluster)

**Do not set BDR roles in any other combination, for example, having `PRIMARY` clusters, `SECONDARY` clusters, and clusters with no BDR role at the same time. If you set the BDR roles incorrectly, TiDB cannot guarantee data correctness and consistency during data replication.**

- To avoid data conflicts in the replicated tables, it is recommended that you do not use `AUTO_INCREMENT` or `AUTO_RANDOM`. If you need to use `AUTO_INCREMENT` or `AUTO_RANDOM`, set different `auto_increment_increment` and `auto_increment_offset` values for different clusters so that each cluster is assigned different primary keys. For example, if there are three TiDB clusters (A, B, and C) in bi-directional replication, you can set them as follows:

- In Cluster A, set `auto_increment_increment=3` and `auto_increment_offset=2000`
- In Cluster B, set `auto_increment_increment=3` and `auto_increment_offset=2001`
- In Cluster C, set `auto_increment_increment=3` and `auto_increment_offset=2002`

This way, A, B, and C will not conflict with each other in the implicitly assigned `AUTO_INCREMENT` ID and `AUTO_RANDOM` ID. If you need to add a cluster in BDR mode, you need to temporarily stop writing data of the related application, set the appropriate values for `auto_increment_increment` and `auto_increment_offset` on all clusters, and then resume writing data of the related application.

- Bi-directional replication clusters cannot detect write conflicts, which might cause undefined behaviors. Therefore, you must ensure that there are no write conflicts from the application side.
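
To see why the three `auto_increment_increment` and `auto_increment_offset` settings above do not collide, the following Python sketch models only the arithmetic. It assumes that implicitly assigned IDs satisfy `(ID - auto_increment_offset) % auto_increment_increment == 0` and, as a simplification, that allocation starts at the offset. The function names are hypothetical, and real allocation (ID caching, `AUTO_RANDOM` shard bits) is more involved.

```python
from itertools import count, islice
from typing import Dict, Iterator, Tuple


def implicit_ids(increment: int, offset: int) -> Iterator[int]:
    """Yield offset, offset + increment, offset + 2 * increment, ..."""
    return count(start=offset, step=increment)


def conflict_free(settings: Dict[str, Tuple[int, int]]) -> bool:
    """Check that no two clusters can generate the same ID.

    settings maps a cluster name to (increment, offset). This sketch only
    handles the case where all clusters share one increment; with a shared
    increment, the ID sequences are disjoint exactly when the offsets fall
    into different residue classes modulo that increment.
    """
    increments = {inc for inc, _ in settings.values()}
    if len(increments) != 1:
        return False  # conservative answer: mixed increments are not analyzed
    (inc,) = increments
    residues = [off % inc for _, off in settings.values()]
    return len(set(residues)) == len(residues)


settings = {"A": (3, 2000), "B": (3, 2001), "C": (3, 2002)}
first_ids = {
    name: list(islice(implicit_ids(inc, off), 4))
    for name, (inc, off) in settings.items()
}
```

With these settings, `first_ids["A"]` is `[2000, 2003, 2006, 2009]`, and `conflict_free(settings)` is `True`. When a cluster is added, the shared increment must grow to at least the new number of clusters, which is why the values need to be reset on all clusters.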
