Update bdr mode && add description related bdr sql #15925

Merged
merged 16 commits into from
Jan 18, 2024
1 change: 1 addition & 0 deletions TOC.md
Expand Up @@ -710,6 +710,7 @@
- [`ADMIN PAUSE DDL`](/sql-statements/sql-statement-admin-pause-ddl.md)
- [`ADMIN RECOVER INDEX`](/sql-statements/sql-statement-admin-recover.md)
- [`ADMIN RESUME DDL`](/sql-statements/sql-statement-admin-resume-ddl.md)
- [`ADMIN [SET|SHOW] BDR ROLE`](/sql-statements/sql-statement-admin-bdr-role.md)
- [`ADMIN SHOW DDL [JOBS|JOB QUERIES]`](/sql-statements/sql-statement-admin-show-ddl.md)
- [`ADMIN SHOW TELEMETRY`](/sql-statements/sql-statement-admin-show-telemetry.md)
- [`ALTER DATABASE`](/sql-statements/sql-statement-alter-database.md)
4 changes: 4 additions & 0 deletions error-codes.md
Expand Up @@ -526,6 +526,10 @@ TiDB is compatible with the error codes in MySQL, and in most cases returns the

DDL is paused by `ADMIN PAUSE` and cannot be paused again.

* Error Number: 8263

This DDL cannot be executed under a specific BDR role. Make sure that the cluster is in [bidirectional replication](/ticdc/ticdc-bidirectional-replication.md). If the cluster is not in bidirectional replication, you can use `ADMIN SET BDR ROLE LOCAL_ONLY;` to restore the DDL for normal use.

* Error Number: 9001

The complete error message: `ERROR 9001 (HY000): PD Server Timeout`
64 changes: 64 additions & 0 deletions sql-statements/sql-statement-admin-bdr-role.md
@@ -0,0 +1,64 @@
---
title: ADMIN [SET|SHOW] BDR ROLE
summary: An overview of the usage of ADMIN [SET|SHOW] BDR ROLE for the TiDB database.
---

# ADMIN [SET|SHOW] BDR ROLE

- Use `ADMIN SET BDR ROLE` to set the BDR role of the cluster. Currently, you can set three BDR roles for a TiDB cluster: `PRIMARY`, `SECONDARY`, and `LOCAL_ONLY` (default). For more information about BDR roles, see [DDL Synchronization in TiCDC Bidirectional Replication](/ticdc/ticdc-bidirectional-replication.md#ddl-replication).
- Use `ADMIN SHOW BDR ROLE` to show the BDR role of the cluster.

> **Warning:**
>
> This feature is experimental. It is not recommended that you use it in the production environment. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub.

## Synopsis

```ebnf+diagram
AdminShowBDRRoleStmt ::=
'ADMIN' 'SHOW' 'BDR' 'ROLE'

AdminSetBDRRoleStmt ::=
'ADMIN' 'SET' 'BDR' 'ROLE' ('PRIMARY' | 'SECONDARY' | 'LOCAL_ONLY')
```

## Examples

The default BDR role of a TiDB cluster is `LOCAL_ONLY`. Run the following command to show the BDR role of the cluster.

```sql
ADMIN SHOW BDR ROLE;
```

```sql
+------------+
| BDR_ROLE |
+------------+
| local_only |
+------------+
1 row in set (0.01 sec)
```

Run the following command to set the BDR role to `PRIMARY`.

```sql
ADMIN SET BDR ROLE PRIMARY;
```

```sql
Query OK, 0 rows affected (0.01 sec)
```

```sql
ADMIN SHOW BDR ROLE;
+----------+
| BDR_ROLE |
+----------+
| primary |
+----------+
1 row in set (0.00 sec)
```
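
As a further sketch based on the synopsis above, the same statements can move a cluster to the `SECONDARY` role and then back to the default `LOCAL_ONLY` role. The output is omitted here because it depends on your environment.

```sql
-- Set the BDR role of the current cluster to SECONDARY.
ADMIN SET BDR ROLE SECONDARY;

-- Check which role is now in effect.
ADMIN SHOW BDR ROLE;

-- Return the cluster to the default role.
ADMIN SET BDR ROLE LOCAL_ONLY;
```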

## MySQL compatibility

This statement is a TiDB extension to MySQL syntax.
14 changes: 12 additions & 2 deletions statement-summary-tables.md
Expand Up @@ -24,7 +24,7 @@ This document details these tables and introduces how to use them to troubleshoo

## `statements_summary`

`statements_summary` is a system table in `information_schema`. `statements_summary` groups the SQL statements by the SQL digest and the plan digest, and provides statistics for each SQL category.
`statements_summary` is a system table in `information_schema`. `statements_summary` groups the SQL statements by the resource group, the SQL digest and the plan digest, and provides statistics for each SQL category.

The "SQL digest" here means the same as used in slow logs, which is a unique identifier calculated through normalized SQL statements. The normalization process ignores constant, blank characters, and is case insensitive. Therefore, statements with consistent syntaxes have the same digest. For example:

Expand Down Expand Up @@ -86,7 +86,8 @@ The following is a sample output of querying `statements_summary`:

> **Note:**
>
> In TiDB, the time unit of fields in statement summary tables is nanosecond (ns), whereas in MySQL the time unit is picosecond (ps).
> - In TiDB, the time unit of fields in statement summary tables is nanosecond (ns), whereas in MySQL the time unit is picosecond (ps).
> - Starting from v7.6.0, for clusters with [resource control](/tidb-resource-control.md) enabled, `statements_summary` will be aggregated by resource group, i.e., the same statements executed in different resource groups will be collected as different records.

## `statements_summary_history`

Expand Down Expand Up @@ -402,6 +403,15 @@ Transaction-related fields:
- `AVG_AFFECTED_ROWS`: The average number of rows affected.
- `PREV_SAMPLE_TEXT`: When the current SQL statement is `COMMIT`, `PREV_SAMPLE_TEXT` is the previous statement to `COMMIT`. In this case, SQL statements are grouped by the digest and `prev_sample_text`. This means that `COMMIT` statements with different `prev_sample_text` are grouped to different rows. When the current SQL statement is not `COMMIT`, the `PREV_SAMPLE_TEXT` field is an empty string.

Fields related to Resource Control:

- `MAX_QUEUED_RC_TIME`: the maximum waiting time for available RU when executing SQL statements.
- `AVG_REQUEST_UNIT_WRITE`: the average number of write RUs consumed by SQL statements.
- `MAX_REQUEST_UNIT_WRITE`: the maximum number of write RUs consumed by SQL statements.
- `AVG_REQUEST_UNIT_READ`: the average number of read RUs consumed by SQL statements.
- `MAX_REQUEST_UNIT_READ`: the maximum number of read RUs consumed by SQL statements.
- `RESOURCE_GROUP`: the resource group bound to SQL statements.
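
The following query is a minimal sketch of how the resource control fields listed above might be used to find the statements that consume the most read RUs in each resource group. It assumes that resource control is enabled and that the `DIGEST_TEXT` and `EXEC_COUNT` columns are available in your version; the `LIMIT` value is arbitrary.

```sql
-- List the statement categories with the highest peak read RU consumption,
-- together with the resource group each category is bound to.
SELECT
    RESOURCE_GROUP,
    DIGEST_TEXT,
    EXEC_COUNT,
    AVG_REQUEST_UNIT_READ,
    MAX_REQUEST_UNIT_READ,
    AVG_REQUEST_UNIT_WRITE,
    MAX_REQUEST_UNIT_WRITE
FROM information_schema.statements_summary
ORDER BY MAX_REQUEST_UNIT_READ DESC
LIMIT 10;
```

Because records are aggregated by resource group, the same `DIGEST_TEXT` can appear in several rows with different `RESOURCE_GROUP` values.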

### `statements_summary_evicted` fields description

- `BEGIN_TIME`: Records the starting time.
131 changes: 98 additions & 33 deletions ticdc/ticdc-bidirectional-replication.md
Expand Up @@ -5,7 +5,7 @@ summary: Learn how to use bidirectional replication of TiCDC.

# Bidirectional Replication

Starting from v6.5.0, TiCDC supports bi-directional replication among two TiDB clusters. Based on this feature, you can create a multi-active TiDB solution using TiCDC.
Starting from v6.5.0, TiCDC supports bi-directional replication (BDR) among two TiDB clusters. Based on this feature, you can create a multi-active TiDB solution using TiCDC.

This section describes how to use bi-directional replication taking two TiDB clusters as an example.

Expand Down Expand Up @@ -34,37 +34,89 @@ TiCDC only replicates incremental data changes that occur after a specified time

After the configuration takes effect, the clusters can perform bi-directional replication.

## Execute DDL

After the bidirectional replication is enabled, TiCDC does not replicate any DDL statements. You need to execute DDL statements in the upstream and downstream clusters respectively.

Note that some DDL statements might cause table structure changes or data change time sequence problems, which might lead to data inconsistency after the replication. Therefore, after enabling bidirectional replication, only the DDL statements in the following table can be executed without stopping the write operations of the application.

| Event | Does it cause changefeed errors | Note |
|---|---|---|
| create database | Yes | After you manually execute the DDL statements in the upstream and downstream clusters, the errors can be automatically recovered. |
| drop database | Yes | You need to manually restart the changefeed and specify `--overwrite-checkpoint-ts` as the `commitTs` of the DDL statement to recover the errors. |
| create table | Yes | After you manually execute the DDL statements in the upstream and downstream clusters, the errors can be automatically recovered. |
| drop table | Yes | You need to manually restart the changefeed and specify `--overwrite-checkpoint-ts` as the `commitTs` of the DDL statement to recover the errors. |
| alter table comment | No | |
| rename index | No | |
| alter table index visibility | No | |
| add partition | Yes | After you manually execute the DDL statements in the upstream and downstream clusters, the errors can be automatically recovered. |
| drop partition | No | |
| create view | No | |
| drop view | No | |
| alter column default value | No | |
| reorganize partition | Yes | After you manually execute the DDL statements in the upstream and downstream clusters, the errors can be automatically recovered. |
| alter table ttl | No | |
| alter table remove ttl | No | |
| add **not unique** index | No | |
| drop **not unique** index | No | |

If you need to execute DDL statements that are not in the preceding table, take the following steps:

1. Pause the write operations in the tables that need to execute DDL in all clusters.
2. After the write operations of the corresponding tables in all clusters have been replicated to other clusters, manually execute all DDL statements in each TiDB cluster.
3. After the DDL statements are executed, resume the write operations.
## DDL types

Starting from v7.6.0, to support DDL replication as much as possible in bi-directional replication, TiDB divides the [DDLs that TiCDC originally supports](/ticdc/ticdc-ddl.md) into two types: replicable DDLs and non-replicable DDLs, according to the impact of DDLs on the business.

### Replicable DDLs

Replicable DDLs are the DDLs that can be directly executed and replicated to other TiDB clusters in bi-directional replication.

Replicable DDLs include:

- `CREATE DATABASE`
- `CREATE TABLE`
- `ADD COLUMN`: the column must be `not null` or have a `default value`
- `ADD NON-UNIQUE INDEX`
- `DROP INDEX`
- `MODIFY COLUMN`: can only modify the `default value` and `comment` of the column
- `ALTER COLUMN DEFAULT VALUE`
- `MODIFY TABLE COMMENT`
- `RENAME INDEX`
- `ADD TABLE PARTITION`
- `DROP PRIMARY KEY`
- `ALTER TABLE INDEX VISIBILITY`
- `ALTER TABLE TTL`
- `ALTER TABLE REMOVE TTL`
- `CREATE VIEW`
- `DROP VIEW`

### Non-replicable DDLs

Non-replicable DDLs are the DDLs that have a great impact on the business and cannot be directly replicated to other TiDB clusters in bi-directional replication through TiCDC. Non-replicable DDLs must be executed through specific operations.

Non-replicable DDLs include:

- `DROP DATABASE`
- `DROP TABLE`
- `ADD COLUMN`: the column is `null` and does not have a `default value`
- `DROP COLUMN`
- `ADD UNIQUE INDEX`
- `TRUNCATE TABLE`
- `MODIFY COLUMN`: modify the attributes of the column except `default value` and `comment`
- `RENAME TABLE`
- `DROP PARTITION`
- `TRUNCATE PARTITION`
- `ALTER TABLE CHARACTER SET`
- `ALTER DATABASE CHARACTER SET`
- `RECOVER TABLE`
- `ADD PRIMARY KEY`
- `REBASE AUTO ID`
- `EXCHANGE PARTITION`
- `REORGANIZE PARTITION`

## DDL replication

To solve the problem of replicable DDLs and non-replicable DDLs, TiDB introduces three BDR roles:

- `LOCAL_ONLY` (default): you can execute any DDL, but the DDLs executed after you enable `bdr_mode=true` in TiCDC will not be replicated by TiCDC.
- `PRIMARY`: you can execute replicable DDLs, but cannot execute non-replicable DDLs. Replicable DDLs will be replicated to the downstream by TiCDC.
- `SECONDARY`: you cannot execute replicable DDLs or non-replicable DDLs, but can execute the DDLs replicated by TiCDC.

> **Warning:**
>
> This feature is experimental. It is not recommended that you use it in the production environment. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub.

### Replication scenarios of replicable DDLs

1. Choose a TiDB cluster and execute `ADMIN SET BDR ROLE PRIMARY` to set it as the primary cluster.
2. On other TiDB clusters, execute `ADMIN SET BDR ROLE SECONDARY` to set them as the secondary clusters.
3. Execute **replicable DDLs** on the primary cluster. The successfully executed DDLs will be replicated to the secondary clusters by TiCDC.
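
The steps above can be sketched as follows. This is an illustrative example only: the table `test.t1`, the column `c1`, and the index name `idx_c1` are hypothetical, and adding a non-unique index is used because it appears in the list of replicable DDLs.

```sql
-- Step 1: on the cluster chosen as the primary cluster.
ADMIN SET BDR ROLE PRIMARY;

-- Step 2: on every other cluster in the bi-directional replication.
ADMIN SET BDR ROLE SECONDARY;

-- Step 3: on the primary cluster, execute a replicable DDL.
-- After it succeeds, TiCDC replicates it to the secondary clusters.
ALTER TABLE test.t1 ADD INDEX idx_c1 (c1);
```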

> **Note:**
>
> To prevent misoperations:
>
> - If you try to execute **non-replicable DDLs** on the primary cluster, you will get [Error 8263](/error-codes.md).
> - Regardless of whether you try to execute **replicable DDLs** or **non-replicable DDLs** on the secondary clusters, you will get [Error 8263](/error-codes.md).

### Replication scenarios of non-replicable DDLs

1. Execute `ADMIN SET BDR ROLE LOCAL_ONLY` on all TiDB clusters to set their BDR role to `LOCAL_ONLY` (default).
2. Stop writing data to the tables that need to execute DDLs in all clusters.
3. Wait until all writes to the corresponding tables in all clusters are replicated to other clusters, and then manually execute all DDLs on each TiDB cluster.
4. Wait until the DDLs are completed, and then resume writing data.
5. Follow the steps in [Replication scenarios of replicable DDLs](#replication-scenarios-of-replicable-ddls) to switch back to the replication scenario of replicable DDLs.
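
The steps above can be sketched as follows. This is an illustrative example only: the table `test.t1` and the column `c2` are hypothetical, and `DROP COLUMN` is used because it appears in the list of non-replicable DDLs. Stopping and resuming application writes happens outside SQL and is shown only as comments.

```sql
-- Step 1: on every cluster, switch to the default role.
ADMIN SET BDR ROLE LOCAL_ONLY;

-- Step 2: stop application writes to test.t1 in all clusters.
-- Step 3: after the remaining writes are replicated, run the DDL
--         manually on each cluster.
ALTER TABLE test.t1 DROP COLUMN c2;

-- Step 4: after the DDL completes on every cluster, resume writes.

-- Step 5: switch back to the replicable DDL scenario.
-- On the primary cluster:
ADMIN SET BDR ROLE PRIMARY;
-- On every other cluster:
ADMIN SET BDR ROLE SECONDARY;
```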

## Stop bi-directional replication

Expand All @@ -74,7 +126,20 @@ After the check is completed, you can stop the changefeed to stop bi-directional

## Limitations

- For the limitations of DDL, see [Execute DDL](#execute-ddl).
- Use BDR role only in the following scenarios:

- 1 `PRIMARY` cluster and n `SECONDARY` clusters (replication scenarios of replicable DDLs)
- n `LOCAL_ONLY` clusters (replication scenarios of non-replicable DDLs)

**Do not use BDR roles in any other combination, for example, a mix of `PRIMARY`, `SECONDARY`, and `LOCAL_ONLY` clusters at the same time. If you set the BDR role incorrectly, TiDB cannot guarantee data correctness.**

- In general, avoid using `AUTO_INCREMENT` or `AUTO_RANDOM` in the replicated tables to prevent data conflicts. If you need to use `AUTO_INCREMENT` or `AUTO_RANDOM`, set different `auto_increment_increment` and `auto_increment_offset` values for different clusters so that each cluster is assigned different primary keys. For example, if there are three TiDB clusters (A, B, and C) in bi-directional replication, you can set them as follows:

- In A, set `auto_increment_increment=3` and `auto_increment_offset=2000`
- In B, set `auto_increment_increment=3` and `auto_increment_offset=2001`
- In C, set `auto_increment_increment=3` and `auto_increment_offset=2002`

This way, A, B, and C will not conflict with each other in the implicitly assigned `AUTO_INCREMENT` ID and `AUTO_RANDOM` ID. If you need to add a cluster in BDR mode, you need to temporarily stop writing data of the related application, set the appropriate values for `auto_increment_increment` and `auto_increment_offset` on all clusters, and then resume writing data of the related application.

- Bi-directional replication clusters cannot detect write conflicts, which might cause undefined behaviors. Therefore, you must ensure that there are no write conflicts from the application side.
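
The `auto_increment_increment` and `auto_increment_offset` settings described in the limitations above can be sketched as follows for the three-cluster example. This assumes you set the variables at the global scope; a global change applies only to sessions created afterwards, so existing connections need to reconnect.

```sql
-- On cluster A:
SET GLOBAL auto_increment_increment = 3;
SET GLOBAL auto_increment_offset = 2000;

-- On cluster B:
SET GLOBAL auto_increment_increment = 3;
SET GLOBAL auto_increment_offset = 2001;

-- On cluster C:
SET GLOBAL auto_increment_increment = 3;
SET GLOBAL auto_increment_offset = 2002;
```

With an increment of 3 and offsets 2000, 2001, and 2002, the starting points of the three clusters fall into different residue classes modulo 3, which is what keeps the implicitly assigned IDs from colliding.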
