This is an automated cherry-pick of #16192
Signed-off-by: ti-chi-bot <[email protected]>
hfxsd authored and ti-chi-bot committed Jan 19, 2024
1 parent 7249d82 commit d83cc26
Showing 2 changed files with 57 additions and 1 deletion.
56 changes: 56 additions & 0 deletions ddl-v2.md
@@ -0,0 +1,56 @@
---
title: Use TiDB DDL V2 to Accelerate Table Creation
summary: Learn the concept, principles, and implementation details of TiDB DDL V2 for accelerating table creation.
---

# Use TiDB DDL V2 to Accelerate Table Creation

Starting from v7.6.0, TiDB DDL V2 supports fast table creation, which improves the efficiency of creating tables in bulk.

TiDB uses the online asynchronous schema change algorithm to change metadata. All DDL jobs are submitted to the `mysql.tidb_ddl_job` table, and the owner node pulls DDL jobs from this table and executes them. After a DDL job goes through each phase of the online DDL algorithm, it is marked as completed and moved to the `mysql.tidb_ddl_history` table. Therefore, DDL statements can only be executed on the owner node, and DDL execution cannot scale linearly.
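
To observe this flow on a running cluster, the following statements can help. This is a minimal sketch: `ADMIN SHOW DDL` and `ADMIN SHOW DDL JOBS` are existing TiDB statements, but the exact output columns depend on the TiDB version, and how they map to the two system tables mentioned above is an assumption here.

```sql
-- Show the current DDL owner node and any DDL jobs that are running.
ADMIN SHOW DDL;

-- List DDL jobs: pending or running jobs first, then recently completed ones.
ADMIN SHOW DDL JOBS;
```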

However, some DDL statements do not need to strictly follow the online DDL algorithm. For example, a `CREATE TABLE` job has only two states: `none` and `public`. Therefore, TiDB can simplify the DDL execution process and execute the `CREATE TABLE` statement on a non-owner node to accelerate table creation.

> **Warning:**
>
> This feature is experimental. It is not recommended that you use it in production environments. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub.

## Compatibility with TiDB tools

- [TiCDC](https://docs.pingcap.com/tidb/stable/ticdc-overview) does not support replicating the tables that are created by TiDB DDL V2.

## Limitation

Currently, TiDB DDL V2 applies only to the [`CREATE TABLE`](/sql-statements/sql-statement-create-table.md) statement, and the statement must not include any foreign key constraints.

## Use TiDB DDL V2

You can enable or disable TiDB DDL V2 by setting the system variable [`tidb_ddl_version`](/system-variables.md#tidb_ddl_version-new-in-v760).

To enable TiDB DDL V2, set the value of this variable to `2`:

```sql
SET GLOBAL tidb_ddl_version = 2;
```

To disable TiDB DDL V2, set the value of this variable to `1`:

```sql
SET GLOBAL tidb_ddl_version = 1;
```
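
To confirm which version is currently in effect, you can read the variable back. This is a small sketch using standard system-variable syntax. It assumes that `tidb_ddl_version` is readable at the global scope, which the `SET GLOBAL` statements above suggest.

```sql
-- Either statement returns the current global value of tidb_ddl_version.
SHOW GLOBAL VARIABLES LIKE 'tidb_ddl_version';
SELECT @@global.tidb_ddl_version;
```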

## Implementation principle

The detailed implementation principle of TiDB DDL V2 for accelerating table creation is as follows:

1. Create a `CREATE TABLE` job.

    This step is the same as in V1. TiDB generates the corresponding DDL job by parsing the `CREATE TABLE` statement.

2. Execute the `CREATE TABLE` job.

    Unlike V1, in TiDB DDL V2, the TiDB node that receives the `CREATE TABLE` statement executes it directly and then persists the table structure to TiKV. At the same time, the `CREATE TABLE` job is marked as completed and inserted into the `mysql.tidb_ddl_history` table.

3. Synchronize the table information.

    TiDB notifies other nodes to synchronize the newly created table structure.
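
The steps above can be exercised with a short session like the following. This is an illustrative sketch, not a benchmark: the table names `test.t1` and `test.t2` are hypothetical, and whether jobs created by V2 appear in the `ADMIN SHOW DDL JOBS` output exactly like V1 jobs is an assumption based on the description that completed jobs are inserted into `mysql.tidb_ddl_history`.

```sql
-- Enable TiDB DDL V2 (experimental).
SET GLOBAL tidb_ddl_version = 2;

-- Hypothetical tables without foreign key constraints, so they meet the
-- limitation described earlier in this document.
CREATE TABLE test.t1 (id BIGINT PRIMARY KEY, name VARCHAR(64));
CREATE TABLE test.t2 (id BIGINT PRIMARY KEY, created_at DATETIME);

-- Inspect the five most recently completed DDL jobs.
ADMIN SHOW DDL JOBS 5;

-- Switch back to V1 when you finish testing.
SET GLOBAL tidb_ddl_version = 1;
```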
2 changes: 1 addition & 1 deletion migrate-from-parquet-files-to-tidb.md
@@ -41,7 +41,7 @@ Each table in Hive can be exported to parquet files by annotating `STORED AS PAR
DROP TABLE temp;
```

3. The parquet files exported from Hive might not have the `.parquet` suffix and cannot be correctly identified by TiDB Lightning. Therefore, before importing the files, you need to rename the exported files and add the `.parquet` suffix.
3. The parquet files exported from Hive might not have the `.parquet` suffix and cannot be correctly identified by TiDB Lightning. Therefore, before importing the files, you need to rename the exported files so that the full filename has the `.parquet` suffix and follows a format that TiDB Lightning recognizes, for example, `${db_name}.${table_name}.parquet`. For more information about file types and patterns, see [TiDB Lightning Data Sources](/tidb-lightning/tidb-lightning-data-source.md). You can also match data files by configuring [customized expressions](/tidb-lightning/tidb-lightning-data-source.md#match-customized-files).

4. Put all the parquet files in a unified directory, for example, `/data/my_datasource/` or `s3://my-bucket/sql-backup`. TiDB Lightning will recursively search for all `.parquet` files in this directory and its subdirectories.
