From b2696686b3363f3d48723888377ded6b498afa16 Mon Sep 17 00:00:00 2001 From: xixirangrang Date: Fri, 19 Jan 2024 14:46:48 +0800 Subject: [PATCH 1/2] This is an automated cherry-pick of #16192 Signed-off-by: ti-chi-bot --- ddl-v2.md | 56 +++++++++++++++++++++++++++ migrate-from-parquet-files-to-tidb.md | 2 +- 2 files changed, 57 insertions(+), 1 deletion(-) create mode 100644 ddl-v2.md diff --git a/ddl-v2.md b/ddl-v2.md new file mode 100644 index 0000000000000..59db024d00c94 --- /dev/null +++ b/ddl-v2.md @@ -0,0 +1,56 @@ +--- +title: Use TiDB DDL V2 to Accelerate Table Creation +summary: Learn the concept, principles, and implementation details of TiDB DDL V2 for acceleration table creation. +--- + +# Use TiDB DDL V2 to Accelerate Table Creation + +Starting from v7.6.0, the new version V2 of TiDB DDL supports creating tables quickly, which improves the efficiency of bulk table creation. + +TiDB uses the online asynchronous schema change algorithm to change the metadata. All DDL jobs are submitted to the `mysql.tidb_ddl_job` table, and the owner node pulls the DDL job to execute. After executing each phase of the online DDL algorithm, the DDL job is marked as completed and moved to the `mysql.tidb_ddl_history` table. Therefore, DDL statements can only be executed on the owner node and cannot be linearly extended. + +However, for some DDL statements, it is not necessary to strictly follow the online DDL algorithm. For example, the `CREATE TABLE` statement only has two states for the job: `none` and `public`. Therefore, TiDB can simplify the execution process of DDL, and executes the `CREATE TABLE` statement on a non-owner node to accelerate table creation. + +> **Warning:** +> +> This feature is currently an experimental feature and it is not recommended to use in a production environment. This feature might change or be removed without prior notice. If you find a bug, please give feedback by raising an [issue](https://github.com/pingcap/tidb/issues) on GitHub. 
+ +## Compatibility with TiDB tools + +- [TiCDC](https://docs.pingcap.com/tidb/stable/ticdc-overview) does not support replicating the tables that are created by TiDB DDL V2. + +## Limitation + +You can now use TiDB DDL V2 only in the [`CREATE TABLE`](/sql-statements/sql-statement-create-table.md) statement, and this statement must not include any foreign key constraints. + +## Use TiDB DDL V2 + +You can enable or disable TiDB DDL V2 by specifying the value of the system variable [`tidb_ddl_version`](/system-variables.md#tidb_ddl_version-new-in-v760) . + +To enable TiDB DDL V2, set the value of this variable to `2`: + +```sql +SET GLOBAL tidb_ddl_version = 2; +``` + +To disable TiDB DDL V2, set the value of this variable to `1`: + +```sql +SET GLOBAL tidb_ddl_version = 1; +``` + +## Implementation principle + +The detailed implementation principle of TiDB DDL V2 for accelerating table creation is as follows: + +1. Create a `CREATE TABLE` Job. + + This step is the same as that of the V1 implementation. The corresponding DDL Job is generated by parsing the `CREATE TABLE` statement. + +2. Execute the `CREATE TABLE` job. + + Different from V1, in TiDB DDL V2, the TiDB node that receives the `CREATE TABLE` statement executes it directly, and then persists the table structure to TiKV. At the same time, the `CREATE TABLE` job is marked as completed and inserted into the `mysql.tidb_ddl_history` table. + +3. Synchronize the table information. + + TiDB notifies other nodes to synchronize the newly created table structure. diff --git a/migrate-from-parquet-files-to-tidb.md b/migrate-from-parquet-files-to-tidb.md index 0ca90cb019469..ef37bb9d10fd1 100644 --- a/migrate-from-parquet-files-to-tidb.md +++ b/migrate-from-parquet-files-to-tidb.md @@ -41,7 +41,7 @@ Each table in Hive can be exported to parquet files by annotating `STORED AS PAR DROP TABLE temp; ``` -3. 
The parquet files exported from Hive might not have the `.parquet` suffix and cannot be correctly identified by TiDB Lightning. Therefore, before importing the files, you need to rename the exported files and add the `.parquet` suffix. +3. The parquet files exported from Hive might not have the `.parquet` suffix and cannot be correctly identified by TiDB Lightning. Therefore, before importing the files, you need to rename the exported files and add the `.parquet` suffix to change the full filename to a format that TiDB Lightning recognizes, for example, `${db_name}.${table_name}.parquet`. For more information about file types and patterns, see [TiDB Lightning Data Sources](/tidb-lightning/tidb-lightning-data-source.md). You can also match data files by setting correct [customized expressions](/tidb-lightning/tidb-lightning-data-source.md#match-customized-files). 4. Put all the parquet files in a unified directory, for example, `/data/my_datasource/` or `s3://my-bucket/sql-backup`. TiDB Lightning will recursively search for all `.parquet` files in this directory and its subdirectories. From 4c1097fd0336b53baba3f5d24319ed612647dc19 Mon Sep 17 00:00:00 2001 From: xixirangrang <35301108+hfxsd@users.noreply.github.com> Date: Fri, 19 Jan 2024 14:50:10 +0800 Subject: [PATCH 2/2] Delete ddl-v2.md --- ddl-v2.md | 56 ------------------------------------------------------- 1 file changed, 56 deletions(-) delete mode 100644 ddl-v2.md diff --git a/ddl-v2.md b/ddl-v2.md deleted file mode 100644 index 59db024d00c94..0000000000000 --- a/ddl-v2.md +++ /dev/null @@ -1,56 +0,0 @@ ---- -title: Use TiDB DDL V2 to Accelerate Table Creation -summary: Learn the concept, principles, and implementation details of TiDB DDL V2 for acceleration table creation. ---- - -# Use TiDB DDL V2 to Accelerate Table Creation - -Starting from v7.6.0, the new version V2 of TiDB DDL supports creating tables quickly, which improves the efficiency of bulk table creation. 
- -TiDB uses the online asynchronous schema change algorithm to change the metadata. All DDL jobs are submitted to the `mysql.tidb_ddl_job` table, and the owner node pulls the DDL job to execute. After executing each phase of the online DDL algorithm, the DDL job is marked as completed and moved to the `mysql.tidb_ddl_history` table. Therefore, DDL statements can only be executed on the owner node and cannot be linearly extended. - -However, for some DDL statements, it is not necessary to strictly follow the online DDL algorithm. For example, the `CREATE TABLE` statement only has two states for the job: `none` and `public`. Therefore, TiDB can simplify the execution process of DDL, and executes the `CREATE TABLE` statement on a non-owner node to accelerate table creation. - -> **Warning:** -> -> This feature is currently an experimental feature and it is not recommended to use in a production environment. This feature might change or be removed without prior notice. If you find a bug, please give feedback by raising an [issue](https://github.com/pingcap/tidb/issues) on GitHub. - -## Compatibility with TiDB tools - -- [TiCDC](https://docs.pingcap.com/tidb/stable/ticdc-overview) does not support replicating the tables that are created by TiDB DDL V2. - -## Limitation - -You can now use TiDB DDL V2 only in the [`CREATE TABLE`](/sql-statements/sql-statement-create-table.md) statement, and this statement must not include any foreign key constraints. - -## Use TiDB DDL V2 - -You can enable or disable TiDB DDL V2 by specifying the value of the system variable [`tidb_ddl_version`](/system-variables.md#tidb_ddl_version-new-in-v760) . 
- -To enable TiDB DDL V2, set the value of this variable to `2`: - -```sql -SET GLOBAL tidb_ddl_version = 2; -``` - -To disable TiDB DDL V2, set the value of this variable to `1`: - -```sql -SET GLOBAL tidb_ddl_version = 1; -``` - -## Implementation principle - -The detailed implementation principle of TiDB DDL V2 for accelerating table creation is as follows: - -1. Create a `CREATE TABLE` Job. - - This step is the same as that of the V1 implementation. The corresponding DDL Job is generated by parsing the `CREATE TABLE` statement. - -2. Execute the `CREATE TABLE` job. - - Different from V1, in TiDB DDL V2, the TiDB node that receives the `CREATE TABLE` statement executes it directly, and then persists the table structure to TiKV. At the same time, the `CREATE TABLE` job is marked as completed and inserted into the `mysql.tidb_ddl_history` table. - -3. Synchronize the table information. - - TiDB notifies other nodes to synchronize the newly created table structure.