This is an automated cherry-pick of #16192

Signed-off-by: ti-chi-bot <[email protected]>
pingcap · Jan 19, 2024 · d83cc26
1 parent 7249d82
commit d83cc26

Showing 2 changed files with 57 additions and 1 deletion.
diff --git a/ddl-v2.md b/ddl-v2.md
@@ -0,0 +1,56 @@
+---
+title: Use TiDB DDL V2 to Accelerate Table Creation
+summary: Learn the concept, principles, and implementation details of TiDB DDL V2 for accelerating table creation.
+---
+
+# Use TiDB DDL V2 to Accelerate Table Creation
+
+Starting from v7.6.0, the new version V2 of TiDB DDL supports creating tables quickly, which improves the efficiency of bulk table creation.
+
+TiDB uses the online asynchronous schema change algorithm to change the metadata. All DDL jobs are submitted to the `mysql.tidb_ddl_job` table, and the owner node pulls the DDL jobs to execute them. After the owner node has executed every phase of the online DDL algorithm, the DDL job is marked as completed and moved to the `mysql.tidb_ddl_history` table. Therefore, DDL statements can be executed only on the owner node, and DDL throughput cannot scale linearly as nodes are added.
+
+However, for some DDL statements, it is not necessary to strictly follow the online DDL algorithm. For example, the job of a `CREATE TABLE` statement has only two states: `none` and `public`. Therefore, TiDB can simplify the DDL execution process and execute the `CREATE TABLE` statement on a non-owner node to accelerate table creation.
+
+> **Warning:**
+>
+> This feature is experimental. It is not recommended that you use it in production environments. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tidb/issues) on GitHub.
+
+## Compatibility with TiDB tools
+
+- [TiCDC](https://docs.pingcap.com/tidb/stable/ticdc-overview) does not support replicating the tables that are created by TiDB DDL V2.
+
+## Limitation
+
+Currently, TiDB DDL V2 applies only to the [`CREATE TABLE`](/sql-statements/sql-statement-create-table.md) statement, and the statement must not include any foreign key constraints.
+
+## Use TiDB DDL V2
+
+You can enable or disable TiDB DDL V2 by specifying the value of the system variable [`tidb_ddl_version`](/system-variables.md#tidb_ddl_version-new-in-v760).
+
+To enable TiDB DDL V2, set the value of this variable to `2`:
+
+```sql
+SET GLOBAL tidb_ddl_version = 2;
+```
+
+To disable TiDB DDL V2, set the value of this variable to `1`:
+
+```sql
+SET GLOBAL tidb_ddl_version = 1;
+```
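+
+To confirm which DDL version is currently in effect, you can read the variable back. The following is a minimal sketch that uses only standard system-variable queries. It assumes that you run it after one of the `SET GLOBAL` statements above, and the returned value is expected to be `1` or `2` depending on your setting:
+
+```sql
+-- Show the global value of tidb_ddl_version (1 = V1, 2 = V2).
+SHOW GLOBAL VARIABLES LIKE 'tidb_ddl_version';
+
+-- Equivalent query that returns only the value.
+SELECT @@global.tidb_ddl_version;
+```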
+
+## Implementation principle
+
+The detailed implementation principle of TiDB DDL V2 for accelerating table creation is as follows:
+
+1. Create a `CREATE TABLE` job.
+
+   This step is the same as that of the V1 implementation. The corresponding DDL job is generated by parsing the `CREATE TABLE` statement.
+
+2. Execute the `CREATE TABLE` job.
+
+   Unlike in V1, in TiDB DDL V2 the TiDB node that receives the `CREATE TABLE` statement executes it directly, and then persists the table structure to TiKV. At the same time, the `CREATE TABLE` job is marked as completed and inserted into the `mysql.tidb_ddl_history` table.
+
+3. Synchronize the table information.
+
+   TiDB notifies other nodes to synchronize the newly created table structure.
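+
+To observe the result of these steps, you can list recent DDL jobs after creating a table. The following is a sketch rather than a verified procedure: the table name `t_ddl_v2_demo` is hypothetical, and it assumes that jobs executed by TiDB DDL V2 appear in the output of `ADMIN SHOW DDL JOBS` in the same way as V1 jobs:
+
+```sql
+-- Hypothetical table used only for illustration; it has no foreign keys.
+-- Assumes that a database is already selected in the current session.
+CREATE TABLE t_ddl_v2_demo (id BIGINT PRIMARY KEY, name VARCHAR(64));
+
+-- List running DDL jobs and up to 5 of the most recently completed ones.
+ADMIN SHOW DDL JOBS 5;
+```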
diff --git a/migrate-from-parquet-files-to-tidb.md b/migrate-from-parquet-files-to-tidb.md
@@ -41,7 +41,7 @@ Each table in Hive can be exported to parquet files by annotating `STORED AS PAR
     DROP TABLE temp;
     ```
 
-3. The parquet files exported from Hive might not have the `.parquet` suffix and cannot be correctly identified by TiDB Lightning. Therefore, before importing the files, you need to rename the exported files and add the `.parquet` suffix.
+3. The parquet files exported from Hive might not have the `.parquet` suffix and cannot be correctly identified by TiDB Lightning. Therefore, before importing the files, you need to rename the exported files and add the `.parquet` suffix so that the full filename follows a format that TiDB Lightning recognizes, for example, `${db_name}.${table_name}.parquet`. For more information about file types and patterns, see [TiDB Lightning Data Sources](/tidb-lightning/tidb-lightning-data-source.md). You can also match data files by setting correct [customized expressions](/tidb-lightning/tidb-lightning-data-source.md#match-customized-files).
 
 4. Put all the parquet files in a unified directory, for example, `/data/my_datasource/` or `s3://my-bucket/sql-backup`. TiDB Lightning will recursively search for all `.parquet` files in this directory and its subdirectories.