tiflash refactor: split use-tiflash into multiple docs (pingcap#9452)
TomShawn authored Jul 8, 2022
1 parent 685f548 commit a6bc923
Showing 24 changed files with 724 additions and 678 deletions.
8 changes: 7 additions & 1 deletion TOC.md
@@ -832,7 +832,13 @@
- [Titan Configuration](/storage-engine/titan-configuration.md)
- TiFlash
- [Overview](/tiflash/tiflash-overview.md)
- [Use TiFlash](/tiflash/use-tiflash.md)
- [Create TiFlash Replicas](/tiflash/create-tiflash-replicas.md)
- [Use TiDB to Read TiFlash Replicas](/tiflash/use-tidb-to-read-tiflash.md)
- [Use TiSpark to Read TiFlash Replicas](/tiflash/use-tispark-to-read-tiflash.md)
- [Use MPP Mode](/tiflash/use-tiflash-mpp-mode.md)
- [Supported Push-down Calculations](/tiflash/tiflash-supported-pushdown-calculations.md)
- [Data Validation](/tiflash/tiflash-data-validation.md)
- [Compatibility](/tiflash/tiflash-compatibility.md)
- [Telemetry](/telemetry.md)
- [Error Codes](/error-codes.md)
- [Table Filter](/table-filter.md)
2 changes: 1 addition & 1 deletion develop/dev-guide-create-table.md
@@ -272,7 +272,7 @@ ALTER TABLE {table_name} SET TIFLASH REPLICA {count};
- `{table_name}`: The table name.
- `{count}`: The number of replicas. If it is 0, the replicas are deleted.

**TiFlash** will then replicate the table. When a query is performed, TiDB automatically selects TiKV (row-based) or TiFlash (column-based) for the query based on cost optimization. Alternatively, you can manually specify whether the query uses a **TiFlash** replica. To learn how to specify it, refer to [Use TiDB to read TiFlash replicas](/tiflash/use-tiflash.md#use-tidb-to-read-tiflash-replicas).
**TiFlash** will then replicate the table. When a query is performed, TiDB automatically selects TiKV (row-based) or TiFlash (column-based) for the query based on cost optimization. Alternatively, you can manually specify whether the query uses a **TiFlash** replica. To learn how to specify it, refer to [Use TiDB to read TiFlash replicas](/tiflash/use-tidb-to-read-tiflash.md).

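As a rough illustration of the workflow described above, the following is a minimal sketch of creating a replica and then checking its status. The `bookshop.books` table name is a placeholder, and the replica status is read from the `information_schema.tiflash_replica` system table:

```sql
-- A minimal sketch, assuming a hypothetical table `bookshop.books`.
ALTER TABLE bookshop.books SET TIFLASH REPLICA 1;

-- Check the replication status: the replica can serve queries once AVAILABLE is 1;
-- PROGRESS reports how much of the data has been replicated so far.
SELECT TABLE_SCHEMA, TABLE_NAME, REPLICA_COUNT, AVAILABLE, PROGRESS
FROM information_schema.tiflash_replica
WHERE TABLE_SCHEMA = 'bookshop' AND TABLE_NAME = 'books';
```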
### An example of using HTAP capabilities

4 changes: 2 additions & 2 deletions develop/dev-guide-hybrid-oltp-and-olap-queries.md
@@ -254,11 +254,11 @@ SELECT * FROM acc;

You can use the `EXPLAIN` statement to check the execution plan of the above SQL statement. If `cop[tiflash]` and `cop[tikv]` appear in the task column at the same time, it means that TiFlash and TiKV are both scheduled to complete this query. Note that TiFlash and TiKV storage engines usually use different TiDB nodes, so the two query types are not affected by each other.

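The sketch below shows one way to produce such a mixed plan for inspection. It reuses the `acc` table from the example above, but the column names are placeholders, and the `READ_FROM_STORAGE` hint is only used here to pin each table reference to a specific engine:

```sql
-- A minimal sketch: read one reference of `acc` from TiKV and the other from TiFlash,
-- then check the `task` column of the plan for cop[tikv] and cop[tiflash] entries.
EXPLAIN SELECT /*+ READ_FROM_STORAGE(TIKV[a1], TIFLASH[a2]) */ a1.id, a2.balance
FROM acc AS a1 JOIN acc AS a2 ON a1.id = a2.id;
```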
For more information about how TiDB chooses to use TiFlash, see [Use TiDB to read TiFlash replicas](/tiflash/use-tiflash.md#use-tidb-to-read-tiflash-replicas)
For more information about how TiDB chooses to use TiFlash, see [Use TiDB to read TiFlash replicas](/tiflash/use-tidb-to-read-tiflash.md)

## Read more

- [Quick Start with HTAP](/quick-start-with-htap.md)
- [Explore HTAP](/explore-htap.md)
- [Window Functions](/functions-and-operators/window-functions.md)
- [Use TiFlash](/tiflash/use-tiflash.md)
- [Use TiFlash](/tiflash/tiflash-overview.md#use-tiflash)
2 changes: 1 addition & 1 deletion develop/dev-guide-sql-development-specification.md
@@ -49,6 +49,6 @@ This document introduces some general development specifications for using SQL.
- Avoid using the `%` prefix for fuzzy prefix queries.
- If the application uses **Multi Statements** to execute SQL, that is, multiple SQLs are joined with semicolons and sent to the client for execution at once, TiDB only returns the result of the first SQL execution.
- When you use expressions, check if the expressions support computing push-down to the storage layer (TiKV or TiFlash). If not, you should expect more memory consumption and even OOM at the TiDB layer. Computation that can be pushed down to the storage layer is as follows (see the sketch after this list):
- [TiFlash supported push-down calculations](/tiflash/use-tiflash.md#supported-push-down-calculations).
- [TiFlash supported push-down calculations](/tiflash/tiflash-supported-pushdown-calculations.md).
- [TiKV - List of Expressions for Pushdown](/functions-and-operators/expressions-pushed-down.md).
- [Predicate push down](/predicate-push-down.md).
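A hedged sketch of how to verify push-down in practice (the table and column names are hypothetical): run `EXPLAIN` and check which task each expression appears under.

```sql
-- A minimal sketch: expressions shown under a cop[tikv] or cop[tiflash] task are
-- evaluated in the storage layer; expressions under a root task are evaluated in TiDB.
EXPLAIN SELECT id FROM t WHERE DATE_FORMAT(created_at, '%Y-%m-%d') = '2022-07-08';
```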
2 changes: 1 addition & 1 deletion explain-mpp.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ summary: Learn about the execution plan information returned by the EXPLAIN stat

# Explain Statements in the MPP Mode

TiDB supports using the [MPP mode](/tiflash/use-tiflash.md#use-the-mpp-mode) to execute queries. In the MPP mode, the TiDB optimizer generates execution plans for MPP. Note that the MPP mode is only available for tables that have replicas on [TiFlash](/tiflash/tiflash-overview.md).
TiDB supports using the [MPP mode](/tiflash/use-tiflash-mpp-mode.md) to execute queries. In the MPP mode, the TiDB optimizer generates execution plans for MPP. Note that the MPP mode is only available for tables that have replicas on [TiFlash](/tiflash/tiflash-overview.md).

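As a quick reference, the session variables that control MPP selection (documented in `system-variables.md`) can be set as sketched below; the table names in the `EXPLAIN` statement are placeholders:

```sql
-- A minimal sketch of switching a session into the MPP mode before running EXPLAIN.
SET @@session.tidb_allow_mpp = 1;    -- let the optimizer consider MPP plans (default)
SET @@session.tidb_enforce_mpp = 1;  -- prefer MPP regardless of cost estimation
EXPLAIN SELECT COUNT(*) FROM t1 JOIN t2 ON t1.id = t2.id;
-- ExchangeSender and ExchangeReceiver operators in the output indicate MPP execution.
```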
The examples in this document are based on the following sample data:

2 changes: 1 addition & 1 deletion explore-htap.md
@@ -63,7 +63,7 @@ Before exploring the features of TiDB HTAP, you need to deploy TiDB and the corr
After TiFlash is deployed, TiKV does not replicate data to TiFlash automatically. You need to manually specify which tables need to be replicated to TiFlash. After that, TiDB creates the corresponding TiFlash replicas.

- If there is no data in the TiDB Cluster, migrate the data to TiDB first. For detailed information, see [data migration](/migration-overview.md).
- If the TiDB cluster already has the replicated data from upstream, after TiFlash is deployed, data replication does not automatically begin. You need to manually specify the tables to be replicated to TiFlash. For detailed information, see [Use TiFlash](/tiflash/use-tiflash.md).
- If the TiDB cluster already has the replicated data from upstream, after TiFlash is deployed, data replication does not automatically begin. You need to manually specify the tables to be replicated to TiFlash. For detailed information, see [Use TiFlash](/tiflash/tiflash-overview.md#use-tiflash).

## Data processing

2 changes: 1 addition & 1 deletion production-deployment-using-tiup.md
@@ -423,7 +423,7 @@ If the output log shows `Up` status, the cluster is running properly.

If you have deployed [TiFlash](/tiflash/tiflash-overview.md) along with the TiDB cluster, see the following documents:

- [Use TiFlash](/tiflash/use-tiflash.md)
- [Use TiFlash](/tiflash/tiflash-overview.md#use-tiflash)
- [Maintain a TiFlash Cluster](/tiflash/maintain-tiflash.md)
- [TiFlash Alert Rules and Solutions](/tiflash/tiflash-alert-rules.md)
- [Troubleshoot TiFlash](/tiflash/troubleshoot-tiflash.md)
6 changes: 3 additions & 3 deletions quick-start-with-htap.md
@@ -18,7 +18,7 @@ Before using TiDB HTAP, you need to have some basic knowledge about [TiKV](/tikv
- Storage engines of HTAP: The row-based storage engine and the columnar storage engine co-exist for HTAP. Both storage engines can replicate data automatically and keep strong consistency. The row-based storage engine optimizes OLTP performance, and the columnar storage engine optimizes OLAP performance.
- Data consistency of HTAP: As a distributed and transactional key-value database, TiKV provides transactional interfaces with ACID compliance, and guarantees data consistency between multiple replicas and high availability with the implementation of the [Raft consensus algorithm](https://raft.github.io/raft.pdf). As a columnar storage extension of TiKV, TiFlash replicates data from TiKV in real time according to the Raft Learner consensus algorithm, which ensures that data is strongly consistent between TiKV and TiFlash.
- Data isolation of HTAP: TiKV and TiFlash can be deployed on different machines as needed to solve the problem of HTAP resource isolation.
- MPP computing engine: [MPP](/tiflash/use-tiflash.md#control-whether-to-select-the-mpp-mode) is a distributed computing framework provided by the TiFlash engine since TiDB 5.0, which allows data exchange between nodes and provides high-performance, high-throughput SQL algorithms. In the MPP mode, the run time of the analytic queries can be significantly reduced.
- MPP computing engine: [MPP](/tiflash/use-tiflash-mpp-mode.md#control-whether-to-select-the-mpp-mode) is a distributed computing framework provided by the TiFlash engine since TiDB 5.0, which allows data exchange between nodes and provides high-performance, high-throughput SQL algorithms. In the MPP mode, the run time of the analytic queries can be significantly reduced.

## Steps

@@ -202,12 +202,12 @@ limit 10;

If the result of the `EXPLAIN` statement shows `ExchangeSender` and `ExchangeReceiver` operators, it indicates that the MPP mode has taken effect.

In addition, you can specify that each part of the entire query is computed using only the TiFlash engine. For detailed information, see [Use TiDB to read TiFlash replicas](/tiflash/use-tiflash.md#use-tidb-to-read-tiflash-replicas).
In addition, you can specify that each part of the entire query is computed using only the TiFlash engine. For detailed information, see [Use TiDB to read TiFlash replicas](/tiflash/use-tidb-to-read-tiflash.md).

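One hedged way to do this at the session level is the `tidb_isolation_read_engines` variable, as sketched below; statement-level hints such as `READ_FROM_STORAGE` are an alternative:

```sql
-- A minimal sketch: restrict the current session to TiDB and TiFlash, so table reads
-- go to TiFlash replicas only. Reset the variable (or open a new session) to revert.
SET @@session.tidb_isolation_read_engines = 'tidb,tiflash';
```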
You can compare query results and query performance of these two methods.

## What's next

- [Architecture of TiDB HTAP](/tiflash/tiflash-overview.md#architecture)
- [Explore HTAP](/explore-htap.md)
- [Use TiFlash](/tiflash/use-tiflash.md#use-tiflash)
- [Use TiFlash](/tiflash/tiflash-overview.md#use-tiflash)
6 changes: 3 additions & 3 deletions quick-start-with-tidb.md
@@ -128,7 +128,7 @@ As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB in

7. Access the Grafana dashboard of TiDB through <http://127.0.0.1:3000>. Both the default username and password are `admin`.

8. (Optional) [Load data to TiFlash](/tiflash/use-tiflash.md) for analysis.
8. (Optional) [Load data to TiFlash](/tiflash/tiflash-overview.md#use-tiflash) for analysis.

9. Clean up the cluster after the test deployment:

@@ -247,7 +247,7 @@ As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB in

7. Access the Grafana dashboard of TiDB through <http://127.0.0.1:3000>. Both the default username and password are `admin`.

8. (Optional) [Load data to TiFlash](/tiflash/use-tiflash.md) for analysis.
8. (Optional) [Load data to TiFlash](/tiflash/tiflash-overview.md#use-tiflash) for analysis.

9. Clean up the cluster after the test deployment:

@@ -511,5 +511,5 @@ Other requirements for the target machine:
- If you're looking for an analytics solution with TiFlash:
- [Use TiFlash](/tiflash/use-tiflash.md)
- [Use TiFlash](/tiflash/tiflash-overview.md#use-tiflash)
- [TiFlash Overview](/tiflash/tiflash-overview.md)
4 changes: 2 additions & 2 deletions releases/release-5.0.0.md
@@ -151,15 +151,15 @@ This feature is introduced in v5.0. To use the feature, enable the system variab

### MPP architecture

[User document](/tiflash/use-tiflash.md)
[User document](/tiflash/use-tiflash-mpp-mode.md)

TiDB introduces the MPP architecture through TiFlash nodes. This architecture allows multiple TiFlash nodes to share the execution workload of large join queries.

When the MPP mode is on, TiDB determines whether to send a query to the MPP engine for computation based on the calculation cost. In the MPP mode, TiDB distributes the computation of table joins to each running TiFlash node by redistributing the join key during data calculation (`Exchange` operation), and thus accelerates the calculation. Furthermore, with the aggregation computing features that TiFlash already supports, TiDB can push down the computation of a query to the TiFlash MPP cluster. Then the distributed environment can help accelerate the entire execution process and dramatically increase the speed of analytic queries.

In the TPC-H 100 benchmark test, TiFlash MPP delivers significantly faster processing than the analytic engines of traditional analytic databases and SQL on Hadoop. With this architecture, you can perform large-scale analytic queries directly on the latest transaction data, with higher performance than traditional offline analytic solutions. According to the benchmark, with the same cluster resources, TiDB 5.0 MPP shows a 2 to 3 times speedup over Greenplum 6.15.0 and Apache Spark 3.1.1, and some queries perform 8 times better.

Currently, the main features that the MPP mode does not support are as follows (For details, refer to [Use TiFlash](/tiflash/use-tiflash.md)):
Currently, the main features that the MPP mode does not support are as follows (For details, refer to [Use TiFlash](/tiflash/use-tiflash-mpp-mode.md)):

+ Table partitioning
+ Window Function
2 changes: 1 addition & 1 deletion releases/release-5.4.0.md
@@ -127,7 +127,7 @@ In v5.4, the key new features or improvements are as follows:
- Improve the efficiency of converting data from the row-based storage format to the column-based storage format when replicating data from TiKV, which brings a 50% improvement in the overall performance of data replication
- Improve TiFlash performance and stability by tuning the default values of some configuration items. In an HTAP hybrid load, the performance of simple queries on a single table improves up to 20%.

User documents: [Supported push-down calculations](/tiflash/use-tiflash.md#supported-push-down-calculations), [Configure the tiflash.toml file](/tiflash/tiflash-configuration.md#configure-the-tiflashtoml-file)
User documents: [Supported push-down calculations](/tiflash/tiflash-supported-pushdown-calculations.md), [Configure the tiflash.toml file](/tiflash/tiflash-configuration.md#configure-the-tiflashtoml-file)

- **Read historical data within a specified time range through a session variable**

8 changes: 4 additions & 4 deletions releases/release-6.0.0-dmr.md
@@ -53,7 +53,7 @@ TiDB v6.0.0 is a DMR, and its version is 6.0.0-DMR.

- Support building TiFlash replicas by databases. To add TiFlash replicas for all tables in a database, you only need to use a single SQL statement, which greatly reduces operation and maintenance costs.

[User document](/tiflash/use-tiflash.md#create-tiflash-replicas-for-databases)
[User document](/tiflash/create-tiflash-replicas.md#create-tiflash-replicas-for-databases)

### Transaction

@@ -121,7 +121,7 @@ TiDB v6.0.0 is a DMR, and its version is 6.0.0-DMR.

In this mode, TiDB can read and compute the data on partitioned tables using the MPP engine of TiFlash, which greatly improves the query performance of partitioned tables.

[User document](/tiflash/use-tiflash.md#access-partitioned-tables-in-the-mpp-mode)
[User document](/tiflash/use-tiflash-mpp-mode.md#access-partitioned-tables-in-the-mpp-mode)

- Improve the computing performance of the MPP engine

@@ -133,7 +133,7 @@ TiDB v6.0.0 is a DMR, and its version is 6.0.0-DMR.
- Date functions: `DAYNAME()`, `DAYOFMONTH()`, `DAYOFWEEK()`, `DAYOFYEAR()`, `LAST_DAY()`, `MONTHNAME()`
- Operators: Anti Left Outer Semi Join, Left Outer Semi Join

[User document](/tiflash/use-tiflash.md#supported-push-down-calculations)
[User document](/tiflash/tiflash-supported-pushdown-calculations.md)

- The elastic thread pool (enabled by default) becomes GA. This feature aims to improve CPU utilization.

@@ -165,7 +165,7 @@ TiDB v6.0.0 is a DMR, and its version is 6.0.0-DMR.

Warning: The newer version of the data format cannot be downgraded in place to versions earlier than v5.4.0. During such a downgrade, you need to delete TiFlash replicas and replicate data again after the downgrade. Alternatively, you can perform a downgrade by referring to [dttool migrate](/tiflash/tiflash-command-line-flags.md#dttool-migrate).

[User document](/tiflash/use-tiflash.md#use-data-validation)
[User document](/tiflash/tiflash-data-validation.md)

- Improve thread utilization

6 changes: 3 additions & 3 deletions releases/release-6.1.0.md
@@ -39,7 +39,7 @@ In 6.1.0, the key new features or improvements are as follows:
* `DENSE_RANK()`
* `ROW_NUMBER()`

[User document](/tiflash/use-tiflash.md#supported-push-down-calculations), [#33072](https://github.com/pingcap/tidb/issues/33072)
[User document](/tiflash/tiflash-supported-pushdown-calculations.md), [#33072](https://github.com/pingcap/tidb/issues/33072)

### Observability

@@ -83,13 +83,13 @@ In 6.1.0, the key new features or improvements are as follows:
* `TO_SECONDS`
* `WEEKOFYEAR`

[User document](/tiflash/use-tiflash.md#supported-push-down-calculations), [#4679](https://github.com/pingcap/tiflash/issues/4679), [#4678](https://github.com/pingcap/tiflash/issues/4678), [#4677](https://github.com/pingcap/tiflash/issues/4677)
[User document](/tiflash/tiflash-supported-pushdown-calculations.md), [#4679](https://github.com/pingcap/tiflash/issues/4679), [#4678](https://github.com/pingcap/tiflash/issues/4678), [#4677](https://github.com/pingcap/tiflash/issues/4677)

* TiFlash supports partitioned tables in dynamic pruning mode.

To enhance performance in OLAP scenarios, dynamic pruning mode is supported for partitioned tables. If your TiDB is upgraded from versions earlier than v6.0.0, it is recommended that you manually update the statistics of existing partitioned tables to maximize performance (not required for new installations or new partitions created after upgrading to v6.1.0).

User documents: [Access partitioned tables in the MPP mode](/tiflash/use-tiflash.md#access-partitioned-tables-in-the-mpp-mode), [Dynamic pruning mode](/partitioned-table.md#dynamic-pruning-mode), [#3873](https://github.com/pingcap/tiflash/issues/3873)
User documents: [Access partitioned tables in the MPP mode](/tiflash/use-tiflash-mpp-mode.md#access-partitioned-tables-in-the-mpp-mode), [Dynamic pruning mode](/partitioned-table.md#dynamic-pruning-mode), [#3873](https://github.com/pingcap/tiflash/issues/3873)

### Stability

2 changes: 1 addition & 1 deletion scale-tidb-using-tiup.md
@@ -347,7 +347,7 @@ Before the node goes down, make sure that the number of remaining nodes in the T
ALTER TABLE <db-name>.<table-name> SET tiflash replica 0;
```
2. Wait for the TiFlash replicas of the related tables to be deleted. [Check the table replication progress](/tiflash/use-tiflash.md#check-replication-progress); the replicas are deleted when the replication information of the related tables is no longer found.
2. Wait for the TiFlash replicas of the related tables to be deleted. [Check the table replication progress](/tiflash/create-tiflash-replicas.md#check-replication-progress); the replicas are deleted when the replication information of the related tables is no longer found, as shown in the sketch below.
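A hedged sketch of that check, with `<db-name>` and `<table-name>` as placeholders for your own identifiers:

```sql
-- A minimal sketch: once this query returns no rows for the table, the TiFlash
-- replicas have been deleted and the scale-in can proceed.
SELECT * FROM information_schema.tiflash_replica
WHERE TABLE_SCHEMA = '<db-name>' AND TABLE_NAME = '<table-name>';
```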
### 2. Perform the scale-in operation
4 changes: 2 additions & 2 deletions system-variables.md
@@ -499,7 +499,7 @@ This variable is an alias for `last_insert_id`.
- `0` or `OFF`, which means that the MPP mode will not be used.
- `1` or `ON`, which means that the optimizer determines whether to use the MPP mode based on the cost estimation (by default).
MPP is a distributed computing framework provided by the TiFlash engine, which allows data exchange between nodes and provides high-performance, high-throughput SQL algorithms. For details about the selection of the MPP mode, refer to [Control whether to select the MPP mode](/tiflash/use-tiflash.md#control-whether-to-select-the-mpp-mode).
MPP is a distributed computing framework provided by the TiFlash engine, which allows data exchange between nodes and provides high-performance, high-throughput SQL algorithms. For details about the selection of the MPP mode, refer to [Control whether to select the MPP mode](/tiflash/use-tiflash-mpp-mode.md#control-whether-to-select-the-mpp-mode).
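A minimal sketch of toggling this variable for the current session, shown only as an illustration (the default already lets the optimizer decide):

```sql
SET @@session.tidb_allow_mpp = ON;   -- optimizer chooses MPP based on cost (default)
SET @@session.tidb_allow_mpp = OFF;  -- never use the MPP mode in this session
```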
### tidb_allow_remove_auto_inc <span class="version-mark">New in v2.1.18 and v3.0.4</span>
@@ -1158,7 +1158,7 @@ Query OK, 0 rows affected (0.09 sec)
- `0` or `OFF`, which means that the MPP mode is not forcibly used (by default).
- `1` or `ON`, which means that the cost estimation is ignored and the MPP mode is forcibly used. Note that this setting only takes effect when `tidb_allow_mpp=true`.
MPP is a distributed computing framework provided by the TiFlash engine, which allows data exchange between nodes and provides high-performance, high-throughput SQL algorithms. For details about the selection of the MPP mode, refer to [Control whether to select the MPP mode](/tiflash/use-tiflash.md#control-whether-to-select-the-mpp-mode).
MPP is a distributed computing framework provided by the TiFlash engine, which allows data exchange between nodes and provides high-performance, high-throughput SQL algorithms. For details about the selection of the MPP mode, refer to [Control whether to select the MPP mode](/tiflash/use-tiflash-mpp-mode.md#control-whether-to-select-the-mpp-mode).
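A minimal sketch of forcing MPP in the current session; note that it has no effect unless `tidb_allow_mpp` is also enabled:

```sql
SET @@session.tidb_allow_mpp = ON;    -- prerequisite for tidb_enforce_mpp
SET @@session.tidb_enforce_mpp = ON;  -- ignore cost estimation and prefer MPP plans
```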
### tidb_evolve_plan_baselines <span class="version-mark">New in v4.0</span>