Commit

Apply suggestions from code review
lilin90 authored Nov 1, 2024
1 parent af2ec0a commit 69a9adc
Showing 1 changed file with 7 additions and 7 deletions.
14 changes: 7 additions & 7 deletions tiflash/tiflash-mintso-scheduler.md
@@ -5,23 +5,23 @@ summary: Learn the implementation principles of the TiFlash MinTSO Scheduler.

# TiFlash MinTSO Scheduler

The TiFlash MinTSO Scheduler is a distributed scheduler for [MPP](/glossary.md#mpp) Tasks in TiFlash. This article introduces the implementation principles of the TiFlash MinTSO Scheduler.
The TiFlash MinTSO scheduler is a distributed scheduler for [MPP](/glossary.md#mpp) tasks in TiFlash. This document describes the implementation principles of the TiFlash MinTSO scheduler.

## Background

When processing MPP queries, TiDB splits the query into one or more MPP Tasks and sends these MPP Tasks to the corresponding TiFlash nodes for compilation and execution. Before TiFlash used the [pipeline execution model](/tiflash/tiflash-pipeline-model.md), TiFlash needed to use several threads to execute each MPP Task, with the specific number of threads depending on the complexity of the MPP Task and the concurrency parameters set in TiFlash.
When processing MPP queries, TiDB splits the query into one or more MPP tasks and sends these MPP tasks to the corresponding TiFlash nodes for compilation and execution. Before TiFlash uses the [pipeline execution model](/tiflash/tiflash-pipeline-model.md), TiFlash needs to use several threads to execute each MPP task, with the specific number of threads depending on the complexity of the MPP task and the concurrency parameters set in TiFlash.

In high concurrency scenarios, TiFlash nodes receive multiple MPP Tasks simultaneously. If the execution of MPP Tasks is not controlled, the number of threads TiFlash needs to request from the system will increase linearly with the number of MPP Tasks. Too many threads will affect TiFlash's execution efficiency, and since the operating system itself supports a limited number of threads, TiFlash will encounter errors when it requests more threads than the operating system can provide.
In high concurrency scenarios, TiFlash nodes receive multiple MPP tasks simultaneously. If the execution of MPP tasks is not controlled, the number of threads that TiFlash needs to request from the system will increase linearly along with the increasing number of MPP tasks. Too many threads will affect TiFlash's execution efficiency, and since the operating system itself supports a limited number of threads, TiFlash will encounter errors when it requests more threads than the operating system can provide.

Check warning on line 14 in tiflash/tiflash-mintso-scheduler.md

GitHub Actions / vale

[vale] reported by reviewdog 🐶 [PingCAP.Ambiguous] Consider using a clearer word than 'many' because it may cause confusion. (line 14, column 280, severity: INFO)

To improve TiFlash's processing capability in high concurrency scenarios, an MPP Task scheduler needs to be introduced in TiFlash.
To improve TiFlash's processing capability in high concurrency scenarios, an MPP task scheduler needs to be introduced into TiFlash.

## Implementation Principles
## Implementation principles

As mentioned in the background, the initial purpose of introducing the TiFlash Task Scheduler is to control the number of threads used during MPP query execution. A simple scheduling strategy is to specify the maximum number of threads TiFlash can request. For each MPP Task, the scheduler decides whether the MPP Task can be scheduled based on the current number of threads used by the system and the expected number of threads the MPP Task will use:
As mentioned in the [background](#background), the initial purpose of introducing the TiFlash task scheduler is to control the number of threads used during MPP query execution. A simple scheduling strategy is to specify the maximum number of threads TiFlash can request. For each MPP task, the scheduler decides whether the MPP task can be scheduled based on the current number of threads used by the system and the expected number of threads the MPP task will use:

![TiFlash MinTSO Scheduler v1](/media/tiflash/tiflash_mintso_v1.png)
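
The simple thread-limit strategy described above can be sketched roughly as follows. This is only an illustrative model, not TiFlash's actual implementation: the names `MPPTask`, `ThreadLimitScheduler`, `try_schedule`, and `finish` are hypothetical, and the sketch assumes each task reports its expected thread count up front and that waiting tasks are retried in arrival order when a task finishes.

```python
from dataclasses import dataclass


@dataclass
class MPPTask:
    # Hypothetical task record: an identifier and the number of
    # threads the task is expected to use once it runs.
    task_id: str
    expected_threads: int


class ThreadLimitScheduler:
    """Illustrative scheduler that caps the total number of threads in use."""

    def __init__(self, max_threads: int):
        self.max_threads = max_threads  # upper bound on threads (assumed configurable)
        self.used_threads = 0           # threads held by currently running tasks
        self.waiting = []               # tasks that could not be scheduled yet

    def _fits(self, task: MPPTask) -> bool:
        # A task can run only if its expected threads fit under the limit.
        return self.used_threads + task.expected_threads <= self.max_threads

    def try_schedule(self, task: MPPTask) -> bool:
        """Run the task if it fits; otherwise put it in the waiting list."""
        if self._fits(task):
            self.used_threads += task.expected_threads
            return True
        self.waiting.append(task)
        return False

    def finish(self, task: MPPTask) -> list:
        """Release the task's threads and start any waiting tasks that now fit."""
        self.used_threads -= task.expected_threads
        started, still_waiting = [], []
        for waiting_task in self.waiting:
            if self._fits(waiting_task):
                self.used_threads += waiting_task.expected_threads
                started.append(waiting_task)
            else:
                still_waiting.append(waiting_task)
        self.waiting = still_waiting
        return started
```

Note that this sketch treats every task as independent. It does not account for dependencies between MPP tasks, which is the limitation discussed in the next paragraph of the document.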

Although the above scheduling strategy can effectively control the number of system threads, MPP Tasks are not the smallest independent execution units, and there are dependencies between different MPP Tasks:
Although the preceding scheduling strategy can effectively control the number of system threads, an MPP task is not the smallest independent execution unit, and dependencies exist between different MPP tasks:

```sql
EXPLAIN SELECT count(*) FROM t0 a JOIN t0 b ON a.id = b.id;