Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resource units (RUs) used should be computable from the output of EXPLAIN and EXPLAIN ANALYZE #47269

Closed
adamf-db opened this issue Sep 25, 2023 · 10 comments
Labels
type/feature-request Categorizes issue or PR as related to a new feature.

Comments

@adamf-db
Copy link

adamf-db commented Sep 25, 2023

Feature Request

Is your feature request related to a problem? Please describe:
I'd like to be know how the RUs reported by EXPLAIN and EXPLAIN ANALYZE where calculated. Right now, it's not possible to use the output of an EXPLAIN ANALYZE to determine why the RUs are what they are.

Describe the feature you'd like:
The output of EXPLAIN ANALYZE... should include the counts for reads, batch reads, and CPU time. Ideally, this would be done for all operators including sub operators. This should also report the base cost values used.

Basically, I want to be able to do the calculations in tikv/pd/client/resource_manager/model.go & config.go but using the output of EXPLAIN ANALYZE (or estimated in EXPLAIN).

Describe alternatives you've considered:
There's not really any other way.

Teachability, Documentation, Adoption, Migration Strategy:
I think it's reasonable to want to understand the cost of a query you issue, and especially when you are trying to map something like QPS-> RUs -> resource groups. It's very hard to do that right now.

Examples of current output from EXPLAIN ANALYZE format = tidb_json with the executeInfo blocks also JSON:

EXPLAIN ANALYZE FORMAT = "tidb_json" select * from trips where trip_id = 5105991
{'accessObject': 'table:trips',
 'actRows': '1',
 'diskInfo': 'N/A',
 'estRows': '1.00',
 'executeInfo': 'time:2.55ms, loops:2, RU:0.329460, Get:{num_rpc:1, '
                'total_time:2.13ms}, total_process_time: 231.5µs, '
                'total_wait_time: 732.9µs, tikv_wall_time: 995.8µs, '
                'scan_detail: {total_process_keys: 1, total_process_keys_size: '
                '150, total_keys: 1, get_snapshot_time: 691.7µs, rocksdb: '
                '{block: {cache_hit_count: 2, read_count: 1, read_byte: 8.19 '
                'KB, read_time: 10.3µs}}}',
 'id': 'Point_Get_1',
 'memoryInfo': 'N/A',
 'operatorInfo': 'handle:11158051',
 'taskType': 'root'}
EXPLAIN ANALYZE FORMAT = "tidb_json" INSERT INTO ten_col (i1 , i2 , i3 , i4 , i5 , i6 , i7 {'actRows': '0',
 'diskInfo': 'N/A',
 'estRows': 'N/A',
 'executeInfo': 'time:71.8µs, loops:1, prepare: 33.3µs, insert:38.4µs, '
                'RU:1.416016',
 'id': 'Insert_1',
 'memoryInfo': '781 Bytes',
 'operatorInfo': 'N/A',
 'taskType': 'root'}

The below is a select that partially hits a composite index:

{'actRows': '1433',
 'diskInfo': 'N/A',
 'estRows': '9647.77',
 'executeInfo': 'time:5.09ms, loops:3, RU:4.476161, Concurrency:5',
 'id': 'Projection_4',
 'memoryInfo': '111.4 KB',
 'operatorInfo': 'bikeshare.adam_trips.bike_number',
 'subOperators': [{'actRows': '1433',
                   'diskInfo': 'N/A',
                   'estRows': '9647.77',
                   'executeInfo': {'cop_task': {'avg': '1.64ms',
                                                'build_task_duration': '7.08µs',
                                                'copr_cache_hit_ratio': '0.00',
                                                'max': '1.76ms',
                                                'max_distsql_concurrency': '1',
                                                'max_proc_keys': '729',
                                                'min': '1.56ms',
                                                'num': '3',
                                                'p95': '1.76ms',
                                                'p95_proc_keys': '729',
                                                'rpc_num': '3',
                                                'rpc_time': '4.87ms',
                                                'tot_proc': '1.07ms',
                                                'tot_wait': '113.2µs'},
                                   'loops': '3',
                                   'time': '5.03ms'},
                   'id': 'IndexReader_6',
                   'memoryInfo': '75.2 KB',
                   'operatorInfo': 'index:IndexRangeScan_5',
                   'subOperators': [{'accessObject': 'table:adam_trips, '
                                                     'index:start_end_bike_number(start_station, '
                                                     'end_station, '
                                                     'bike_number)',
                                     'actRows': '1433',
                                     'diskInfo': 'N/A',
                                     'estRows': '9647.77',
                                     'executeInfo': 'tikv_task:{proc max:0s, '
                                                    'min:0s, avg: 0s, p80:0s, '
                                                    'p95:0s, iters:12, '
                                                    'tasks:3}, scan_detail: '
                                                    '{total_process_keys: '
                                                    '1433, '
                                                    'total_process_keys_size: '
                                                    '220781, total_keys: 1436, '
                                                    'get_snapshot_time: '
                                                    '42.6µs, rocksdb: '
                                                    '{key_skipped_count: 1433, '
                                                    'block: {cache_hit_count: '
                                                    '10}}}',
                                     'id': 'IndexRangeScan_5',
                                     'memoryInfo': 'N/A',
                                     'operatorInfo': 'range:["11th & Kenyon St '
                                                     'NW" "18th & M St '
                                                     'NW","11th & Kenyon St '
                                                     'NW" "18th & M St NW"], '
                                                     'keep order:false',
                                     'taskType': 'cop[tikv]'}],
                   'taskType': 'root'}],
 'taskType': 'root'}
@adamf-db adamf-db added the type/feature-request Categorizes issue or PR as related to a new feature. label Sep 25, 2023
@kevin-xianliu
Copy link

@nolouch PTAL.

@kevin-xianliu
Copy link

We support using EXPLAIN ANALYZE to display RU statistical information: #42080

@adamf-db
Copy link
Author

We support using EXPLAIN ANALYZE to display RU statistical information: #42080

Yep - but you can't figure out how the RUs were computed from the explain plan. It's a black box - this issue is about adding the information needed to a) verify the RU computation, and b) see what factors (reads, batch reads, bytes read, writes, bytes written, CPU ms) make up the RU.

@nolouch
Copy link
Member

nolouch commented Sep 27, 2023

@adamf-db Actually, the main factors are already displayed in executeInfo, two main factors are the total_process_keys_size and rpc_nums. What's your purpose in knowing the factors? only want to know if is it calculated as expected or want to optimize the query.
We can optimize the content of this display. but does the user really need to know the details of the calculation to optimize SQL? We should already have detailed execution information and per query RU.

@adamf-db
Copy link
Author

@nolouch I want to have a good mental model for future queries, and I want to be able to predict the impact of RGs/RU usage on my queries. I also want to just prove that the my QPS -> RU usage -> RG settings make sense.

How do total_process_keys_size and rpc_nums map to the RU calculations in model.go?

For example, given:

 'executeInfo': 'time:2.55ms, loops:2, RU:0.329460, Get:{num_rpc:1, '
                'total_time:2.13ms}, total_process_time: 231.5µs, '
                'total_wait_time: 732.9µs, tikv_wall_time: 995.8µs, '
                'scan_detail: {total_process_keys: 1, total_process_keys_size: '
                '150, total_keys: 1, get_snapshot_time: 691.7µs, rocksdb: '
                '{block: {cache_hit_count: 2, read_count: 1, read_byte: 8.19 '
                'KB, read_time: 10.3µs}}}',
 'id': 'Point_Get_1',

How do we end up with 0.329460 RUs?

@nolouch
Copy link
Member

nolouch commented Sep 27, 2023

@adamf-db If you are using 7.1. x, the calculation is sum BeforeKVRequest() + AfterKVRequest() in model.go, that is :

before : 
     consumption.RRU += float64(kc.ReadBaseCost)  ->   kv.ReadBaseCost * rpc_nums  
after:  
      consumption.RRU += float64(kc.ReadBytesCost) * readBytes  -> kc.ReadBytesCost * total_process_keys_size 
      consumption.RRU += float64(kc.CPUMsCost) * kvCPUMs   -> kc.CPUMsCost * total_process_time 

and according to https://github.com/tikv/pd/blob/aeb259335644d65a97285d7e62b38e7e43c6ddca/client/resource_group/controller/config.go#L58C19-L67, it's a read request, and

    CPUMsCost  = (1/3) 
    ReadBytesCost =  1/ (64*1024) 
    ReadBaseCost  = 0.25

so RU is:

kv.ReadBaseCost * rpc_nums  +  kc.ReadBytesCost * total_process_keys_size  +  kc.CPUMsCost * total_process_time 
= 0.25 *1 + 1/ (64*1024) * 150 + (1/3) *  0.2315
≈ 0.329

@adamf-db
Copy link
Author

adamf-db commented Oct 1, 2023

Thanks!

How does the calculation work for the below plan?

{'actRows': '1433',
 'diskInfo': 'N/A',
 'estRows': '9647.77',
 'executeInfo': 'time:5.09ms, loops:3, RU:4.476161, Concurrency:5',
 'id': 'Projection_4',
 'memoryInfo': '111.4 KB',
 'operatorInfo': 'bikeshare.adam_trips.bike_number',
 'subOperators': [{'actRows': '1433',
                   'diskInfo': 'N/A',
                   'estRows': '9647.77',
                   'executeInfo': {'cop_task': {'avg': '1.64ms',
                                                'build_task_duration': '7.08µs',
                                                'copr_cache_hit_ratio': '0.00',
                                                'max': '1.76ms',
                                                'max_distsql_concurrency': '1',
                                                'max_proc_keys': '729',
                                                'min': '1.56ms',
                                                'num': '3',
                                                'p95': '1.76ms',
                                                'p95_proc_keys': '729',
                                                'rpc_num': '3',
                                                'rpc_time': '4.87ms',
                                                'tot_proc': '1.07ms',
                                                'tot_wait': '113.2µs'},
                                   'loops': '3',
                                   'time': '5.03ms'},
                   'id': 'IndexReader_6',
                   'memoryInfo': '75.2 KB',
                   'operatorInfo': 'index:IndexRangeScan_5',
                   'subOperators': [{'accessObject': 'table:adam_trips, '
                                                     'index:start_end_bike_number(start_station, '
                                                     'end_station, '
                                                     'bike_number)',
                                     'actRows': '1433',
                                     'diskInfo': 'N/A',
                                     'estRows': '9647.77',
                                     'executeInfo': 'tikv_task:{proc max:0s, '
                                                    'min:0s, avg: 0s, p80:0s, '
                                                    'p95:0s, iters:12, '
                                                    'tasks:3}, scan_detail: '
                                                    '{total_process_keys: '
                                                    '1433, '
                                                    'total_process_keys_size: '
                                                    '220781, total_keys: 1436, '
                                                    'get_snapshot_time: '
                                                    '42.6µs, rocksdb: '
                                                    '{key_skipped_count: 1433, '
                                                    'block: {cache_hit_count: '
                                                    '10}}}',
                                     'id': 'IndexRangeScan_5',
                                     'memoryInfo': 'N/A',
                                     'operatorInfo': 'range:["11th & Kenyon St '
                                                     'NW" "18th & M St '
                                                     'NW","11th & Kenyon St '
                                                     'NW" "18th & M St NW"], '
                                                     'keep order:false',
                                     'taskType': 'cop[tikv]'}],
                   'taskType': 'root'}],
 'taskType': 'root'}

@nolouch
Copy link
Member

nolouch commented Oct 4, 2023

@adamf-db Same as the formula(7.1) above:

kv.ReadBaseCost * rpc_nums  +  kc.ReadBytesCost * total_process_keys_size  +  kc.CPUMsCost * total_process_time 
=  0.25 *3 +  1/ (64*1024) * 220781+ (1/3) * 1.07
≈ 4.476

@adamf-db
Copy link
Author

adamf-db commented Oct 5, 2023

Thanks! Opened pingcap/docs#14956 to update the docs with this information.

@adamf-db
Copy link
Author

adamf-db commented Oct 5, 2023

Closing now that there's a PR for the docs to have this information.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/feature-request Categorizes issue or PR as related to a new feature.
Projects
None yet
Development

No branches or pull requests

3 participants