[feature](exchange) enable shared exchange sink buffer to reduce RPC concurrency #46764

Open
wants to merge 3 commits into base: branch-3.0

@Mryange
Contributor

Mryange commented Jan 10, 2025

What problem does this PR solve?

#43284
#44850

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

…n problem (apache#43284)

Picked from the 2.1 branch; only the RPC profile-related code was
selected.
apache#39852
apache#40117

```
                      DATA_STREAM_SINK_OPERATOR  (id=2,dst_id=2):
                            -  RpcCount:  sum  16,  avg  4,  max  4,  min  4
                            -  RpcMaxTime:  avg  1.15ms,  max  1.163ms,  min  818.493us

 -  RpcAvgTime:  11.850ms
                          -  RpcCount:  10
                          -  RpcMaxTime:  86.891ms
                          -  RpcMinTime:  15.200ms
                          -  RpcSumTime:  118.503ms
                          -  SerializeBatchTime:  13.517ms
                          -  SplitBlockDistributeByChannelTime:  38.923ms
                          -  SplitBlockHashComputeTime:  2.659ms
                          -  UncompressedRowBatchSize:  135.19  KB
                          -  WaitForDependencyTime:  0ns
                              -  WaitForRpcBufferQueue:  0ns
                        RpcInstanceDetails:
                              -  Instance  85d4f75b72a9ea61:  Count:  4,  MaxTime:  36.238ms,  MinTime:  12.107ms,  AvgTime:  21.722ms,  SumTime:  86.891ms
                              -  Instance  85d4f75b72a9ea91:  Count:  3,  MaxTime:  11.107ms,  MinTime:  2.431ms,  AvgTime:  5.470ms,  SumTime:  16.412ms
                              -  Instance  85d4f75b72a9eac1:  Count:  3,  MaxTime:  7.554ms,  MinTime:  3.160ms,  AvgTime:  5.066ms,  SumTime:  15.200m
```
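
As a rough cross-check of the second profile snippet above, the aggregated counters can be recomputed from the `RpcInstanceDetails` lines. This is only a sketch based on the numbers shown, not on the Doris source: it assumes the truncated `15.200m` is in milliseconds, and the reading that `RpcMaxTime`/`RpcMinTime` are the largest and smallest per-instance `SumTime` is an observation from this one profile, not a confirmed definition.

```python
# Per-instance values copied from the RpcInstanceDetails lines above.
# Assumption: the last SumTime, printed as "15.200m", is in milliseconds.
instances = {
    "85d4f75b72a9ea61": {"count": 4, "sum_ms": 86.891},
    "85d4f75b72a9ea91": {"count": 3, "sum_ms": 16.412},
    "85d4f75b72a9eac1": {"count": 3, "sum_ms": 15.200},
}

# Aggregate the per-instance counters.
rpc_count = sum(i["count"] for i in instances.values())
rpc_sum_ms = round(sum(i["sum_ms"] for i in instances.values()), 3)
rpc_avg_ms = round(rpc_sum_ms / rpc_count, 3)

# In this profile the top-level max/min match the per-instance SumTime values.
rpc_max_ms = max(i["sum_ms"] for i in instances.values())
rpc_min_ms = min(i["sum_ms"] for i in instances.values())

print(f"RpcCount:   {rpc_count}")
print(f"RpcSumTime: {rpc_sum_ms:.3f}ms")
print(f"RpcAvgTime: {rpc_avg_ms:.3f}ms")
print(f"RpcMaxTime: {rpc_max_ms:.3f}ms")
print(f"RpcMinTime: {rpc_min_ms:.3f}ms")
```

With these inputs the recomputed count, sum, and average agree with the `RpcCount`, `RpcSumTime`, and `RpcAvgTime` lines in the profile.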
…concurrency. (apache#44850)

Previously, each exchange sink had its own sink buffer.
If the query concurrency is n, a typical shuffle scenario (in which
each sender instance can send data to all downstream instances) would
have n * n RPCs running concurrently.
This change introduces support for shared sink buffers.
Sharing does not reduce the total number of RPCs, but it can limit the
number of RPCs that run concurrently.

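
The effect described above can be illustrated with a toy model. This is a sketch only and not the Doris implementation: the function `simulate` and its parameters are hypothetical, and it assumes that a sink buffer keeps at most one RPC in flight per downstream instance and that all RPCs take the same time.

```python
from collections import deque


def simulate(num_senders, num_receivers, blocks_per_pair, shared):
    """Return (total_rpcs, peak_concurrent_rpcs) for a toy shuffle.

    Assumption: each sink buffer keeps one queue per receiver and
    allows at most one in-flight RPC per queue.
    """
    queues = {}
    for sender in range(num_senders):
        # With a shared buffer, every sender uses buffer 0.
        buffer_id = 0 if shared else sender
        for receiver in range(num_receivers):
            q = queues.setdefault((buffer_id, receiver), deque())
            q.extend([sender] * blocks_per_pair)

    total = 0
    peak = 0
    # Each round, every non-empty queue sends exactly one RPC;
    # the round ends when all of those RPCs have completed.
    while any(queues.values()):
        in_flight = 0
        for q in queues.values():
            if q:
                q.popleft()
                in_flight += 1
        total += in_flight
        peak = max(peak, in_flight)
    return total, peak


n = 4
per_sink = simulate(n, n, blocks_per_pair=3, shared=False)
shared_buf = simulate(n, n, blocks_per_pair=3, shared=True)
print("per-sink buffers (total, peak):", per_sink)
print("shared buffer    (total, peak):", shared_buf)
```

In this model both configurations send the same total number of RPCs, while the peak number in flight drops from n * n to n when the buffer is shared.
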
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Mryange
Contributor Author

Mryange commented Jan 10, 2025

run buildall

@doris-robot

TPC-H: Total hot run time: 41454 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 7909a478f34fb9a3ff924c3d6d28fceb0e9a4b6f, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17585	7497	7311	7311
q2	2042	194	171	171
q3	10522	1141	1161	1141
q4	10550	793	733	733
q5	7758	2969	2949	2949
q6	241	154	153	153
q7	1013	631	614	614
q8	9383	1959	2027	1959
q9	6674	6467	6406	6406
q10	6973	2313	2382	2313
q11	478	278	267	267
q12	408	215	215	215
q13	17761	2987	2979	2979
q14	243	213	208	208
q15	577	521	522	521
q16	687	615	612	612
q17	989	597	539	539
q18	7384	6836	6779	6779
q19	1418	1137	1149	1137
q20	468	201	205	201
q21	4048	3340	3248	3248
q22	1103	998	1001	998
Total cold run time: 108305 ms
Total hot run time: 41454 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	7244	7227	7216	7216
q2	331	234	226	226
q3	2984	3020	3008	3008
q4	2063	1773	1794	1773
q5	5787	5721	5726	5721
q6	219	136	136	136
q7	2224	1836	1865	1836
q8	3339	3575	3503	3503
q9	8876	8942	8836	8836
q10	3599	3554	3575	3554
q11	600	499	502	499
q12	848	621	614	614
q13	8210	3117	3026	3026
q14	276	253	258	253
q15	561	509	520	509
q16	685	657	673	657
q17	1802	1551	1574	1551
q18	7902	7486	7320	7320
q19	1670	1482	1485	1482
q20	2062	1824	1823	1823
q21	5295	5030	5154	5030
q22	1092	1014	1045	1014
Total cold run time: 67669 ms
Total hot run time: 59587 ms

@doris-robot

TPC-DS: Total hot run time: 191941 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 7909a478f34fb9a3ff924c3d6d28fceb0e9a4b6f, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	972	376	365	365
query2	6539	2105	2064	2064
query3	6699	215	219	215
query4	33871	23532	23537	23532
query5	4355	473	462	462
query6	297	190	182	182
query7	4628	318	317	317
query8	299	235	221	221
query9	9639	2690	2710	2690
query10	464	264	268	264
query11	18004	15309	15181	15181
query12	158	108	102	102
query13	1632	429	417	417
query14	9905	6966	7406	6966
query15	260	171	174	171
query16	8209	471	469	469
query17	1658	558	544	544
query18	2152	305	321	305
query19	373	154	145	145
query20	116	106	105	105
query21	208	102	103	102
query22	4662	4221	4360	4221
query23	34790	34770	33984	33984
query24	11221	2951	2858	2858
query25	714	426	412	412
query26	1459	170	172	170
query27	2856	350	345	345
query28	7870	2458	2441	2441
query29	938	445	445	445
query30	325	164	166	164
query31	1094	801	827	801
query32	101	63	60	60
query33	799	306	315	306
query34	906	518	540	518
query35	880	705	712	705
query36	1111	919	926	919
query37	135	87	74	74
query38	4015	3871	3880	3871
query39	1460	1466	1437	1437
query40	274	104	102	102
query41	55	49	51	49
query42	117	104	97	97
query43	536	495	492	492
query44	1263	806	800	800
query45	185	170	171	170
query46	1165	726	711	711
query47	1975	1829	1859	1829
query48	481	392	394	392
query49	1184	419	413	413
query50	836	409	415	409
query51	7165	7082	7089	7082
query52	103	94	89	89
query53	255	188	187	187
query54	1286	463	470	463
query55	82	79	79	79
query56	271	260	251	251
query57	1202	1081	1091	1081
query58	250	217	229	217
query59	3224	2984	3060	2984
query60	311	277	274	274
query61	136	132	135	132
query62	864	684	688	684
query63	226	189	191	189
query64	5515	778	687	687
query65	3275	3154	3188	3154
query66	1477	320	317	317
query67	16084	15494	15494	15494
query68	4568	567	560	560
query69	437	274	269	269
query70	1169	1068	1080	1068
query71	408	260	258	258
query72	6729	4084	4122	4084
query73	761	350	355	350
query74	10199	8979	9030	8979
query75	3414	2650	2580	2580
query76	2937	1078	1079	1078
query77	472	282	274	274
query78	10544	9660	9506	9506
query79	2416	596	619	596
query80	1065	428	430	428
query81	554	242	242	242
query82	963	125	118	118
query83	232	146	158	146
query84	243	78	84	78
query85	1346	308	300	300
query86	427	297	298	297
query87	4515	4296	4275	4275
query88	4367	2371	2356	2356
query89	406	300	289	289
query90	2031	187	185	185
query91	180	146	151	146
query92	68	55	55	55
query93	1674	557	546	546
query94	931	299	300	299
query95	356	254	256	254
query96	613	279	276	276
query97	3380	3203	3182	3182
query98	218	202	202	202
query99	1530	1333	1287	1287
Total cold run time: 304724 ms
Total hot run time: 191941 ms

@doris-robot

ClickBench: Total hot run time: 33.41 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 7909a478f34fb9a3ff924c3d6d28fceb0e9a4b6f, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.03	0.03	0.03
query2	0.07	0.03	0.04
query3	0.23	0.08	0.07
query4	1.60	0.10	0.11
query5	0.53	0.55	0.51
query6	1.14	0.73	0.74
query7	0.03	0.02	0.02
query8	0.03	0.03	0.04
query9	0.54	0.50	0.51
query10	0.54	0.55	0.57
query11	0.15	0.10	0.10
query12	0.14	0.12	0.11
query13	0.62	0.59	0.61
query14	2.95	3.03	2.96
query15	0.92	0.84	0.83
query16	0.37	0.40	0.38
query17	1.06	1.08	1.08
query18	0.23	0.22	0.23
query19	1.85	1.94	2.01
query20	0.01	0.01	0.01
query21	15.39	0.56	0.60
query22	2.77	2.98	2.42
query23	17.18	0.93	0.80
query24	2.68	1.30	0.57
query25	0.28	0.14	0.15
query26	0.36	0.15	0.14
query27	0.04	0.05	0.05
query28	11.11	1.11	1.08
query29	12.56	3.22	3.20
query30	0.24	0.06	0.06
query31	2.85	0.40	0.39
query32	3.24	0.46	0.45
query33	3.02	3.00	3.04
query34	17.10	4.60	4.47
query35	4.56	4.48	4.54
query36	0.68	0.48	0.48
query37	0.10	0.06	0.06
query38	0.05	0.04	0.04
query39	0.04	0.03	0.02
query40	0.17	0.13	0.13
query41	0.08	0.03	0.03
query42	0.04	0.03	0.02
query43	0.04	0.03	0.03
Total cold run time: 107.62 s
Total hot run time: 33.41 s
