[Enhancement] support phased schedule (backport #47868) #51033
Conversation
Signed-off-by: stdpain <[email protected]>
(cherry picked from commit d7ad29e)
# Conflicts:
#	be/src/exec/pipeline/scan/olap_scan_context.cpp
#	fe/fe-core/src/main/java/com/starrocks/qe/QeProcessorImpl.java
Cherry-pick of d7ad29e has failed:
To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
@mergify[bot]: Backport conflict, please resolve the conflict and resubmit the PR
Signed-off-by: silverbullet233 <[email protected]>
5b1c954 to 4d575ab (Compare)
Quality Gate passed
Why I'm doing:
For particularly complex queries, the tiered schedule approach deploys everything at once, instantly consuming a lot of CPU and memory and easily leading to OOM.
For example, a query that is a union all of 100 AGG fragments, where each AGG needs about 100 MB of memory, requires at least 10 GB of memory for the whole query.
If 10 such queries are sent down to the BE concurrently, the BE's CPU becomes overloaded and performance degrades instead of improving. If we limit the number of fragments sent down at a time to N, memory usage drops substantially.
What I'm doing:
To solve the above problem, we support a fixed phased scheduler:
Tiered: schedule A, then schedule B, C, D.
Phased: schedule A, then schedule C; when C finishes, schedule B, and so on.
In this example the phased max scheduled concurrency is 1.
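To make the idea concrete, below is a minimal, illustrative sketch of a concurrency-limited (phased) fragment deployer. This is not the actual StarRocks FE/BE implementation: the class and method names (PhasedScheduleSketch, onFragmentFinished, deploy) are hypothetical, and the real scheduler works on the plan fragment tree and BE RPCs rather than a flat queue.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Illustrative sketch only: names and structure are hypothetical, not StarRocks code.
public class PhasedScheduleSketch {
    private final Deque<String> pendingFragments = new ArrayDeque<>();
    private final int maxScheduledConcurrency;
    private int running = 0;

    public PhasedScheduleSketch(List<String> fragments, int maxScheduledConcurrency) {
        this.pendingFragments.addAll(fragments);
        this.maxScheduledConcurrency = maxScheduledConcurrency;
    }

    // Tiered scheduling would deploy every pending fragment here in one shot.
    // Phased scheduling keeps at most maxScheduledConcurrency fragments in flight.
    public synchronized void start() {
        while (running < maxScheduledConcurrency && !pendingFragments.isEmpty()) {
            deploy(pendingFragments.poll());
        }
    }

    // Called when the BE reports a fragment has finished: free the slot and
    // deploy the next pending fragment, so only a bounded number of fragments
    // are resident at any point in time.
    public synchronized void onFragmentFinished(String fragment) {
        running--;
        if (!pendingFragments.isEmpty()) {
            deploy(pendingFragments.poll());
        }
    }

    private void deploy(String fragment) {
        running++;
        // In the real system this would be the RPC that ships the fragment to a BE.
        System.out.println("deploy " + fragment);
    }
}
```

For the union all of 100 AGG fragments above, phased scheduling with max concurrency 1 keeps roughly one ~100 MB AGG subtree resident at a time instead of all 100, trading some pipelining for a much smaller peak memory footprint.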
Benchmark: 100-concurrency test of a union all over 100 small AGGs.
Benchmark: 200-concurrency test.
TODO list:
support adaptive phased scheduling
Fixes #issue
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check:
This is an automatic backport of pull request #47868 done by [Mergify](https://mergify.com).