[enhance](nereids) add rule MultiDistinctSplit #45209
Merged: morrySnow merged 5 commits into apache:master from feiniaofeiafei:count_distinct_rewrite on Jan 3, 2025
run buildall
TPC-H: Total hot run time: 40494 ms
TPC-DS: Total hot run time: 196435 ms
ClickBench: Total hot run time: 32.27 s
feiniaofeiafei force-pushed the count_distinct_rewrite branch from fb681c9 to 99b8ad5 (December 10, 2024 12:41)
run buildall
feiniaofeiafei force-pushed the count_distinct_rewrite branch from 99b8ad5 to bcd9fcc (December 10, 2024 13:37)
run buildall
TPC-H: Total hot run time: 39674 ms
TPC-DS: Total hot run time: 197910 ms
ClickBench: Total hot run time: 32.7 s
morrySnow reviewed on Dec 11, 2024
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/analysis/CheckAnalysis.java (outdated)
fe/fe-core/src/main/java/org/apache/doris/nereids/jobs/executor/Rewriter.java (outdated)
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/DistinctSplit.java (outdated, 8 review threads)
run buildall
run buildall
feiniaofeiafei force-pushed the count_distinct_rewrite branch from d420066 to 93ee342 (December 11, 2024 13:45)
run buildall
TPC-H: Total hot run time: 39791 ms
TPC-DS: Total hot run time: 196228 ms
ClickBench: Total hot run time: 32.23 s
feiniaofeiafei force-pushed the count_distinct_rewrite branch 3 times, most recently from 5aa17ad to fbdbae1 (December 12, 2024 04:04)
run buildall
TPC-H: Total hot run time: 40034 ms
TPC-DS: Total hot run time: 189796 ms
ClickBench: Total hot run time: 32.85 s
feiniaofeiafei force-pushed the count_distinct_rewrite branch 2 times, most recently from e798aac to a6f812c (December 12, 2024 06:31)
run buildall
feiniaofeiafei force-pushed the count_distinct_rewrite branch from 791a242 to 92ebe5c (December 24, 2024 14:29)
run buildall
feiniaofeiafei force-pushed the count_distinct_rewrite branch from 92ebe5c to 6603b90 (December 25, 2024 02:53)
add rule count distinct split
add rule count distinct split
add regression test
add regression
fix code style
change by comment
change to custom rewrite
change to custom rewrite
fix regression
fix regression
feiniaofeiafei force-pushed the count_distinct_rewrite branch from 6603b90 to a6d2c1c (December 25, 2024 02:54)
run buildall
TPC-H: Total hot run time: 32459 ms
TPC-DS: Total hot run time: 189957 ms
ClickBench: Total hot run time: 31.28 s
feiniaofeiafei changed the title from "[enhance](nereids) add rule count distinct split" to "[enhance](nereids) add rule MultiDistinctSplit" on Dec 25, 2024
run p0
morrySnow approved these changes on Dec 26, 2024
PR approved by at least one committer and no changes requested.
github-actions bot added the approved label (Indicates a PR has been approved by one committer.) on Dec 26, 2024
PR approved by anyone and no changes requested.
924060929 approved these changes on Jan 2, 2025
github-actions bot pushed a commit that referenced this pull request on Jan 7, 2025
### What problem does this PR solve?

Problem Summary:

This PR adds a rewrite rule that performs two types of rewrite:

1. It splits multiple count(distinct) aggregates into separate subqueries joined by cross join. This greatly improves the execution speed of multiple count(distinct) operations: with 3 BEs and ndv=10000000, performance improves by three to four times.

    select count(distinct a),count(distinct b),count(distinct c) from t;
    ->
    with tmp as (select * from t)
    select * from (select count(distinct a) from tmp) t1
    cross join (select count(distinct b) from tmp) t2
    cross join (select count(distinct c) from tmp) t3

2. Before this PR, the following SQL statement would fail to execute with the error "The query contains multi count distinct or sum distinct, each can't have multi columns". This PR rewrites this type of SQL statement as follows, so that it executes without an error.

    select count(distinct a,d),count(distinct b,c),count(distinct c) from t;
    ->
    with tmp as (select * from t)
    select * from (select count(distinct a,d) from tmp) t1
    cross join (select count(distinct b,c) from tmp) t2
    cross join (select count(distinct c) from tmp) t3

### Release note

Support multi count distinct with different parameters
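The equivalence behind the first rewrite can be checked on a small data set. The following is a minimal sketch that uses SQLite through Python's standard `sqlite3` module as a stand-in engine, not Doris itself; the table contents are made up for illustration, and it says nothing about the performance gain, only that both query shapes return the same counts.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption: not Doris).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 10, 100), (1, 20, 100), (2, 20, 100), (3, None, 200), (3, 30, None)],
)

# Original form: several count(distinct) aggregates in one select list.
original = "SELECT count(DISTINCT a), count(DISTINCT b), count(DISTINCT c) FROM t"

# Rewritten form from the PR description: one subquery per distinct
# aggregate over a shared CTE, combined with cross joins.
rewritten = """
WITH tmp AS (SELECT * FROM t)
SELECT * FROM
  (SELECT count(DISTINCT a) FROM tmp) t1
  CROSS JOIN (SELECT count(DISTINCT b) FROM tmp) t2
  CROSS JOIN (SELECT count(DISTINCT c) FROM tmp) t3
"""

original_row = conn.execute(original).fetchone()
rewritten_row = conn.execute(rewritten).fetchone()
print(original_row, rewritten_row)
```

Each subquery without a group by produces exactly one row, so the cross join of the three subqueries also produces exactly one row, which is why the rewrite preserves the result.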
morrySnow pushed a commit that referenced this pull request on Jan 10, 2025
### What problem does this PR solve?

Related PR: #45209

If the split count distinct is used together with group by, the split results are linked with an inner join. However, if the group by column contains a null value, the inner join does not match those rows, so `NullSafeEqual` is used as the join condition instead.

---------

Co-authored-by: garenshi <[email protected]>
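The null handling described in this commit can be reproduced on a small example. The sketch below again uses SQLite via Python's `sqlite3` as a stand-in engine; SQLite's `IS` operator plays the role of a null-safe equality here (in MySQL-compatible dialects the corresponding operator is `<=>`). The table, column names, and data are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (g INTEGER, a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 2, 10), (None, 3, 30), (None, 3, 40)],
)

# Split form with group by: one grouped subquery per distinct aggregate,
# joined back together on the group by column.
template = """
WITH tmp AS (SELECT * FROM t)
SELECT t1.g, t1.cnt_a, t2.cnt_b FROM
  (SELECT g, count(DISTINCT a) AS cnt_a FROM tmp GROUP BY g) t1
  JOIN (SELECT g, count(DISTINCT b) AS cnt_b FROM tmp GROUP BY g) t2
  ON t1.g {op} t2.g
ORDER BY t1.g
"""

# Plain equality: NULL = NULL is not true, so the NULL group is lost.
plain_rows = conn.execute(template.format(op="=")).fetchall()

# Null-safe equality (SQLite spelling: IS): the NULL group is kept.
null_safe_rows = conn.execute(template.format(op="IS")).fetchall()

# Reference result from the unsplit query.
expected_rows = conn.execute(
    "SELECT g, count(DISTINCT a), count(DISTINCT b) FROM t GROUP BY g ORDER BY g"
).fetchall()

print(plain_rows)
print(null_safe_rows)
```

Only the null-safe join reproduces the result of the unsplit query, because group by places all NULL keys in one group that a plain equality join can never match.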
What problem does this PR solve?
Problem Summary:
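This PR adds a rewrite rule that performs two types of rewrite:
1. It splits multiple count(distinct) aggregates into separate subqueries over a shared CTE, joined by cross join, which greatly improves the execution speed of multiple count(distinct) operations.
2. Before this PR, a SQL statement with several count(distinct) aggregates where at least one takes multiple columns would fail to execute with the error "The query contains multi count distinct or sum distinct, each can't have multi columns". This PR rewrites this type of SQL statement in the same way, so that it executes without an error.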
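The shape of the rewritten statement can be illustrated with a small string-level sketch. The helper `split_multi_distinct` below is hypothetical and exists only for illustration; the actual rule operates on Nereids logical plans, not on SQL text.

```python
def split_multi_distinct(table, distinct_args):
    """Build the cross-join form for several count(distinct ...) aggregates.

    Hypothetical helper: `distinct_args` is a list of column lists, one per
    count(distinct ...) aggregate in the original select list.
    """
    # One single-row subquery per distinct aggregate, all reading the same CTE.
    subqueries = [
        f"(select count(distinct {','.join(cols)}) from tmp) t{i}"
        for i, cols in enumerate(distinct_args, start=1)
    ]
    return (
        f"with tmp as (select * from {table}) "
        "select * from " + " cross join ".join(subqueries)
    )


# select count(distinct a,d),count(distinct b,c),count(distinct c) from t;
sql = split_multi_distinct("t", [["a", "d"], ["b", "c"], ["c"]])
print(sql)
```

After the split, each subquery contains a single count(distinct) aggregate, so the multi-column restriction quoted in the error message no longer applies to any individual subquery.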
Release note
None