forked from apache/doris
Commit
[enhancement](Nereids) support 4 phases distinct aggregate with full distribution (apache#35871)

The original implementation of the 4-phase distinct aggregate only supported queries that contain no `group by` and only one distinct aggregate function, for example:

```sql
select count(distinct sex), sum(age) from student
```

This PR complements the 4-phase distinct aggregate with full distribution, to avoid data skew on the `group by` key. For example:

```sql
select sex, sum(distinct age) from student group by sex;
```

Here `sex` contains only two distinct values, `male` and `female`, while the table stores millions of rows. Shuffling by `sex` alone causes data skew and leaves many instances processing empty input. The 4-phase aggregate first shuffles by `sex, age` to deduplicate rows, so more instances can perform the distinct in parallel. The plan shape looks like this:

```
PhysicalAggregate(groupBy=[sex], output=[sex, sum(partial_sum(age))], mode=BUFFER_TO_RESULT)
                          |
PhysicalDistribute(columns=[sex])
                          |
PhysicalAggregate(groupBy=[sex], output=[sex, partial_sum(age)], mode=INPUT_TO_BUFFER)
                          |
PhysicalAggregate(groupBy=[sex, age], output=[sex, age], mode=BUFFER_TO_BUFFER)
                          |
PhysicalDistribute(columns=[sex, age])  # more columns to shuffle avoid data skew
                          |
PhysicalAggregate(groupBy=[sex, age], output=[sex, age], mode=INPUT_TO_BUFFER)
                          |
PhysicalOlapScan(name=student)
```

(cherry picked from commit 03f1cbd)
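To make the data flow of the plan above concrete, here is a minimal Python sketch that simulates the four phases for `sum(distinct age) ... group by sex` on in-memory rows. It is an illustration only, not Doris code: the names `shuffle` and `four_phase_sum_distinct`, the list-of-partitions input, and the hash-modulo partitioning are all assumptions made for this example.

```python
from collections import defaultdict


def shuffle(rows, key, n_instances):
    """Hypothetical stand-in for PhysicalDistribute: hash-partition rows by key."""
    buckets = [[] for _ in range(n_instances)]
    for row in rows:
        buckets[hash(key(row)) % n_instances].append(row)
    return buckets


def four_phase_sum_distinct(partitions, n_instances):
    """Simulate sum(distinct age) grouped by sex over (sex, age) rows."""
    # Phase 1 (groupBy=[sex, age], INPUT_TO_BUFFER): local dedup per scan partition.
    local = [set(part) for part in partitions]

    # Shuffle by (sex, age): many distinct keys, so rows spread over many instances.
    all_rows = (row for part in local for row in part)
    shuffled = shuffle(all_rows, key=lambda r: r, n_instances=n_instances)

    # Phase 2 (groupBy=[sex, age], BUFFER_TO_BUFFER): global dedup per instance.
    deduped = [set(bucket) for bucket in shuffled]

    # Phase 3 (groupBy=[sex], INPUT_TO_BUFFER): partial sum per sex on each instance.
    partials = []
    for bucket in deduped:
        acc = defaultdict(int)
        for sex, age in bucket:
            acc[sex] += age
        partials.extend(acc.items())

    # Shuffle by sex, then phase 4 (groupBy=[sex], BUFFER_TO_RESULT): merge partials.
    final = defaultdict(int)
    for bucket in shuffle(partials, key=lambda r: r[0], n_instances=n_instances):
        for sex, partial_sum in bucket:
            final[sex] += partial_sum
    return dict(final)


# Two scan partitions containing duplicate (sex, age) rows.
partitions = [
    [("male", 20), ("male", 20), ("female", 30)],
    [("male", 25), ("female", 30), ("male", 20)],
]
result = four_phase_sum_distinct(partitions, n_instances=4)
```

In this sketch the second shuffle moves only one small partial sum per `sex` value per instance, so the skew on `sex` costs little; the expensive deduplication has already been spread across instances by the `(sex, age)` shuffle.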
Showing 7 changed files with 144 additions and 38 deletions.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -685,3 +685,7 @@ TESTING AGAIN |
| | | |
| | | -- !having_with_limit -- |
| | | 7 -32767.0 |
| | | |
| | | -- !four_phase_full_distribute -- |
| | | hello 1 1 |
| | | world 1 1 |