Support Grouping functions with Group By CUBE/ROLLUP/GROUPING SETS #5647

mingmwang · 2023-03-20T08:54:43Z

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

PostgreSQL, SparkSQL, and Oracle support GROUPING functions that indicate whether a NULL in a result row comes from a subtotal or from the original data.
https://www.postgresql.org/docs/15/functions-aggregate.html#FUNCTIONS-HYPOTHETICAL-TABLE

Databricks SparkSQL
https://docs.databricks.com/sql/language-manual/functions/grouping.html

Oracle
https://oracle-base.com/articles/misc/rollup-cube-grouping-functions-and-grouping-sets#grouping
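To make the request concrete, here is a minimal Python sketch of what `GROUPING(region)` would report for a query like `SELECT region, SUM(amount), GROUPING(region) ... GROUP BY ROLLUP(region)`. The data and the helper name `rollup_by_region` are made up for illustration; this is not how any of the engines above implement it.

```python
from collections import defaultdict

# Hypothetical sample data: (region, amount). One row has a real NULL region.
rows = [("east", 10), ("east", 5), (None, 7)]

def rollup_by_region(rows):
    """Emulate SUM(amount) and GROUPING(region) under GROUP BY ROLLUP(region)."""
    detail = defaultdict(int)
    total = 0
    for region, amount in rows:
        detail[region] += amount
        total += amount
    # Detail rows: region is part of the grouping set, so GROUPING(region) = 0.
    out = [(region, subtotal, 0) for region, subtotal in detail.items()]
    # Subtotal row: region is rolled up, so it is NULL and GROUPING(region) = 1.
    out.append((None, total, 1))
    return out

result = rollup_by_region(rows)
for row in result:
    print(row)
```

Two of the output rows have a NULL region. Only the last column tells the subtotal row apart from the group whose region really is NULL.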

Describe the solution you'd like

Describe alternatives you've considered

Additional context


mingmwang · 2023-03-21T15:11:59Z

I'm working on it now.

l1t1 · 2024-04-26T03:09:10Z

Where is the documentation for these features?

bgjackma · 2024-09-18T01:28:02Z

take

bgjackma · 2024-09-18T01:37:14Z

I don't think this works as an aggregate function in the physical plan. Its result depends on the grouping set rather than on the data, so the current accumulator abstraction doesn't have access to the necessary information.

I think this case (along with GROUPING_ID, any others?) will need special handling in GroupedHashAggregateStream.
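A rough Python sketch of that point, following the PostgreSQL definition (a bit is 1 when the expression is not in the current grouping set, and the rightmost argument is the least-significant bit). The function name `grouping_mask` is hypothetical. Note that it never looks at row values, only at the grouping set:

```python
def grouping_mask(args, grouping_set):
    """Bitmask for GROUPING(*args) given the grouping set that produced a row.

    A bit is 1 when the expression is NOT part of the grouping set.
    The rightmost argument maps to the least-significant bit.
    """
    mask = 0
    for expr in args:
        mask = (mask << 1) | (0 if expr in grouping_set else 1)
    return mask

# Grouping sets generated by GROUP BY ROLLUP(a, b).
rollup_sets = [("a", "b"), ("a",), ()]
masks = [grouping_mask(("a", "b"), s) for s in rollup_sets]
print(masks)  # [0, 1, 3]
```

An ordinary accumulator only sees input values per group, which is why something in the aggregation stream would have to supply the grouping set.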

alamb · 2024-09-19T20:34:48Z

Note there is a PR with a proposed implementation from @JasonLi-cn in #10208

bgjackma · 2024-09-19T20:55:46Z

Thanks for the heads up, that's helpful. However, I think I have a slightly nicer solution that doesn't force changes on existing accumulators.

eejbyfeldt · 2024-09-20T17:53:36Z

Note that @mingmwang (the author of this ticket) had an alternative approach here: #5749, but it seems it was never pushed over the finish line. It is based on how this is implemented in Spark.

What is good about that approach is that it also helps address other issues in the current implementation of grouping sets. For example, the current implementation produces incorrect results when there are null values in the actual columns. I also looked into creating a PR that only fixes that bug, in a similar way to #5749, but I need to spend some more time on it.
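To illustrate the kind of collision meant here, a small Python sketch. It assumes grouping sets are evaluated by replacing rolled-up columns with NULL and grouping on the result. That is a simplification for illustration, not DataFusion's actual code, and the data and the `aggregate` helper are made up. Adding a grouping id to the key, which is roughly the Spark-style idea, keeps the groups apart:

```python
from collections import defaultdict

# Hypothetical (a, value) rows; the second row has a real NULL in column a.
rows = [("x", 1), (None, 2)]

def aggregate(rows, with_grouping_id):
    """SUM(value) for GROUPING SETS ((a), ()) using a nulled-out group key."""
    sums = defaultdict(int)
    for a, value in rows:
        # Grouping id 0: a is grouped. Grouping id 1: a is rolled up to NULL.
        for grouping_id, key_a in ((0, a), (1, None)):
            key = (key_a, grouping_id) if with_grouping_id else (key_a,)
            sums[key] += value
    return dict(sums)

naive = aggregate(rows, with_grouping_id=False)
fixed = aggregate(rows, with_grouping_id=True)
print(naive)
print(fixed)
```

Without the grouping id, the real NULL group (sum 2) and the grand total (sum 3) collapse into a single NULL row with sum 5. With it, they stay separate.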

mingmwang added the enhancement New feature or request label Mar 20, 2023

This was referenced Mar 22, 2023

improve: support combining multiple grouping expressions #5559

Merged

Add ResolveGroupingAnalytics analyzer rule #5749

Closed

eejbyfeldt mentioned this issue Sep 13, 2024

Implement grouping aggregate function #12377

Closed

github-actions bot assigned bgjackma Sep 18, 2024

alamb linked a pull request Sep 19, 2024 that will close this issue

feat: support grouping aggregate function #10208

Draft

bgjackma linked a pull request Sep 21, 2024 that will close this issue

Implement GROUPING aggregate function (following Postgres behavior.) #12565

Open

eejbyfeldt mentioned this issue Sep 21, 2024

fix: Correct results for grouping sets when columns contain nulls #12571

Open
