Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] (Part2) Support _state/_union/_merge aggregate funciton combinator #50425

Merged

Conversation

LiShuMing
Copy link
Contributor

@LiShuMing LiShuMing commented Aug 29, 2024

Why I'm doing:

  1. To support common aggregate functions transforms, add _state/_union/_merge aggregate funciton combinator.
  • _state combinator
/**
 * DESC: immediate_type {agg_func}_state(arg_types)
 *  input type  : aggregate function's argument types
 *  return type : aggregate function's immediate type
 */
  • _union combinator

/**
 * Union combinator for aggregate function to union the agg state to return the immediate result of aggregate function.
 * DESC: immediate_type {agg_func}_union(immediate_type)
 *  input type          : aggregate function's immediate_type
 *  intermediate type   : aggregate function's immediate_type
 *  return type         : aggregate function's immediate_type
 */
  • _merge combinator
/**
 * Merge combinator for aggregate function to merge the agg state to return the final result of aggregate function.
 * DESC: return_type {agg_func}_merge(immediate_type)
 *  input type          : aggregate function's immediate_type
 *  intermediate type   : aggregate function's immediate_type
 *  return type         : aggregate function's return type
 */
  1. Support common aggregate functions in aggregate key models, so we can store aggregate's state into storage and can be used for merging later.

What I'm doing:

  • Add _state/_union/_merge aggregate funciton combinator
  • Support common aggregate functions in aggregate key models
  • Fix some little bugs in some aggregate functions.
  • Add creating/merging/querying all functions tests in sql-tester.

Fixes #49000

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.3
    • 3.2
    • 3.1
    • 3.0
    • 2.5

@LiShuMing LiShuMing requested review from a team as code owners August 29, 2024 12:31
@LiShuMing LiShuMing force-pushed the feat/main/support_agg_func_combinator branch from c85e66c to e00b08f Compare August 29, 2024 12:31
@LiShuMing LiShuMing force-pushed the feat/main/support_agg_func_combinator branch 2 times, most recently from 3569d80 to deb408d Compare August 29, 2024 14:45
@LiShuMing LiShuMing requested review from a team as code owners September 2, 2024 14:05
@LiShuMing LiShuMing force-pushed the feat/main/support_agg_func_combinator branch 6 times, most recently from 008c23c to 21f5e67 Compare September 4, 2024 07:21
@LiShuMing
Copy link
Contributor Author

@mER�gify rebase

@LiShuMing
Copy link
Contributor Author

@mergify rebase

@LiShuMing LiShuMing force-pushed the feat/main/support_agg_func_combinator branch from 21f5e67 to 11ec2df Compare September 10, 2024 03:34
Copy link
Contributor

mergify bot commented Sep 10, 2024

rebase

✅ Branch has been successfully rebased

.add(COUNT) // Functions with pseudo types are not supported.
.add(COUNT_IF)
.add(FLAT_JSON_META) // Internal functions are not supported.
.add(HLL_RAW)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

COUNT, COUNT_IF is used frequently, so we need to support it.

NDV/APPROX_COUNT_DISTINCT(c) = HLL_CARDINALITY(HLL_RAW(c));
HLL_CARDINALITY(HLL_UNION(HLL_HASH(c))) = HLL_CARDINALITY(HLL_RAW(cast(c as STRING)))

so, in fact, we only need to implement two hll agg functions, hll_raw和hll_union, other functions derived agg functions need not be re-implemented, just use other functions to convert input/output parameters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. count/array_agg have been supported
  2. hll_raw is not supported because it uses hll as input which has been supported in hll_union in aggregate table.
  3. you are right, it's only true that we only need to implement some agg functions' state(eg: array_agg) to recompute other functions state which may be like implementions in the streaming system. But that's a huge work to recompute agg functions by the limited agg function state and we can discuss this later.

@LiShuMing LiShuMing force-pushed the feat/main/support_agg_func_combinator branch from e4c497b to 5087127 Compare September 13, 2024 03:04
satanson
satanson previously approved these changes Sep 13, 2024
Seaven
Seaven previously approved these changes Sep 13, 2024
@LiShuMing LiShuMing dismissed stale reviews from Seaven and satanson via d5a73b6 September 14, 2024 07:59
@LiShuMing LiShuMing force-pushed the feat/main/support_agg_func_combinator branch from 5087127 to d5a73b6 Compare September 14, 2024 07:59
@LiShuMing
Copy link
Contributor Author

@mergify rebase

Copy link
Contributor

mergify bot commented Sep 17, 2024

rebase

✅ Branch has been successfully rebased

@LiShuMing LiShuMing force-pushed the feat/main/support_agg_func_combinator branch from d5a73b6 to d9c4715 Compare September 17, 2024 02:11
Copy link

sonarcloud bot commented Sep 17, 2024

Quality Gate Failed Quality Gate failed

Failed conditions
3.9% Duplication on New Code (required ≤ 3%)

See analysis details on SonarCloud

Copy link

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

Copy link

[BE Incremental Coverage Report]

fail : 65 / 222 (29.28%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 be/src/storage/column_aggregate_func.cpp 0 6 00.00% [259, 260, 261, 262, 460, 461]
🔵 be/src/exprs/agg/agg_state_merge.h 0 34 00.00% [30, 32, 33, 34, 36, 38, 40, 42, 44, 46, 48, 49, 50, 52, 54, 55, 57, 58, 59, 61, 63, 64, 65, 67, 69, 70, 72, 74, 75, 76, 78, 80, 81, 83]
🔵 be/src/exprs/agg/array_union_agg.h 0 2 00.00% [159, 160]
🔵 be/src/exprs/agg_state_function.h 0 34 00.00% [40, 41, 42, 43, 44, 45, 47, 48, 49, 51, 54, 56, 57, 58, 60, 61, 62, 65, 66, 67, 68, 69, 70, 71, 72, 73, 75, 76, 78, 81, 82, 83, 84, 85]
🔵 be/src/exprs/agg/agg_state_union.h 0 30 00.00% [30, 32, 33, 34, 36, 38, 40, 42, 44, 46, 48, 49, 50, 52, 54, 55, 57, 58, 59, 61, 63, 64, 66, 68, 69, 70, 72, 74, 75, 77]
🔵 be/src/types/hll_sketch.cpp 0 1 00.00% [57]
🔵 be/src/exec/aggregator.cpp 40 78 51.28% [161, 459, 460, 462, 463, 465, 466, 467, 517, 518, 519, 521, 522, 523, 525, 526, 527, 528, 529, 531, 532, 533, 534, 536, 537, 538, 539, 540, 542, 543, 544, 545, 546, 547, 554, 565, 567, 568]
🔵 be/src/exprs/function_call_expr.cpp 21 33 63.64% [64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75]
🔵 be/src/exprs/agg/aggregate.h 3 3 100.00% []
🔵 be/src/exprs/agg/factory/aggregate_resolver_minmaxany.cpp 1 1 100.00% []

Copy link

[FE Incremental Coverage Report]

pass : 339 / 379 (89.45%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 com/starrocks/sql/ast/ColumnDef.java 2 3 66.67% [381]
🔵 com/starrocks/catalog/Column.java 7 9 77.78% [791, 828]
🔵 com/starrocks/sql/analyzer/FunctionAnalyzer.java 39 48 81.25% [162, 163, 166, 167, 172, 173, 176, 177, 971]
🔵 com/starrocks/catalog/combinator/AggStateUtils.java 88 104 84.62% [36, 66, 88, 89, 98, 121, 122, 123, 124, 126, 154, 157, 171, 177, 207, 210]
🔵 com/starrocks/catalog/combinator/AggStateUnionCombinator.java 35 39 89.74% [62, 63, 64, 96]
🔵 com/starrocks/catalog/combinator/AggStateDesc.java 9 10 90.00% [103]
🔵 com/starrocks/catalog/combinator/AggStateMergeCombinator.java 36 40 90.00% [62, 63, 64, 96]
🔵 com/starrocks/catalog/combinator/AggStateCombinator.java 40 43 93.02% [62, 63, 64]
🔵 com/starrocks/catalog/AggregateFunction.java 10 10 100.00% []
🔵 com/starrocks/catalog/FunctionSet.java 25 25 100.00% []
🔵 com/starrocks/sql/analyzer/DecimalV3FunctionAnalyzer.java 12 12 100.00% []
🔵 com/starrocks/catalog/ScalarFunction.java 11 11 100.00% []
🔵 com/starrocks/sql/analyzer/PolymorphicFunctionAnalyzer.java 15 15 100.00% []
🔵 com/starrocks/catalog/Type.java 8 8 100.00% []
🔵 com/starrocks/catalog/Function.java 2 2 100.00% []

@kangkaisen kangkaisen merged commit 5b5c277 into StarRocks:main Sep 18, 2024
51 of 53 checks passed
LiShuMing added a commit to LiShuMing/starrocks that referenced this pull request Oct 8, 2024
renzhimin7 pushed a commit to renzhimin7/starrocks that referenced this pull request Nov 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support Generic Aggregate Function State
5 participants