Sketch for aggregation intermediate results blocked management #11943
Closed
Rachelint wants to merge 112 commits into apache:main from Rachelint:sketch-blocked-aggr-state-management
eb74278
re-design the sketch.
Rachelint 728a62e
disable blocked optimization when hash agg is swtched to streaming ag…
Rachelint a1f5e2d
fix style.
Rachelint d7d22f6
impl blocked GroupValuesRows.
Rachelint e62172a
impl simple blocked mode for Count.
Rachelint 99cd66a
impl simple blocked mod for prim_op and avg.
Rachelint 520b2eb
fix init.
Rachelint aae2a3b
fix tests.
Rachelint ed7c1b7
support blocked mode in NullState.
Rachelint 8d9d0c0
define the `blocked_accumulate`, so that we wont change the logic for…
Rachelint cb53724
impl blocked version prim_op and avg accumulators.
Rachelint cce049d
fix streming tests.
Rachelint 89da481
impl blocked version count accumulator.
Rachelint b2fc6d3
fix the special cast that total groups num is zero.
Rachelint fc5d05b
move `BlockedNullState` to accumulate.rs.
Rachelint 6092943
fix tests.
Rachelint 6fcc831
define the `Blocks` to replace `VecDeque`.
Rachelint 00cfa06
introduce Block to accumulators.
Rachelint 261b0f1
make `Blocks` more general.
Rachelint 53950b9
use `Blocks` in `BlockedNullState`.
Rachelint 6de7fc7
use `Blocks` in `GroupValuesRows`.
Rachelint 1567057
use debug_assert instead of assert.
Rachelint a780401
refactor `Blocks`.
Rachelint 5fc7bf5
rename `CurrentBlock` to `NextBlock`, and add more comments.
Rachelint e143c59
minor optimization.
Rachelint 5fb1748
fix comments.
Rachelint 564e6d3
add todos.
Rachelint 7e607eb
reduce repeated codes.
Rachelint ec9bf21
disable blocked optimization in spilling case, and add comments.
Rachelint cb8da87
add more comments and remove stale codes.
Rachelint e054c8b
not try to support spilling in blocked mode in currently.
Rachelint 478627d
improve error messages.
Rachelint ead0076
add comment for `ProducingBlocks`.
Rachelint db1adbe
add comments for GroupStatesMode.
Rachelint 03189a7
remove unused import.
Rachelint 49c5e5e
fix clippy.
Rachelint 50e8958
fix clippy.
Rachelint 13e74b0
add comments.
Rachelint f5684a0
improve comments.
Rachelint 3b63eaa
eliminate the unnecessary `mode check + current_mut`.
Rachelint 5602ded
use index to replace get + unwrap.
Rachelint 8da7806
add comments for some interanl functions.
Rachelint e8ce09b
remove more unnecessary mode checks.
Rachelint 48c7e4f
fix some comments.
Rachelint 0bbad3a
improve comments about `Block`.
Rachelint 0ab53dc
move `Blocks`, `BlockedIndex`, some functions of `Emit` to `datafusio…
Rachelint 78c8e82
add test for `BlockedNullState`.
Rachelint f54878f
fix `BlockedNullState`'s unit test.
Rachelint 56b0bcf
add unit tests for blocks.
Rachelint 94af694
add unit test for `ensure_enough_room_for_blocked_nulls`.
Rachelint 1064a72
test take needed.
Rachelint 2b0796a
fix clippy.
Rachelint 46b10b4
merge two modes to one.
Rachelint aef6c49
experiment.
Rachelint 127a6e7
use function point to replace trait.
Rachelint 9bffb4a
simplify function pointer.
Rachelint c8c0fee
init the function point during new.
Rachelint 3e409ba
tmp.
Rachelint 25269f4
tmp3
Rachelint 4e2b9bc
tmp4
Rachelint 07deb39
tmp5.
Rachelint 5101165
just keep the if else.
Rachelint e10a4bb
remove some unnecessary codes.
Rachelint cc117d4
adapt the new great comments.
Rachelint c65b808
add option and disable the blocked optimization by default.
Rachelint 0a26de8
remove codes about outdated `switch_to_mdoe`.
Rachelint 96f8be8
fix test.
Rachelint 6535b93
fix clippy.
Rachelint 781d00c
try to eliminate more dynamic dispatch.
Rachelint 3316f8f
continue to eliminate more dynamic dispatch.
Rachelint 8a8e799
test.
Rachelint 921cad7
fix clippy.
Rachelint 8c5afa9
fix sql logic tests.
Rachelint e4dd31a
add options to enable blocked apporach in benmarks.
Rachelint db431cb
fix fmt and clippy.
Rachelint f2e316a
fix tpch opts.
Rachelint 8b8da5e
fix comments of `BlockedGroupIndex`.
Rachelint 21e5fdf
update config.
Rachelint 8d5cb7f
fix docs.
Rachelint b33c6f9
use the right way to check if spilling enalbed, and support `emit_ear…
Rachelint fd54e1c
unify ensure_enough_room_for_xxx and add tests.
Rachelint addcc13
fix ensure_enough_room_for_xxx.
Rachelint 5a2292d
add physical level test for blocked approach.
Rachelint 207a777
extract `run_aggregate_test_internal` for resuing later.
Rachelint 6a4cf5b
add simple fuzz test for blocked approach.
Rachelint f1855ee
Merge branch 'main' into sketch-blocked-aggr-state-management
Rachelint 551690d
don't support the `blocked approach` in bench until it compatible wit…
Rachelint 4a97d35
fix clippy.
Rachelint 6f877e3
Merge branch 'main' into sketch-blocked-aggr-state-management
Rachelint 189b4c3
merge main and fix compile.
Rachelint 11870cb
fix clippy.
Rachelint 6ecd81b
fix clippy.
Rachelint 36e0791
add comments to architecture about blocked approach.
Rachelint 4426307
fix comments.
Rachelint 886bb20
fix typo.
Rachelint c2cb573
fix docs.
Rachelint ef91012
a unified and low cost way to compute the different type `BlockedGrou…
Rachelint 31356d8
add test to `BlockedGroupIndexBuilder`.
Rachelint 5d2ac01
add more inlines.
Rachelint 0a7b52b
fix clippy.
Rachelint 318c650
improve config comments.
Rachelint 3d82094
remove deprecated function.
Rachelint b7a443a
improve docs.
Rachelint 0cff3be
rename the on/off option to enable_aggregation_intermediate_states_bl…
Rachelint a2d81a5
fix doc.
Rachelint 1db8633
fix fmt and tests.
Rachelint 5907b8b
Merge branch 'main' into sketch-blocked-aggr-state-management
Rachelint 6613288
update docs.
Rachelint cbafbc5
fix fmt.
Rachelint d258ea9
fix compile.
Rachelint 4a48d3a
Merge branch 'main' into sketch-blocked-aggr-state-management
Rachelint 7b61328
Merge branch 'main' into sketch-blocked-aggr-state-management
Rachelint
Making it default / not a config should reduce complexity...
I am just not sure whether we should enable it by default from the start. But as far as I can see, the current tests may be enough to ensure the correctness of this optimization?
I think coverage is high enough, so if we can show that the tests pass and the benchmarks don't show (large) regressions, there is no real downside to enabling it. Maybe we can add some extra tests, e.g. to check the reduced memory usage with this approach.
This PR is too stale to continue developing; I am pushing it forward in #15591 (mainly copying the necessary code from here).
Perfect! Let's close this one.