-
Notifications
You must be signed in to change notification settings - Fork 1.5k
perf(array-agg): add fast path for array agg for merge_batch
#14299
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf(array-agg): add fast path for array agg for merge_batch
#14299
Conversation
I ran the newly added
I am not sure why all the runs aren't listed. I am looking into that |
Here is my entire benchmark run
TLDR is that this PR appears to have a very significant performance improvement. 🚀 If no one beats me to it I will give it a good look in the upcoming week |
Nice work @rluvaton |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @rluvaton -- I reviewed the code and this makes sense to me and the benchmark numbers are 👌 very nice
BTW if we want to make AGG_ARRAY
really fast we probably need to implement a GroupsAccumlator
for it -- @Rachelint has an example of doing this in #13681
Already has a local implementation of array aggregation distinct with groups, that will try to upstream soon |
// If no nulls than just use the fast path | ||
// This is ok as the state is a ListArray rather than a ListViewArray so all the values are consecutive | ||
if null_count == 0 { | ||
// According to Arrow specification, the first offset can be non-zero |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact we saw it here too:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while writing this code I first found out that offsets does not have to start from 0
Thanks again @rluvaton |
Which issue does this PR close?
N/A
Rationale for this change
We want faster DataFusion
What changes are included in this PR?
Added fast path for
array_agg
accumulator formerge_batch
that return the underlying values of the list in case there is no null that point to the non empty list starting from the first valid index and until last valid indexAre these changes tested?
Existing tests
Are there any user-facing changes?
Nope
This depend on:
as we want to see how this change affect the performance.
Running locally giving promising results (almost 2 times faster for the optimized cases and no regression for not optimized):
Local benchmark result