Vectorized aggregation with grouping by one fixed-size column #7341

akuzm · 2024-10-14T12:07:46Z

The implementation uses the Postgres simplehash hash table for by-value fixed-size compressed columns.
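
To illustrate the general idea of hash grouping on a single by-value fixed-size key, here is a minimal standalone sketch. It does not use the Postgres simplehash template and is not the code from this PR: the names `KeyTable`, `batch_group_indexes` and `batch_sum`, the fixed-capacity linear-probing table, the mixing function, and the choice to reserve group 0 for null keys are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical types for illustration only, not the TimescaleDB structures. */
typedef struct { int64_t key; uint32_t group; /* 0 means the slot is empty */ } Slot;
typedef struct { Slot *slots; uint32_t mask; uint32_t ngroups; } KeyTable;

/* Create a fixed-capacity table; the capacity must be a power of two. */
static KeyTable key_table_create(uint32_t capacity)
{
    assert(capacity >= 2 && (capacity & (capacity - 1)) == 0);
    /* Group 0 is reserved for null keys, so real groups start at 1. */
    KeyTable t = { calloc(capacity, sizeof(Slot)), capacity - 1, 1 };
    assert(t.slots != NULL);
    return t;
}

static void key_table_free(KeyTable *t)
{
    free(t->slots);
    t->slots = NULL;
}

/* 64-bit mixing function (the splitmix64 finalizer), used here as the hash. */
static uint64_t hash_int64(uint64_t x)
{
    x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27; x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

/* Return the group index for a key, assigning a new one on first sight. */
static uint32_t key_table_lookup_or_insert(KeyTable *t, int64_t key)
{
    uint32_t pos = (uint32_t) hash_int64((uint64_t) key) & t->mask;
    for (;;)
    {
        Slot *s = &t->slots[pos];
        if (s->group == 0)
        {
            /* This sketch never resizes: always keep one slot empty. */
            assert(t->ngroups <= t->mask);
            s->key = key;
            s->group = t->ngroups++;
            return s->group;
        }
        if (s->key == key)
            return s->group;
        pos = (pos + 1) & t->mask; /* linear probing */
    }
}

/* Pass 1: map every row of a batch to a group index (nulls go to group 0). */
static void batch_group_indexes(KeyTable *t, const int64_t *keys,
                                const uint64_t *validity, size_t nrows,
                                uint32_t *group_of_row)
{
    for (size_t row = 0; row < nrows; row++)
    {
        bool valid = (validity == NULL) ||
                     (((validity[row / 64] >> (row % 64)) & 1) != 0);
        group_of_row[row] = valid ? key_table_lookup_or_insert(t, keys[row]) : 0;
    }
}

/* Pass 2: a sum aggregate that adds each row into its group's state. */
static void batch_sum(const int64_t *values, const uint32_t *group_of_row,
                      size_t nrows, int64_t *sums)
{
    for (size_t row = 0; row < nrows; row++)
        sums[group_of_row[row]] += values[row];
}
```

In this sketch the key lookup and the aggregate update are separate passes over the batch, so the aggregate loop only touches plain arrays. The caller sizes `sums` to the table capacity and zero-initializes it; a real implementation would also need to grow the table, which is omitted here.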

The biggest improvement on a "sensible" query is about 90%. A couple of queries show bigger improvements, but these are very synthetic cases that don't make much sense:
https://grafana.ops.savannah-dev.timescale.com/d/fasYic_4z/compare-akuzm?orgId=1&var-branch=All&var-run1=3815&var-run2=3816&var-threshold=0.02&var-use_historical_thresholds=true&var-threshold_expression=2%20%2A%20percentile_cont%280.90%29&var-exact_suite_version=false&from=now-2d&to=now

some experiments

This reverts commit 795ef6b.

This reverts commit 166d0e8.

codecov · 2024-10-14T12:16:59Z

Codecov Report

Attention: Patch coverage is 92.30769% with 28 lines in your changes missing coverage. Please review.

Project coverage is 82.27%. Comparing base (59f50f2) to head (10e66ad).
Report is 660 commits behind head on main.

Files with missing lines	Patch %	Lines
tsl/src/nodes/vector_agg/grouping_policy_hash.c	91.13%	3 Missing and 11 partials ⚠️
tsl/src/nodes/vector_agg/plan.c	80.00%	4 Missing and 5 partials ⚠️
...nodes/vector_agg/function/agg_many_vector_helper.c	95.00%	0 Missing and 1 partial ⚠️
...rc/nodes/vector_agg/hashing/batch_hashing_params.h	85.71%	0 Missing and 1 partial ⚠️
...rc/nodes/vector_agg/hashing/hash_strategy_common.c	94.44%	0 Missing and 1 partial ⚠️
.../src/nodes/vector_agg/hashing/hash_strategy_impl.c	98.18%	0 Missing and 1 partial ⚠️
..._agg/hashing/hash_strategy_impl_single_fixed_key.c	95.65%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #7341      +/-   ##
==========================================
+ Coverage   80.06%   82.27%   +2.21%     
==========================================
  Files         190      237      +47     
  Lines       37181    43668    +6487     
  Branches     9450    10957    +1507     
==========================================
+ Hits        29770    35930    +6160     
- Misses       2997     3403     +406     
+ Partials     4414     4335      -79


erimatnor · 2024-12-18T08:04:01Z

tsl/src/nodes/vector_agg/exec.c

 		}
 	}

 	/*
-	 * Currently the only grouping policy we use is per-batch grouping.
+	 * Determine which grouping policy we are going to use.


Out of curiosity: why is the grouping policy decided at execution time and not at plan time? Shouldn't it affect the plan and the cost calculation?

akuzm · 2024-12-18

I moved it all to plan time, although, as we discussed on today's call, it doesn't affect the costs yet.

tsl/src/nodes/vector_agg/grouping_policy_hash.c

tsl/src/nodes/vector_agg/grouping_policy_hash.h


akuzm added 23 commits October 2, 2024 10:32

Vectorized hash grouping on one column

b92e622

some experiments

Merge remote-tracking branch 'origin/main' into HEAD

4ce0e99

benchmark vectorized grouping (2024-10-02 no. 6)

74d4419

fixes

baedf7f

benchmark vectorized grouping (2024-10-02 no. 7)

35dbd36

some ugly stuff

74fffd3

benchmark vectorized grouping (2024-10-02 no. 9)

f8db454

someething

00a9d11

reduce indirections

339f91a

skip null bitmap words

f075589

cleanup

88f325d

crc32

15ab443

license

ff16ec8

benchmark vectorized hash grouping (2024-10-09 no. 10)

4291b17

test deltadelta changes

795ef6b

some speedups and simplehash simplifications

1fabb22

Revert "test deltadelta changes"

717abc4

This reverts commit 795ef6b.

test deltadelta changes

b03bd6b

work with signed types

166d0e8

Revert "work with signed types"

7f578b4

This reverts commit 166d0e8.

bulk stuff specialized to element type

e70cb0b

roll back the delta delta stuff

0040844

use simplehash

694faf6

akuzm added 6 commits October 14, 2024 13:31

cleanup

3d05674

benchmark vectorized hash grouping (simple) (2024-10-14 no. 11)

d90a90f

add more tests

4a93549

remove modified simplehash

3e06b92

offsets

a7942ed

cleanup

6fb517f

akuzm added 25 commits November 19, 2024 12:00

Vectorize aggregate FILTER clause

9e51c19

Merge remote-tracking branch 'origin/main' into HEAD

480d0fe

cleanups after merge

9b0ee38

cleanup

effa7eb

Merge remote-tracking branch 'origin/main' into HEAD

533be01

changelog

8e6c6d2

constify stable expressions

b717f74

Merge commit '155ca6f7ef2925735c7063cd9178edd185c17009' into HEAD

4df06d9

updates

47bcaa9

remove extras

b6cee02

ref

ecb1aec

fixes

f64676f

benchmark single fixed-column hash grouping (2024-12-03 no. 11)

fab11fb

cleanup

dff6dff

planning fixes for pg 17

831cadd

benchmark fixed-size hash grouping (2024-12-04 no. 152)

66403f2

remove some (yet) unused code

99e5b04

Merge remote-tracking branch 'origin/main' into HEAD

de22a22

ref

9fccab9

Merge remote-tracking branch 'akuzm/vector-filter' into HEAD

8e97c2f

add test

f5b648a

Merge remote-tracking branch 'origin/main' into HEAD

ecd9cb2

typo

dc6001d

disable parallel

0ea397a

add order

ea4dab1

erimatnor approved these changes Dec 18, 2024


akuzm and others added 4 commits December 18, 2024 17:51

Update tsl/src/nodes/vector_agg/grouping_policy_hash.h

4b98e46

Co-authored-by: Erik Nordström <[email protected]> Signed-off-by: Alexander Kuzmenkov <[email protected]>

Update tsl/src/nodes/vector_agg/grouping_policy_hash.h

b615dbe

Co-authored-by: Erik Nordström <[email protected]> Signed-off-by: Alexander Kuzmenkov <[email protected]>

determine the grouping type at plan time

045f59a

Merge remote-tracking branch 'origin/main' into HEAD

10e66ad
