Draft Poc for Unified select (Enum for bitmap and range) #7454
🤖 I kicked off running the benchmarks on this PR
🤖: Benchmark completed
Thank you @alamb. From the benchmark, it seems this PR only improves the Utf8ViewNonEmpty cases and regresses some other cases.
But from the clickbench results, performance looks better than the original default push down:
./bench.sh compare older_push_down test_default_parquet_push_down
Comparing older_push_down and test_default_parquet_push_down
--------------------
Benchmark clickbench_1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query ┃ older_push_down ┃ test_default_parquet_push_down ┃ Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0 │ 0.29ms │ 0.40ms │ 1.39x slower │
│ QQuery 1 │ 50.39ms │ 46.59ms │ +1.08x faster │
│ QQuery 2 │ 77.56ms │ 75.08ms │ no change │
│ QQuery 3 │ 77.95ms │ 79.83ms │ no change │
│ QQuery 4 │ 579.45ms │ 504.25ms │ +1.15x faster │
│ QQuery 5 │ 560.83ms │ 541.37ms │ no change │
│ QQuery 6 │ 0.33ms │ 0.33ms │ no change │
│ QQuery 7 │ 63.95ms │ 56.08ms │ +1.14x faster │
│ QQuery 8 │ 718.15ms │ 663.54ms │ +1.08x faster │
│ QQuery 9 │ 786.44ms │ 751.61ms │ no change │
│ QQuery 10 │ 206.47ms │ 214.36ms │ no change │
│ QQuery 11 │ 213.81ms │ 239.89ms │ 1.12x slower │
│ QQuery 12 │ 679.51ms │ 705.69ms │ no change │
│ QQuery 13 │ 1002.85ms │ 868.43ms │ +1.15x faster │
│ QQuery 14 │ 763.19ms │ 689.47ms │ +1.11x faster │
│ QQuery 15 │ 649.33ms │ 612.41ms │ +1.06x faster │
│ QQuery 16 │ 1390.12ms │ 1339.99ms │ no change │
│ QQuery 17 │ 1219.49ms │ 1134.36ms │ +1.08x faster │
│ QQuery 18 │ 2932.03ms │ 2566.26ms │ +1.14x faster │
│ QQuery 19 │ 63.44ms │ 58.59ms │ +1.08x faster │
│ QQuery 20 │ 752.05ms │ 679.10ms │ +1.11x faster │
│ QQuery 21 │ 927.06ms │ 811.46ms │ +1.14x faster │
│ QQuery 22 │ 1558.98ms │ 1461.95ms │ +1.07x faster │
│ QQuery 23 │ 3402.25ms │ 2604.51ms │ +1.31x faster │
│ QQuery 24 │ 471.04ms │ 445.48ms │ +1.06x faster │
│ QQuery 25 │ 420.43ms │ 423.96ms │ no change │
│ QQuery 26 │ 522.57ms │ 482.58ms │ +1.08x faster │
│ QQuery 27 │ 1375.58ms │ 1304.33ms │ +1.05x faster │
│ QQuery 28 │ 8767.85ms │ 8484.28ms │ no change │
│ QQuery 29 │ 458.33ms │ 446.29ms │ no change │
│ QQuery 30 │ 689.41ms │ 450.94ms │ +1.53x faster │
│ QQuery 31 │ 758.38ms │ 574.79ms │ +1.32x faster │
│ QQuery 32 │ 2981.93ms │ 2339.84ms │ +1.27x faster │
│ QQuery 33 │ 2886.54ms │ 2587.16ms │ +1.12x faster │
│ QQuery 34 │ 3298.54ms │ 2760.24ms │ +1.20x faster │
│ QQuery 35 │ 834.81ms │ 893.25ms │ 1.07x slower │
│ QQuery 36 │ 40.41ms │ 36.52ms │ +1.11x faster │
│ QQuery 37 │ 35.59ms │ 33.71ms │ +1.06x faster │
│ QQuery 38 │ 39.18ms │ 36.37ms │ +1.08x faster │
│ QQuery 39 │ 35.07ms │ 36.88ms │ 1.05x slower │
│ QQuery 40 │ 36.55ms │ 34.98ms │ no change │
│ QQuery 41 │ 36.18ms │ 36.72ms │ no change │
│ QQuery 42 │ 37.00ms │ 34.64ms │ +1.07x faster │
└──────────────┴─────────────────┴────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary ┃ ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (older_push_down) │ 42401.30ms │
│ Total Time (test_default_parquet_push_down) │ 38148.50ms │
│ Average Time (older_push_down) │ 986.08ms │
│ Average Time (test_default_parquet_push_down) │ 887.17ms │
│ Queries Faster │ 26 │
│ Queries Slower │ 4 │
│ Queries with No Change │ 13 │
└───────────────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query ┃ older_push_down ┃ test_default_parquet_push_down ┃ Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0 │ 1.12ms │ 1.10ms │ no change │
│ QQuery 1 │ 25.44ms │ 25.93ms │ no change │
│ QQuery 2 │ 54.65ms │ 55.48ms │ no change │
│ QQuery 3 │ 56.82ms │ 61.02ms │ 1.07x slower │
│ QQuery 4 │ 484.52ms │ 470.33ms │ no change │
│ QQuery 5 │ 524.38ms │ 546.59ms │ no change │
│ QQuery 6 │ 1.24ms │ 1.16ms │ +1.07x faster │
│ QQuery 7 │ 43.00ms │ 38.75ms │ +1.11x faster │
│ QQuery 8 │ 632.15ms │ 593.78ms │ +1.06x faster │
│ QQuery 9 │ 658.82ms │ 680.13ms │ no change │
│ QQuery 10 │ 154.14ms │ 184.81ms │ 1.20x slower │
│ QQuery 11 │ 183.86ms │ 198.64ms │ 1.08x slower │
│ QQuery 12 │ 664.23ms │ 678.02ms │ no change │
│ QQuery 13 │ 916.31ms │ 771.84ms │ +1.19x faster │
│ QQuery 14 │ 717.15ms │ 637.99ms │ +1.12x faster │
│ QQuery 15 │ 561.85ms │ 562.13ms │ no change │
│ QQuery 16 │ 1430.77ms │ 1346.40ms │ +1.06x faster │
│ QQuery 17 │ 1249.19ms │ 1117.41ms │ +1.12x faster │
│ QQuery 18 │ 2739.07ms │ 2427.62ms │ +1.13x faster │
│ QQuery 19 │ 41.63ms │ 45.73ms │ 1.10x slower │
│ QQuery 20 │ 724.72ms │ 694.00ms │ no change │
│ QQuery 21 │ 819.39ms │ 790.52ms │ no change │
│ QQuery 22 │ 1551.09ms │ 1517.14ms │ no change │
│ QQuery 23 │ 3565.46ms │ 2681.59ms │ +1.33x faster │
│ QQuery 24 │ 457.95ms │ 386.72ms │ +1.18x faster │
│ QQuery 25 │ 381.44ms │ 370.83ms │ no change │
│ QQuery 26 │ 526.80ms │ 435.73ms │ +1.21x faster │
│ QQuery 27 │ 1415.36ms │ 1354.63ms │ no change │
│ QQuery 28 │ 8528.35ms │ 9813.62ms │ 1.15x slower │
│ QQuery 29 │ 401.68ms │ 396.41ms │ no change │
│ QQuery 30 │ 728.25ms │ 431.94ms │ +1.69x faster │
│ QQuery 31 │ 744.53ms │ 528.87ms │ +1.41x faster │
│ QQuery 32 │ 3257.23ms │ 2323.61ms │ +1.40x faster │
│ QQuery 33 │ 3187.75ms │ 2517.60ms │ +1.27x faster │
│ QQuery 34 │ 3735.45ms │ 2840.70ms │ +1.31x faster │
│ QQuery 35 │ 877.66ms │ 728.25ms │ +1.21x faster │
│ QQuery 36 │ 26.23ms │ 22.64ms │ +1.16x faster │
│ QQuery 37 │ 23.04ms │ 21.68ms │ +1.06x faster │
│ QQuery 38 │ 21.55ms │ 21.89ms │ no change │
│ QQuery 39 │ 22.02ms │ 22.69ms │ no change │
│ QQuery 40 │ 21.54ms │ 21.98ms │ no change │
│ QQuery 41 │ 21.64ms │ 21.95ms │ no change │
│ QQuery 42 │ 23.77ms │ 22.51ms │ +1.06x faster │
└──────────────┴─────────────────┴────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary ┃ ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (older_push_down) │ 42203.20ms │
│ Total Time (test_default_parquet_push_down) │ 38412.36ms │
│ Average Time (older_push_down) │ 981.47ms │
│ Average Time (test_default_parquet_push_down) │ 893.31ms │
│ Queries Faster │ 20 │
│ Queries Slower │ 5 │
│ Queries with No Change │ 18 │
└───────────────────────────────────────────────┴────────────┘
And here is the result compared with main (no push down); there are still some regressions:
./bench.sh compare main test_default_parquet_push_down
Comparing main and test_default_parquet_push_down
--------------------
Benchmark clickbench_1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query ┃ main ┃ test_default_parquet_push_down ┃ Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0 │ 0.32ms │ 0.40ms │ 1.24x slower │
│ QQuery 1 │ 46.96ms │ 46.59ms │ no change │
│ QQuery 2 │ 75.59ms │ 75.08ms │ no change │
│ QQuery 3 │ 74.12ms │ 79.83ms │ 1.08x slower │
│ QQuery 4 │ 556.03ms │ 504.25ms │ +1.10x faster │
│ QQuery 5 │ 563.52ms │ 541.37ms │ no change │
│ QQuery 6 │ 0.31ms │ 0.33ms │ 1.06x slower │
│ QQuery 7 │ 52.23ms │ 56.08ms │ 1.07x slower │
│ QQuery 8 │ 720.15ms │ 663.54ms │ +1.09x faster │
│ QQuery 9 │ 741.10ms │ 751.61ms │ no change │
│ QQuery 10 │ 171.95ms │ 214.36ms │ 1.25x slower │
│ QQuery 11 │ 187.66ms │ 239.89ms │ 1.28x slower │
│ QQuery 12 │ 597.16ms │ 705.69ms │ 1.18x slower │
│ QQuery 13 │ 877.71ms │ 868.43ms │ no change │
│ QQuery 14 │ 605.11ms │ 689.47ms │ 1.14x slower │
│ QQuery 15 │ 630.66ms │ 612.41ms │ no change │
│ QQuery 16 │ 1422.47ms │ 1339.99ms │ +1.06x faster │
│ QQuery 17 │ 1221.90ms │ 1134.36ms │ +1.08x faster │
│ QQuery 18 │ 2773.23ms │ 2566.26ms │ +1.08x faster │
│ QQuery 19 │ 66.30ms │ 58.59ms │ +1.13x faster │
│ QQuery 20 │ 682.62ms │ 679.10ms │ no change │
│ QQuery 21 │ 800.86ms │ 811.46ms │ no change │
│ QQuery 22 │ 1521.09ms │ 1461.95ms │ no change │
│ QQuery 23 │ 4223.95ms │ 2604.51ms │ +1.62x faster │
│ QQuery 24 │ 286.83ms │ 445.48ms │ 1.55x slower │
│ QQuery 25 │ 274.47ms │ 423.96ms │ 1.54x slower │
│ QQuery 26 │ 320.45ms │ 482.58ms │ 1.51x slower │
│ QQuery 27 │ 945.72ms │ 1304.33ms │ 1.38x slower │
│ QQuery 28 │ 8206.32ms │ 8484.28ms │ no change │
│ QQuery 29 │ 459.59ms │ 446.29ms │ no change │
│ QQuery 30 │ 493.35ms │ 450.94ms │ +1.09x faster │
│ QQuery 31 │ 585.82ms │ 574.79ms │ no change │
│ QQuery 32 │ 2436.43ms │ 2339.84ms │ no change │
│ QQuery 33 │ 2916.52ms │ 2587.16ms │ +1.13x faster │
│ QQuery 34 │ 2975.16ms │ 2760.24ms │ +1.08x faster │
│ QQuery 35 │ 866.75ms │ 893.25ms │ no change │
│ QQuery 36 │ 104.22ms │ 36.52ms │ +2.85x faster │
│ QQuery 37 │ 62.50ms │ 33.71ms │ +1.85x faster │
│ QQuery 38 │ 107.57ms │ 36.37ms │ +2.96x faster │
│ QQuery 39 │ 167.64ms │ 36.88ms │ +4.55x faster │
│ QQuery 40 │ 46.49ms │ 34.98ms │ +1.33x faster │
│ QQuery 41 │ 45.49ms │ 36.72ms │ +1.24x faster │
│ QQuery 42 │ 42.51ms │ 34.64ms │ +1.23x faster │
└──────────────┴───────────┴────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary ┃ ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (main) │ 39956.84ms │
│ Total Time (test_default_parquet_push_down) │ 38148.50ms │
│ Average Time (main) │ 929.23ms │
│ Average Time (test_default_parquet_push_down) │ 887.17ms │
│ Queries Faster │ 17 │
│ Queries Slower │ 12 │
│ Queries with No Change │ 14 │
└───────────────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query ┃ main ┃ test_default_parquet_push_down ┃ Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0 │ 1.10ms │ 1.10ms │ no change │
│ QQuery 1 │ 24.27ms │ 25.93ms │ 1.07x slower │
│ QQuery 2 │ 58.03ms │ 55.48ms │ no change │
│ QQuery 3 │ 58.91ms │ 61.02ms │ no change │
│ QQuery 4 │ 478.56ms │ 470.33ms │ no change │
│ QQuery 5 │ 549.38ms │ 546.59ms │ no change │
│ QQuery 6 │ 1.18ms │ 1.16ms │ no change │
│ QQuery 7 │ 40.05ms │ 38.75ms │ no change │
│ QQuery 8 │ 645.79ms │ 593.78ms │ +1.09x faster │
│ QQuery 9 │ 671.10ms │ 680.13ms │ no change │
│ QQuery 10 │ 133.89ms │ 184.81ms │ 1.38x slower │
│ QQuery 11 │ 159.81ms │ 198.64ms │ 1.24x slower │
│ QQuery 12 │ 561.74ms │ 678.02ms │ 1.21x slower │
│ QQuery 13 │ 750.14ms │ 771.84ms │ no change │
│ QQuery 14 │ 525.03ms │ 637.99ms │ 1.22x slower │
│ QQuery 15 │ 553.88ms │ 562.13ms │ no change │
│ QQuery 16 │ 1417.77ms │ 1346.40ms │ +1.05x faster │
│ QQuery 17 │ 1104.70ms │ 1117.41ms │ no change │
│ QQuery 18 │ 3037.46ms │ 2427.62ms │ +1.25x faster │
│ QQuery 19 │ 45.81ms │ 45.73ms │ no change │
│ QQuery 20 │ 733.71ms │ 694.00ms │ +1.06x faster │
│ QQuery 21 │ 789.70ms │ 790.52ms │ no change │
│ QQuery 22 │ 1299.84ms │ 1517.14ms │ 1.17x slower │
│ QQuery 23 │ 3952.24ms │ 2681.59ms │ +1.47x faster │
│ QQuery 24 │ 273.40ms │ 386.72ms │ 1.41x slower │
│ QQuery 25 │ 274.14ms │ 370.83ms │ 1.35x slower │
│ QQuery 26 │ 320.12ms │ 435.73ms │ 1.36x slower │
│ QQuery 27 │ 900.06ms │ 1354.63ms │ 1.51x slower │
│ QQuery 28 │ 7812.82ms │ 9813.62ms │ 1.26x slower │
│ QQuery 29 │ 390.07ms │ 396.41ms │ no change │
│ QQuery 30 │ 420.68ms │ 431.94ms │ no change │
│ QQuery 31 │ 571.58ms │ 528.87ms │ +1.08x faster │
│ QQuery 32 │ 2585.80ms │ 2323.61ms │ +1.11x faster │
│ QQuery 33 │ 2621.59ms │ 2517.60ms │ no change │
│ QQuery 34 │ 3144.83ms │ 2840.70ms │ +1.11x faster │
│ QQuery 35 │ 855.71ms │ 728.25ms │ +1.18x faster │
│ QQuery 36 │ 80.10ms │ 22.64ms │ +3.54x faster │
│ QQuery 37 │ 35.71ms │ 21.68ms │ +1.65x faster │
│ QQuery 38 │ 79.06ms │ 21.89ms │ +3.61x faster │
│ QQuery 39 │ 123.78ms │ 22.69ms │ +5.46x faster │
│ QQuery 40 │ 29.03ms │ 21.98ms │ +1.32x faster │
│ QQuery 41 │ 27.67ms │ 21.95ms │ +1.26x faster │
│ QQuery 42 │ 25.98ms │ 22.51ms │ +1.15x faster │
└──────────────┴───────────┴────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary ┃ ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (main) │ 38166.21ms │
│ Total Time (test_default_parquet_push_down) │ 38412.36ms │
│ Average Time (main) │ 887.59ms │
│ Average Time (test_default_parquet_push_down) │ 893.31ms │
│ Queries Faster │ 16 │
│ Queries Slower │ 11 │
│ Queries with No Change │ 16 │
└───────────────────────────────────────────────┴────────────┘
This PR helps with part of the original regression, the one caused by read record/skip record selections being too dense.
Here is the result for page cache without this PR:
Q30 / Q31 show no regression with the current PR:
│ QQuery 30 │ 420.68ms │ 431.94ms │ no change │
│ QQuery 31 │ 571.58ms │ 528.87ms │ +1.08x faster │
But Q24 through Q28 still regress, the same as in the original result:
│ QQuery 24 │ 273.40ms │ 386.72ms │ 1.41x slower │
│ QQuery 25 │ 274.14ms │ 370.83ms │ 1.35x slower │
│ QQuery 26 │ 320.12ms │ 435.73ms │ 1.36x slower │
│ QQuery 27 │ 900.06ms │ 1354.63ms │ 1.51x slower │
│ QQuery 28 │ 7812.82ms │ 9813.62ms │ 1.26x slower │
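To make the "too dense" read/skip problem concrete, here is a minimal sketch of the idea in the PR title (one enum covering both a range form and a bitmap form). Everything in it is hypothetical: the `Selection` enum, the `choose` function, and the `min_avg_run` threshold are illustrative names, not the types or heuristic used in this PR or in arrow-rs.

```rust
/// Hypothetical unified selection: run-length ranges or a per-row bitmap.
#[derive(Debug, PartialEq)]
enum Selection {
    /// (selected?, row_count) runs, similar in spirit to read/skip selectors.
    Ranges(Vec<(bool, usize)>),
    /// One flag per row.
    Bitmap(Vec<bool>),
}

/// Collapse a per-row mask into alternating (selected?, length) runs.
fn to_runs(mask: &[bool]) -> Vec<(bool, usize)> {
    let mut runs: Vec<(bool, usize)> = Vec::new();
    for &bit in mask {
        if let Some((selected, len)) = runs.last_mut() {
            if *selected == bit {
                *len += 1;
                continue;
            }
        }
        runs.push((bit, 1));
    }
    runs
}

/// Keep ranges when runs are long on average; fall back to a bitmap when
/// the mask flips so often that per-run overhead would dominate.
/// `min_avg_run` is an assumed tuning knob, not a value from the PR.
fn choose(mask: Vec<bool>, min_avg_run: usize) -> Selection {
    let runs = to_runs(&mask);
    if runs.is_empty() || mask.len() / runs.len() >= min_avg_run {
        Selection::Ranges(runs)
    } else {
        Selection::Bitmap(mask)
    }
}

fn main() {
    // Alternating rows: six runs of length 1, so the bitmap form is chosen.
    let dense = choose(vec![true, false, true, false, true, false], 4);
    println!("{:?}", dense);

    // Two long runs: the range form is chosen.
    let sparse = choose(vec![true, true, true, true, false, false, false, false], 4);
    println!("{:?}", sparse);
}
```

The point of the sketch is only that a reader can switch representation per selection instead of always paying for many tiny read/skip steps; where the real cutoff should sit is exactly what the benchmarks above are probing.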
BTW there is a benchmark for RowSelection: https://github.com/apache/arrow-rs/blob/a2cc42639b4ad5579d052e6f2317a413e2407e0f/parquet/benches/row_selector.rs#L1-L0
These queries have a predicate like WHERE "SearchPhrase" <> '' But
For example:
SELECT "SearchEngineID", "ClientIP", COUNT(*) AS c, SUM("IsRefresh"), AVG("ResolutionWidth") FROM hits WHERE "SearchPhrase" <> '' GROUP BY "SearchEngineID", "ClientIP" ORDER BY c DESC LIMIT 10;
These queries have the same predicate WHERE "SearchPhrase" <> '' But in this case
For example:
SELECT "SearchPhrase" FROM hits WHERE "SearchPhrase" <> '' ORDER BY "EventTime" LIMIT 10;
Thank you @alamb, good finding. So in theory we can combine the unified select (this PR) with the page cache and get the best performance so far. I will try to do a PoC.
/// Unlike intersection, the `other` [`BooleanRowSelection`] must have exactly as many set bits as `self`.
/// This method will keep only the bits in `self` that are also set in `other`
/// at the positions corresponding to `self`'s set bits.
pub fn and_then(&self, other: &Self) -> Self {
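A minimal sketch of the semantics that doc comment describes, using plain `bool` slices instead of the PR's `BooleanRowSelection`. It assumes the comment means that `other` carries one entry per row already selected by `self`; the free function `and_then` below is a hypothetical stand-in, not the code under review.

```rust
/// Hypothetical stand-in for `BooleanRowSelection::and_then` on bool slices.
/// `second` is assumed to hold one flag per `true` entry of `first`.
fn and_then(first: &[bool], second: &[bool]) -> Vec<bool> {
    let selected = first.iter().filter(|&&bit| bit).count();
    assert_eq!(
        selected,
        second.len(),
        "second must have one entry per row selected by first"
    );

    let mut refine = second.iter();
    first
        .iter()
        // `&&` short-circuits, so `refine` only advances on rows that
        // `first` selected; unselected rows stay false.
        .map(|&keep| keep && *refine.next().expect("length checked above"))
        .collect()
}

fn main() {
    let first = [false, true, true, false, true];
    // Refines the three rows that `first` selected (indices 1, 2, 4).
    let second = [true, false, true];
    let combined = and_then(&first, &second);
    assert_eq!(combined, vec![false, true, false, false, true]);
    println!("{:?}", combined);
}
```

Under this reading the two inputs generally have different lengths, so the second mask has to be spread over the set positions of the first before any word-level operation can be applied.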
This should be able to use bitwise and instead?
I will refactor everything in this new file into the enum file, and then we can remove this separate boolean selector.
Updated. After investigation, I found the root cause for the page cache PR using more time to decode pages. I will try to update the polish_page_cache PR to address this, and I hope we can resolve all the regressions for the page cache.
Which issue does this PR close?
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?