Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce PrimitiveValueDecoder to enable batch decoding of values #125

Merged
merged 1 commit into from
Sep 23, 2024

Conversation

Jefffrey
Copy link
Collaborator

Though the benchmarks are kinda simple, they identified (via flamegraph) that there was a significant amount of time being spent on next() calls in RleReaderV2. So introduce a new PrimitiveValueDecoder to enable batch decoding support, currently only properly implemented for integer arrays with no null buffer. Even so, benchmarks show significant improvement:

datafusion-orc$ cargo bench
   Compiling orc-rust v0.3.1 (/home/jeffrey/Code/datafusion-orc)
    Finished `bench` profile [optimized + debuginfo] target(s) in 11.65s
     Running unittests src/lib.rs (/media/jeffrey/1tb_860evo_ssd/.cargo_target_cache/release/deps/orc_rust-53e462447a03afd6)

...

     Running benches/arrow_reader.rs (/media/jeffrey/1tb_860evo_ssd/.cargo_target_cache/release/deps/arrow_reader-419fc98a1d304edb)
Benchmarking sync reader: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.0s, or reduce sample count to 60.
sync reader             time:   [79.523 ms 79.618 ms 79.721 ms]
                        change: [-42.234% -42.072% -41.919%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

Benchmarking async reader: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.0s, or reduce sample count to 60.
async reader            time:   [80.297 ms 81.019 ms 81.933 ms]
                        change: [-42.186% -41.637% -40.978%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) high mild
  7 (7.00%) high severe

Next steps would be to use this in more places (over current Iterator::next() approach), with accompanying benchmarks to prove its a performance improvement.

@WenyXu WenyXu merged commit a6e7ca9 into main Sep 23, 2024
11 checks passed
@WenyXu WenyXu deleted the primitive-value-decoder branch September 23, 2024 02:23
@WenyXu
Copy link
Collaborator

WenyXu commented Sep 23, 2024

Thanks 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants