Support selection pruning #17
Comments
Take inspiration from how parquet handles exposing the necessary information/behaviour: https://docs.rs/parquet/latest/parquet/arrow/arrow_reader/type.ParquetRecordBatchReaderBuilder.html
I'd like to do some research for this PR.
It seems that there is no implementation of predicate pushdown in arrow-rs. I think the implementation in DataFusion can be used as a reference to determine what APIs need to be provided for ORC.
Yeah, this issue is a port from when this repo was split from the datafusion code; I think it should still be applicable for having APIs in this crate that'll enable pushdown from the datafusion code.
Make use of file statistics, stripe statistics, column statistics, row group indexes, and bloom filters to skip data that cannot match a query.
We need a way to expose this functionality so that users (like datafusion) can use it to query large ORC files efficiently, e.g. via predicate pushdown.
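As a rough illustration of the kind of pruning meant here, the sketch below picks which stripes to read by comparing a predicate against per-stripe min/max statistics. Everything in it is hypothetical: `IntColumnStats`, `IntPredicate`, and `select_stripes` are made-up names for discussion, not existing APIs of this crate, and it only covers a single integer column. Stripes with missing statistics are always kept, since nothing can be proven about them.

```rust
/// Hypothetical min/max statistics for one integer column in one stripe.
/// (Assumption: not a type from this crate.)
#[derive(Debug, Clone, Copy)]
pub struct IntColumnStats {
    pub min: i64,
    pub max: i64,
}

/// Hypothetical predicate on a single integer column.
#[derive(Debug, Clone, Copy)]
pub enum IntPredicate {
    Eq(i64),
    Lt(i64),
    Gt(i64),
    /// Inclusive range `lo..=hi`.
    Between(i64, i64),
}

impl IntPredicate {
    /// Returns `false` only when the statistics prove that no row in the
    /// stripe can satisfy the predicate; `true` means "might match".
    pub fn may_match(&self, stats: &IntColumnStats) -> bool {
        match *self {
            IntPredicate::Eq(v) => stats.min <= v && v <= stats.max,
            IntPredicate::Lt(v) => stats.min < v,
            IntPredicate::Gt(v) => stats.max > v,
            IntPredicate::Between(lo, hi) => {
                lo <= hi && stats.max >= lo && stats.min <= hi
            }
        }
    }
}

/// Returns the indices of the stripes that still need to be read.
/// A `None` entry means the stripe has no statistics, so it is kept.
pub fn select_stripes(
    stripe_stats: &[Option<IntColumnStats>],
    predicate: IntPredicate,
) -> Vec<usize> {
    stripe_stats
        .iter()
        .enumerate()
        .filter(|(_, stats)| match stats {
            Some(s) => predicate.may_match(s),
            None => true,
        })
        .map(|(index, _)| index)
        .collect()
}
```

A caller such as datafusion could translate its filter expressions into something like `IntPredicate`, call `select_stripes`, and hand the resulting indices to the reader builder. The same "may match" idea would presumably extend to row group indexes and bloom filters, at a finer granularity.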