feat: support uint data page extraction #11018
@@ -1352,7 +1352,7 @@ async fn test_uint() {
expected_null_counts: UInt64Array::from(vec![0, 0, 0, 0, 0]),
expected_row_counts: Some(UInt64Array::from(vec![4, 4, 4, 4, 4])),
column_name: "u8",
check: Check::RowGroup,
Switching this to Check::Both without adding the uint type handling above causes a test failure.
LGTM! Thanks @tshauck 👍
Which issue does this PR close?
Closes #10952
Rationale for this change
Adds support for unsigned integer (uint) types when extracting data page statistics.
What changes are included in this PR?
Parses the uint types when extracting data page stats. The approach is adapted from how the same types are already handled for row groups...
datafusion/datafusion/core/src/datasource/physical_plan/parquet/statistics.rs
Lines 321 to 336 in 5bfc11b
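For readers without the referenced file open, here is a rough, std-only sketch of the kind of conversion involved. It is not the code from this PR. It assumes that Parquet reports unsigned columns through signed physical statistics (i32 here), that narrow types such as u8 are range-checked, and that u32 is a bit reinterpretation. The helper names `page_mins_as_u8` and `page_mins_as_u32` are hypothetical, and plain `Vec<Option<_>>` stands in for the Arrow arrays used in the real code.

```rust
/// Hypothetical helper: narrow per-page i32 minimums to u8.
/// A missing statistic (None) stays None, and a value outside 0..=255
/// also becomes None instead of being silently truncated.
fn page_mins_as_u8(mins: &[Option<i32>]) -> Vec<Option<u8>> {
    mins.iter()
        .map(|m| m.and_then(|v| u8::try_from(v).ok()))
        .collect()
}

/// Hypothetical helper: reinterpret per-page i32 minimums as u32.
/// Assumption: the stored i32 carries the same 32 bits as the logical
/// u32 value, so a plain `as` cast recovers it.
fn page_mins_as_u32(mins: &[Option<i32>]) -> Vec<Option<u32>> {
    mins.iter().map(|m| m.map(|v| v as u32)).collect()
}
```

Returning None for an out-of-range value treats that page's statistic as unknown, which is the conservative choice when the result feeds pruning decisions.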
Are these changes tested?
Yes. The uint unit tests are updated to cover both row groups and data pages.
Are there any user-facing changes?
No