Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(rust,python): use actual number of read rows for hive materialization #11690

Merged
merged 8 commits into from
Oct 13, 2023

Conversation

nameexhaustion
Copy link
Collaborator

@nameexhaustion nameexhaustion commented Oct 12, 2023

Fixes #11682

md.num_rows() cannot be used as column_idx_to_series may not read the entire row group when there is a SELECT->LIMIT

@github-actions github-actions bot added fix Bug fix python Related to Python Polars rust Related to Rust Polars labels Oct 12, 2023
@nameexhaustion nameexhaustion marked this pull request as ready for review October 12, 2023 13:23
@@ -99,9 +99,11 @@ pub(super) fn array_iter_to_series(
fn materialize_hive_partitions(
df: &mut DataFrame,
hive_partition_columns: Option<&[Series]>,
num_rows: usize,
num_rows: Option<usize>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this Some passed? I only see it called with None.

Copy link
Collaborator Author

@nameexhaustion nameexhaustion Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, nowhere actually - will remove the parameter.

There was a note on top of the function We have a special num_rows arg, as df can be empty., but there doesn't seem to be any need to special handling in this case - I added a test that loads .head(0) and it seems to work fine.

@nameexhaustion nameexhaustion marked this pull request as draft October 12, 2023 23:04
@nameexhaustion nameexhaustion marked this pull request as ready for review October 13, 2023 00:00
Copy link
Member

@ritchie46 ritchie46 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the quick fix @nameexhaustion. Much appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix Bug fix python Related to Python Polars rust Related to Rust Polars
Projects
None yet
2 participants