feat: Support loading from datasets where the hive columns are also stored in the file #17203

Merged 11 commits on Jun 26, 2024

@nameexhaustion (Collaborator) commented Jun 26, 2024

This PR makes it so that hive partition columns that are also stored in the files are not loaded from the files.

Resolves #12041
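
To illustrate the idea, here is a minimal sketch in plain Python. It is not the code from this PR: the helper names `hive_keys_from_path` and `columns_to_read`, the example path, and the column names are all hypothetical. It assumes partition keys appear as `key=value` directory segments and that any file column sharing a name with a partition key is skipped when reading the file.

```python
from pathlib import PurePosixPath


def hive_keys_from_path(path: str) -> dict[str, str]:
    """Parse `key=value` directory segments from a hive-style path (hypothetical helper)."""
    keys: dict[str, str] = {}
    for part in PurePosixPath(path).parts[:-1]:  # skip the file name itself
        if "=" in part:
            key, _, value = part.partition("=")
            keys[key] = value
    return keys


def columns_to_read(file_columns: list[str], hive_keys: dict[str, str]) -> list[str]:
    """Keep only the file columns that are not hive partition keys (hypothetical helper)."""
    return [c for c in file_columns if c not in hive_keys]


# Hypothetical dataset where the partition columns are also stored in the file.
path = "dataset/year=2024/month=06/part-0.parquet"
file_columns = ["year", "month", "value"]

keys = hive_keys_from_path(path)
projection = columns_to_read(file_columns, keys)
```

In this sketch only `value` would be read from the file, while `year` and `month` would be known from the path alone.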

@github-actions bot added the enhancement, python, and rust labels Jun 26, 2024

codecov bot commented Jun 26, 2024

Codecov Report

Attention: Patch coverage is 97.72727% with 2 lines in your changes missing coverage. Please review.

Project coverage is 80.81%. Comparing base (332e40a) to head (0d77247).
Report is 13 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| crates/polars-io/src/parquet/read/read_impl.rs | 96.29% | 1 Missing ⚠️ |
| ...ates/polars-plan/src/plans/conversion/dsl_to_ir.rs | 95.45% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #17203      +/-   ##
==========================================
- Coverage   80.81%   80.81%   -0.01%     
==========================================
  Files        1464     1466       +2     
  Lines      192019   192293     +274     
  Branches     2743     2745       +2     
==========================================
+ Hits       155185   155402     +217     
- Misses      36323    36386      +63     
+ Partials      511      505       -6     

@ritchie46 ritchie46 merged commit 4d2f928 into pola-rs:main Jun 26, 2024
26 checks passed
@nameexhaustion nameexhaustion deleted the hive-cols-in-file branch July 8, 2024 12:39
Labels
enhancement (New feature or an improvement of an existing feature), python (Related to Python Polars), rust (Related to Rust Polars)
Development

Successfully merging this pull request may close these issues.

Provide a way to de-conflict columns that come from hive partitioning vs what's in a physical file
2 participants