fix(python): DataFrame rows_by_key returning key tuples with elements in wrong order #19486

Merged: 7 commits, Nov 18, 2024

@lukemanley lukemanley commented Oct 27, 2024

Fixes the following and simplifies the implementation a bit.

In [1]: import polars as pl

In [2]: df = pl.DataFrame(
   ...:     {
   ...:         "a": ["a", "a"],
   ...:         "b": ["b", "b"],
   ...:         "c": [1, 2],
   ...:     }
   ...: )

In [3]: df.rows_by_key(["a", "b"])
Out[3]: defaultdict(list, {('a', 'b'): [1, 2]})   # <-- key tuple ordered as (a, b) as requested

In [4]: df.rows_by_key(["b", "a"])
Out[4]: defaultdict(list, {('a', 'b'): [1, 2]})   # <-- key tuple ordered as (a, b) rather than (b, a) as requested
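
For reference, the intended behaviour can be sketched in plain Python. This is only an illustration of the ordering rule, not the Polars implementation: `rows_by_key_sketch` is a hypothetical helper, it takes a dict of column lists rather than a DataFrame, and returning the non-key values as tuples is an assumption of the sketch. The point is that each key tuple is built by walking the requested `key` list, not the frame's column order.

```python
from collections import defaultdict


def rows_by_key_sketch(data, key):
    """Group rows by a key tuple whose elements follow the order given in `key`.

    `data` is a dict mapping column name -> list of values (hypothetical input
    shape, used here instead of a real DataFrame).
    """
    columns = list(data)
    n_rows = len(data[columns[0]]) if columns else 0
    value_cols = [c for c in columns if c not in key]

    grouped = defaultdict(list)
    for i in range(n_rows):
        # Iterate over `key` (the caller's order), not over `columns`.
        k = tuple(data[c][i] for c in key)
        grouped[k].append(tuple(data[c][i] for c in value_cols))
    return grouped


data = {"a": ["a", "a"], "b": ["b", "b"], "c": [1, 2]}
by_ab = rows_by_key_sketch(data, ["a", "b"])
by_ba = rows_by_key_sketch(data, ["b", "a"])

print(list(by_ab))  # [('a', 'b')]
print(list(by_ba))  # [('b', 'a')]
```

With the same data as above, asking for `["b", "a"]` yields the key `('b', 'a')`, which is the ordering the fix is meant to restore.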

@github-actions bot added the fix, python, and rust labels Oct 27, 2024

codecov bot commented Oct 28, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.55%. Comparing base (5f11dd9) to head (c378400).
Report is 32 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main   #19486       +/-   ##
===========================================
+ Coverage   59.87%   79.55%   +19.68%     
===========================================
  Files        1545     1545               
  Lines      213433   213405       -28     
  Branches     2442     2429       -13     
===========================================
+ Hits       127791   169782    +41991     
+ Misses      85092    43075    -42017     
+ Partials      550      548        -2     


@alexander-beedie
Collaborator

Thanks; I'll take a look at this one later tonight (or tomorrow).
Looks much simplified indeed 👍

@alexander-beedie alexander-beedie self-assigned this Oct 28, 2024
@lukemanley
Contributor Author

@alexander-beedie - sorry for the ping, just checking in on this.

@alexander-beedie
Collaborator

alexander-beedie commented Nov 18, 2024

Apologies, got distracted and let this slip through the cracks - taking a look now.

Collaborator

@alexander-beedie alexander-beedie left a comment

Tested various parameter permutations on a large-scale frame, and 5 of 7 combinations were faster with this new approach. Really like the simplification you made here.

I might take a look later to see if we can get the two permutations that were marginally slower (~15%) back to the same speed, but as the other 5 parameter combinations were about the same amount faster, this PR looks like a clear win to me (in addition to the correctness fix). Great job 👌

@alexander-beedie alexander-beedie merged commit f893f75 into pola-rs:main Nov 18, 2024
14 checks passed
@alexander-beedie changed the title from "fix: DataFrame.rows_by_key returning key tuples with elements in wrong order" to "fix(python): DataFrame rows_by_key returning key tuples with elements in wrong order" Nov 18, 2024
@alexander-beedie removed the rust label Nov 18, 2024
@lukemanley lukemanley deleted the fix-rows-by-key branch December 6, 2024 11:55