Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(python): raise if DF is input to context (with_columns, select, etc) #15598

Closed

Conversation

deanm0000
Copy link
Collaborator

addresses #15588

@github-actions github-actions bot added fix Bug fix python Related to Python Polars labels Apr 11, 2024
@MarcoGorelli
Copy link
Collaborator

nice - fancy adding a little test too?

@deanm0000
Copy link
Collaborator Author

A test to verify that it errors?

@MarcoGorelli
Copy link
Collaborator

Yup, you can use the pytest.raises context manager

@deanm0000
Copy link
Collaborator Author

well that was a good callout b/c there was a way that is snuck by that is now fixed.

Copy link

codecov bot commented Apr 11, 2024

Codecov Report

Attention: Patch coverage is 62.50000% with 3 lines in your changes are missing coverage. Please review.

Project coverage is 81.20%. Comparing base (31df06d) to head (c42184c).
Report is 61 commits behind head on main.

Files Patch % Lines
py-polars/polars/functions/lit.py 62.50% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #15598      +/-   ##
==========================================
+ Coverage   81.14%   81.20%   +0.06%     
==========================================
  Files        1362     1370       +8     
  Lines      174951   175696     +745     
  Branches     2533     2533              
==========================================
+ Hits       141960   142677     +717     
- Misses      32508    32543      +35     
+ Partials      483      476       -7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@stinodego
Copy link
Contributor

Maybe this should be parsed as a struct column instead of raising?

@deanm0000
Copy link
Collaborator Author

Maybe this should be parsed as a struct column instead of raising?

If they want a struct they can just df.to_struct(). I have three reasons for thinking it's not a great idea to do this for people behind the scenes:

  1. If they're putting one df as the input to another df's with_columns/select then they're doing an anti-pattern and they should go ask for help so the error is the best thing for them
  2. People are likely to be more befuddled by getting a struct where they once got individual columns so it'll feel more like changing behavior without notice than fixing the bug that let them accidentally do the thing.
  3. It's a very pandas thing to do to try to anticipate and fix the way people do things incorrectly.

I was scratching my head on suggestions to include in the error message. I was thinking of "hey maybe you need a join or an hstack or to just refactor generally" but there are too many possibilities of what they're doing wrong that I just went with the short one that you see.

@deanm0000 deanm0000 requested a review from stinodego April 13, 2024 19:40
@stinodego
Copy link
Contributor

stinodego commented Apr 17, 2024

Thanks for the PR, but after thinking on this for a bit, this should not raise at all. The existing behavior is correct. Please see the linked issue.

@stinodego stinodego closed this Apr 17, 2024
@deanm0000 deanm0000 deleted the disallow_df_input_to_with_columns branch August 26, 2024 14:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix Bug fix python Related to Python Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants