Skip to content

feat(python)!: Introduce CopyNotAllowedError #14350

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed

Conversation

stinodego
Copy link
Contributor

@stinodego stinodego commented Feb 7, 2024

Ref #14334

This error already existed in the interchange protocol. I am making it a proper Polars error so we can re-use it in our various interop methods.

I updated Series.to_numpy using this error & improved the error messages. Example:

import polars as pl

pl.Series([True, False]).to_numpy(use_pyarrow=False, zero_copy_only=True)
# polars.exceptions.CopyNotAllowedError: bit packed boolean buffer must be converted into byte packed boolean buffer

This is breaking because Series.to_numpy currently raises a ValueError on copy.

I am open to some bikeshedding on the name of the new error type.

@github-actions github-actions bot added enhancement New feature or an improvement of an existing feature python Related to Python Polars labels Feb 7, 2024
@stinodego stinodego marked this pull request as ready for review February 7, 2024 19:19
@mcrumiller
Copy link
Contributor

mcrumiller commented Feb 7, 2024

I saw the other issue about this regarding re-chunking in place, and I agree that the zero_copy_only probably doesn't capture the desirable path of "I would prefer a zero-copy if possible, but if you have to then okay, copy it." An allow_copy parameter allows for this and also helps with the rechunking issue, which is to say that calling to_numpy may require a rechunk.

If I had to boil down all of the possibilities when performing something like to_numpy, it would be the following:

  1. The user prefers that no data be copied, and the operation can do this.
  2. The user prefers that no data be copied, but the operation requires copying data.
  3. The user demands that no data be copied, and the operation can do this.
  4. The user demands that no data be copied, but the operation requires copying data.
  5. The user demands that data be copied, and the operation can always do this.

allow_copy=True covers (1) and (2) if the op tries to do no copy, and (3) and (4) when False. (5) may not need a parameter, the user can just rechunk first.

@stinodego
Copy link
Contributor Author

stinodego commented Feb 7, 2024

I think you misunderstood what I was trying to say. allow_copy would be exactly the same as what zero_copy_only is now.

We will always try to do things without copying. Sometimes the user wants to be certain no copy will happen, which is what the parameter is for. And for the case where the user wants to be certain they have a writable result, they can use the other parameter writable, which will copy if it wasn't copied already. Together, this covers all cases.

@stinodego stinodego added the A-exceptions Area: exception handling label Feb 7, 2024
@stinodego stinodego changed the title feat(python): Introduce CopyNotAllowedError feat(python)!: Introduce CopyNotAllowedError Mar 1, 2024
@stinodego stinodego added this to the 1.0.0 milestone Mar 1, 2024
@github-actions github-actions bot added the breaking Change that breaks backwards compatibility label Mar 1, 2024
@stinodego stinodego force-pushed the zero-copy-warning branch from 84fe575 to f683875 Compare March 1, 2024 15:46
Copy link

codecov bot commented Mar 1, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.93%. Comparing base (fd27223) to head (f683875).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #14350      +/-   ##
==========================================
- Coverage   80.93%   80.93%   -0.01%     
==========================================
  Files        1325     1325              
  Lines      171960   171961       +1     
  Branches     2452     2452              
==========================================
- Hits       139184   139175       -9     
- Misses      32305    32315      +10     
  Partials      471      471              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@deanm0000
Copy link
Collaborator

Maybe call it CopyRequiredError or CopyUnavoidableError.

@stinodego
Copy link
Contributor Author

Maybe call it CopyRequiredError or CopyUnavoidableError.

CopyRequiredError sounds quite nice but it's a bit confusing in the sense that the fact that a copy is required isn't the cause of the error - the direct cause is that such a copy was explicitly disallowed by the user. If the default was allow_copy=False, then I would be in favor. But since it's the other way around, I am leaning in favor of CopyNotAllowedError.

@stinodego
Copy link
Contributor Author

stinodego commented May 22, 2024

This approach no longer works since the errors are now raised from Rust. Unfortunately, our Rust exceptions cannot subclass multiple exceptions - I would like CopyNotAllowedError to subclass both a RuntimeError and a PolarsError.

I'll close this for now, perhaps we can solve this nicely in the future when exception hierarchies are supported in PyO3 with the abi3 feature: PyO3/pyo3#3475

For now, these will raise a RuntimeError.

@stinodego stinodego closed this May 22, 2024
@stinodego stinodego removed this from the 1.0.0 milestone May 23, 2024
@stinodego stinodego deleted the zero-copy-warning branch June 7, 2024 21:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-exceptions Area: exception handling breaking Change that breaks backwards compatibility enhancement New feature or an improvement of an existing feature python Related to Python Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants