astype(copy=False) is ambiguous when only changing signedness #867

Open
crusaderky opened this issue Dec 6, 2024 · 1 comment


crusaderky commented Dec 6, 2024

XREFs

astype specifies:

copy (bool) – specifies whether to copy an array when the specified dtype matches the data type of the input array x. If True, a newly allocated array must always be returned. If False and the specified dtype matches the data type of the input array, the input array must be returned; otherwise, a newly allocated array must be returned. Default: True.

I think the definition of copy=False is unclear when two dtypes differ only in signedness (e.g. int64 vs. uint64), so that one could be a view of the other. This is particularly true given that astype has no option to choose between elementwise validation and silent over/underflow.

"A newly allocated array" could be interpreted either as

  • a new Python object around a new (deep-copied) buffer, OR
  • a new Python object, possibly pointing to the same buffer (a "view" in NumPy parlance).

In NumPy, astype unnecessarily deep-copies in this case. I suggested changing its behaviour at numpy/numpy#27509, but the feedback was that NumPy's behaviour is unlikely to change, as doing so would be a breaking change. However, there is no reason why other libraries adhering to the array API would need to replicate NumPy's behaviour.
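
To illustrate the behaviour described above, here is a small NumPy sketch. The variable names are only illustrative; the calls used (`ndarray.astype`, `ndarray.view`, `np.shares_memory`) are standard NumPy.

```python
import numpy as np

x = np.array([-1, 0, 1], dtype=np.int64)

# Same dtype, copy=False: NumPy hands back the input array itself.
same = x.astype(np.int64, copy=False)

# Signedness-only change, copy=False: NumPy still allocates a new buffer.
cast = x.astype(np.uint64, copy=False)

# An explicit view reinterprets the same buffer without copying.
view = x.view(np.uint64)

same_is_input = same is x
cast_shares = np.shares_memory(cast, x)
view_shares = np.shares_memory(view, x)

# Both paths wrap negative values the same way (no elementwise validation).
values_equal = np.array_equal(cast, view)

print(same_is_input, cast_shares, view_shares, values_equal)
```

Because the cast and the view produce identical values here, the deep copy buys nothing in terms of the resulting contents; the only observable difference is whether the memory is shared.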

My proposal here is to add a clause:

If False and the specified dtype matches the data type of the input array, the input array must be returned; otherwise, a newly allocated array must be returned, which may or may not share memory with the input array.
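
As a rough sketch of what a library could do under the proposed wording, the hypothetical helper below (`astype_sketch` is not part of NumPy or of the array API standard) returns a view when only signedness changes and falls back to a regular cast otherwise. It assumes NumPy arrays as input and is meant as an illustration, not as a reference implementation.

```python
import numpy as np


def astype_sketch(x, dtype, copy=True):
    """Hypothetical astype following the proposed clause (illustration only)."""
    dtype = np.dtype(dtype)
    if copy:
        # copy=True: always return a newly allocated, independent array.
        return x.astype(dtype, copy=True)
    if x.dtype == dtype:
        # copy=False and matching dtype: the input array itself is returned.
        return x
    signedness_only = (
        x.dtype.kind in "iu"
        and dtype.kind in "iu"
        and x.dtype.itemsize == dtype.itemsize
        and x.dtype.byteorder == dtype.byteorder
    )
    if signedness_only:
        # New array object that shares memory with the input.
        return x.view(dtype)
    # Any other dtype change needs a real conversion into a new buffer.
    return x.astype(dtype)


a = np.arange(4, dtype=np.int64)
v = astype_sketch(a, np.uint64, copy=False)
print(v is a, np.shares_memory(v, a))
```

Note that a caller who mutates the result would then also mutate the input, which is why the clause says "may or may not share memory" rather than requiring either behaviour.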
