Fix Series construction from numpy array with non-native byte order #18151
base: branch-25.04
@@ -2719,8 +2719,16 @@ def as_column(
    return as_column(arbitrary, dtype=dtype, nan_as_null=nan_as_null)
elif arbitrary.dtype.kind in "biuf":
    from_pandas = nan_as_null is None or nan_as_null
    try:
        pa_array = pa.array(arbitrary, from_pandas=from_pandas)
    except pa.ArrowNotImplementedError:
Can we check whether the array is byte-swapped instead of relying on the generic ArrowNotImplementedError?
Sure, switched to checking whether the byte order is non-native.
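For illustration, an up-front byte-order check could look roughly like the sketch below. The helper name `is_native_byteorder_sketch` is hypothetical and is not the helper added in this PR; it only relies on NumPy's documented `dtype.byteorder` codes (`=` native, `<` little-endian, `>` big-endian, `|` not applicable).

```python
import sys

import numpy as np


def is_native_byteorder_sketch(dtype: np.dtype) -> bool:
    """Hypothetical helper: True if dtype uses the host byte order."""
    # Map the host byte order to the NumPy byte-order code.
    native_code = "<" if sys.byteorder == "little" else ">"
    # "=" means native, "|" means byte order does not apply (e.g. int8).
    return dtype.byteorder in ("=", "|", native_code)


native = np.dtype("i4")
swapped = native.newbyteorder("S")  # "S" swaps to the opposite byte order

print(is_native_byteorder_sketch(native))
print(is_native_byteorder_sketch(swapped))
```

Checking the dtype before calling into pyarrow keeps unrelated ArrowNotImplementedError cases from being silently handled by the same fallback.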
Thanks for starting on this so quickly! Left two small comments, feel free to ignore.
Note that this doesn't fully resolve #18149, since cupy can also have byte-swapped arrays and those still error (though in a different location, as documented in the original issue).
@@ -2719,8 +2731,15 @@ def as_column(
    return as_column(arbitrary, dtype=dtype, nan_as_null=nan_as_null)
elif arbitrary.dtype.kind in "biuf":
    from_pandas = nan_as_null is None or nan_as_null
    return as_column(
    if not is_np_native_byteorder(arbitrary.dtype):
I believe you can just use dtype.isnative here, no need for a helper function: https://numpy.org/doc/2.2/reference/generated/numpy.dtype.isnative.html
- if not is_np_native_byteorder(arbitrary.dtype):
+ if not arbitrary.dtype.isnative:
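A small sketch of how `dtype.isnative` behaves, independent of cudf. The array contents and variable names are made up for illustration; only NumPy is assumed.

```python
import numpy as np

# A dtype in the host's byte order, and the same dtype byte-swapped.
native_dtype = np.dtype(np.float64)
swapped_dtype = native_dtype.newbyteorder("S")

native_arr = np.array([1.0, 2.0, 3.0], dtype=native_dtype)
swapped_arr = np.array([1.0, 2.0, 3.0], dtype=swapped_dtype)

# isnative reports whether the dtype matches the host byte order.
print(native_arr.dtype.isnative)   # True
print(swapped_arr.dtype.isnative)  # False
```

Because `newbyteorder("S")` swaps relative to whatever the host uses, the sketch gives the same results on little-endian and big-endian machines.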
Ah great, didn't know about this API. Thanks!
return as_column(
if not is_np_native_byteorder(arbitrary.dtype):
    # Not supported by pyarrow
    arbitrary = arbitrary.astype(arbitrary.dtype.newbyteorder())
If the byte order isn't native, then swapping it does give you native byte order, but I'd find it clearer to request native byte order explicitly:
- arbitrary = arbitrary.astype(arbitrary.dtype.newbyteorder())
+ arbitrary = arbitrary.astype(arbitrary.dtype.newbyteorder("="))
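The difference between the two spellings can be seen in a short NumPy-only sketch. The sample values are illustrative; the point is that `newbyteorder()` defaults to swapping ("S"), while `newbyteorder("=")` always yields the native order regardless of the starting dtype.

```python
import numpy as np

# Start from a byte-swapped int32 array.
swapped = np.array([1, 2, 3], dtype=np.dtype(np.int32).newbyteorder("S"))

# Explicitly request native byte order; astype converts the data,
# so the logical values are preserved.
converted = swapped.astype(swapped.dtype.newbyteorder("="))
print(converted.dtype.isnative, converted.tolist())

# The default argument swaps instead, so applying it to an already
# native dtype would produce a non-native one.
already_native = np.dtype(np.int32)
print(already_native.newbyteorder().isnative)
print(already_native.newbyteorder("=").isnative)
```

With `"="` the conversion stays correct even if the guard around it is later changed or removed, which is the robustness the suggestion is after.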
Description
closes #18149