Avoid unsafe casts from float to unsigned int #9964

QuLogic · 2025-01-19T10:57:10Z

~~Note that this adds .view to duck_array_ops; I'm not certain if this is new user API or not.~~

This works across all non-x86_64 architectures in Fedora. I did not test RISC-V, as that is not a primary architecture, but it is likely the same issue and maybe @U2FsdGVkX1 can give it a try.

- Fixes Test failure on RISC-V platform #9815
- Tests are fixed by the changes Avoid unsafe casts from float to unsigned int #9964 (comment)
- User visible changes (including notable bug fixes) are documented in whats-new.rst
- New functions/methods are listed in api.rst

Fixes pydata#9815

Illviljan · 2025-01-19T15:31:52Z

xarray/core/duck_array_ops.py

+        if xp == np:
+            # numpy currently doesn't have a view:
+            return data.view(*args, **kwargs)
+        return xp.view(data, *args, **kwargs)


Which package supports this? I can't find xp.view in the standard https://data-apis.org/array-api/draft/API_specification/index.html

Oh, I didn't realize this was implementing a standard when @kmuehlbauer brought it up in #9815. As in #9815, it works for NumPy and Dask. I don't know about any other packages, other than that the tests don't seem to fail on it specifically.

I didn't check this fully, but to be honest, I also tried it the same way locally.

@Illviljan can you suggest a way forward here, so that it is usable for numpy and dask?

I can also try view and fall back to astype(copy=False) if that's more portable. And also not expose view so publicly if desired.

@dcherian Could you have a look here, too? This would unblock @QuLogic for some Fedora builds. Is the implementation of view the right way?

Using hasattr(__array_namespace__) implies the method is included in the array api standard. But the standard does not include a .view. That's why I simply don't think the __array_namespace__ should be used in this case.
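To illustrate the point about `view` not being a namespace-level function, here is a minimal sketch using plain NumPy only. The helper name `view_as` is hypothetical and is not part of xarray or the array API standard; it simply calls the array's own method instead of going through an `xp` namespace.

```python
import numpy as np

data = np.arange(4, dtype="int8")

# The NumPy module itself exposes no top-level `view` function ...
module_has_view = hasattr(np, "view")
# ... but ndarray instances provide a `view` method.
array_has_view = hasattr(data, "view")


def view_as(data, dtype):
    """Hypothetical helper: reinterpret `data` via its own `.view` method.

    Assumes the array type implements `.view` (NumPy does); this is not
    guaranteed by the array API standard.
    """
    return data.view(dtype)


reinterpreted = view_as(data, "uint8")
print(module_has_view, array_has_view, reinterpreted.dtype)
```

Whether other array libraries offer an equivalent method is not something this sketch can answer; it only shows the NumPy behaviour discussed above.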

But using .astype(copy=False) does indeed seem like a nice way forward, no?

view and astype are very different things, no? view simply reinterprets the bytes as a different type, while astype preserves the value and may change the number of bytes to do so, for example. I'm not really sure what to do here. Perhaps just xfail? Or else you'll have to track down what netcdf-java does, which is the ultimate source of this _Unsigned convention, IIRC.
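The difference between the two operations can be seen in a small NumPy-only sketch. The variable names are illustrative and nothing here is taken from the xarray code base.

```python
import numpy as np

signed = np.array([-1, 0, 1], dtype="int8")

# view: the same bytes are reinterpreted as uint8; no data is copied.
as_view = signed.view("uint8")
print(as_view.tolist())                    # [255, 0, 1]
print(np.shares_memory(signed, as_view))   # True

# astype to a wider type: values are preserved, the itemsize changes.
as_wide = signed.astype("int16")
print(as_wide.tolist(), as_wide.itemsize)  # [-1, 0, 1] 2

# With floats the contrast is starker: view exposes the raw IEEE 754 bits,
# while astype converts the numeric value.
one = np.array([1.0], dtype="float32")
print(one.view("uint32").tolist())         # [1065353216]
print(one.astype("uint32").tolist())       # [1]
```

For same-size integer types the two happen to agree on non-negative values, which is why they are easy to confuse; for floats they do completely different things.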

Just checking in here again to see if you had an idea which way to move forward.

Sorry @QuLogic for the delay. I'm not sure what's the best way forward. I've consulted GDAL's netcdf driver and they use some construct like static_cast<int>/static_cast<uint>.

@QuLogic Following the suggestions of @Illviljan and @dcherian (and after looking again at numpy/numpy#22951), I've pushed a fix which I believe might fix the issue, or at least avoid the undefined cast. Would you mind checking this on your workflows? Thanks!

If this works we could clean up the PR and go ahead and merge.


kmuehlbauer · 2025-05-30T06:05:29Z

xarray/coding/variables.py

@@ -345,7 +345,14 @@ def encode(self, variable: Variable, name: T_Name = None):
        if fill_value is not None and has_unsigned:
            pop_to(encoding, attrs, "_Unsigned")
            # XXX: Is this actually needed? Doesn't the backend handle this?
-            data = duck_array_ops.astype(duck_array_ops.around(data), dtype)
+            signed_dtype = np.dtype(f"i{data.itemsize}")


@dcherian @Illviljan Does this agree with your suggestion? This first casts the rounded float to int (of same itemsize) and in a second step the int to the final intN (where N is the wanted itemsize).
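The two-stage cast described in this comment can be sketched with plain NumPy as follows. This is an approximation under stated assumptions, not the merged xarray code: `two_stage_cast` is a hypothetical name, and `np.around` / `ndarray.astype` stand in for the `duck_array_ops` wrappers shown in the diff.

```python
import numpy as np


def two_stage_cast(data, dtype):
    """Hypothetical helper: avoid a direct float -> unsigned int cast.

    Stage 1: rounded float -> signed int with the same itemsize as `data`.
    Stage 2: signed int -> requested target dtype.
    """
    signed_dtype = np.dtype(f"i{data.itemsize}")
    rounded = np.around(data)
    return rounded.astype(signed_dtype, copy=False).astype(dtype, copy=False)


values = np.array([-1.0, 0.4, 254.6], dtype="float32")
result = two_stage_cast(values, np.dtype("uint8"))
print(result.tolist())  # [255, 0, 255]
```

The negative input first becomes the signed integer -1 and then wraps to 255 in the integer-to-integer step, rather than going through a float-to-unsigned conversion whose result for negative values can differ between architectures.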

@QuLogic Does this work on your machine type?

Yes, this passes tests on all architectures.

@QuLogic Great! Thanks for checking. How should we take it from here? From my perspective, we can drop the changes in duck_array_ops and add a code comment explaining why there is the two-stage casting. An entry in whats-new.rst would be good for visibility. Let me know if you have the bandwidth at the moment. Otherwise I can take care of this.

kmuehlbauer · 2025-06-03T06:07:42Z

Thanks @QuLogic, thanks all!

QuLogic · 2025-06-04T04:04:51Z

Thanks for finishing this up; I didn't have much chance to look at it in depth the past week.

Avoid unsafe casts from float to unsigned int

fefab07

Fixes pydata#9815

QuLogic force-pushed the float-uint-casts branch from 6b366de to fefab07 Compare January 19, 2025 11:00

Illviljan reviewed Jan 19, 2025

TomNicholas added the topic-arrays related to flexible array support label Jan 19, 2025

Xeonacid mentioned this pull request Jan 23, 2025

updpatch: python-xarray 2025.01.1-2 felixonmars/archriscv-packages#4479

Merged

kmuehlbauer reviewed May 27, 2025

Update xarray/coding/variables.py

7813826

github-actions bot added the topic-documentation label May 27, 2025

kmuehlbauer and others added 2 commits May 27, 2025 11:54

Merge branch 'main' into float-uint-casts

2e5a93c

[pre-commit.ci] auto fixes from pre-commit.com hooks

dabed1d

for more information, see https://pre-commit.ci

kmuehlbauer reviewed May 30, 2025

kmuehlbauer added 5 commits June 2, 2025 10:29

Update duck_array_ops.py

22edc3b

Update duck_array_ops.py

d82300a

Merge branch 'main' into float-uint-casts

d1c7d9b

Update whats-new.rst

89a50b6

Update variables.py

63e784e

add comment

kmuehlbauer added the plan to merge Final call for comments label Jun 2, 2025

kmuehlbauer merged commit d1fee09 into pydata:main Jun 3, 2025
38 checks passed

QuLogic deleted the float-uint-casts branch June 4, 2025 04:04
