-
Notifications
You must be signed in to change notification settings - Fork 908
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API: Check for integer overflows when creating scalar form python int #16140
API: Check for integer overflows when creating scalar form python int #16140
Conversation
/okay to test |
1 similar comment
/okay to test |
cfbbefb
to
d8d324d
Compare
/okay to test |
1 similar comment
/okay to test |
def test_scalar_out_of_bounds_pyint_fails(value, dtype): | ||
# Test that we align with NumPy on scalar creation behavior from | ||
# Python integers. | ||
np_ver = np.lib.NumpyVersion(np.__version__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use packaging.version.parse
here if you don't want to dip into np.lib
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed also here now, thanks. (And rebased.)
@@ -47,7 +47,9 @@ | |||
) | |||
|
|||
|
|||
_DEFAULT_CATEGORICAL_VALUE = -1 | |||
# Using np.int8(-1) to allow silent wrap-around when casting to uint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing we should disallow this cast if this happens, but not for this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for not following up. NumPy allows np.int8(-1)
to be cast to silently to other integers pretty easily including unsigned, which is what we rely on here.
If NumPy wants to change that, the _DEFAULT_CATEGORICAL_VALUE
might have to be some mapping/function based on the actual dtype used, but for now this works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An optional fix https://github.com/rapidsai/cudf/pull/16140/files#r1671415931 but overall looks good
This aligns with NumPy, which deprecated this since a while and raises an error now, for example for `Scalar(-1, dtype=np.uint8)`. Necessary to ensure we give the right errors when promotion doesn't up-cast based on values.
7b569a1
to
39d240f
Compare
/okay to test |
Friendly ping, can we put these in soon? Or is there some release note update/dicussion needed? |
/merge |
This aligns with NumPy, which deprecated this since a while and raises an error now on NumPy 2, for example for
Scalar(-1, dtype=np.uint8)
.Since it aligns with NumPy, the DeprecationWarning of earlier NumPy versions is inherited for those.
This (or similar handling) is required to be compatible with NumPy 2/pandas, since the default needs to be to reject operation when values are out of bounds for e.g.
uint8_series + 1000
, the 1000 should not be silently cast to auint8
.Split from gh-15897
xref: rapidsai/build-planning#38