-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: set functions and tests #57
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implements the four unique functions in a way that flattens all nested list structure before returning a result that consists of just the unique values.
- Does it work for scalars, such as
ragged(3.14)
? (Note that there are no square brackets in that expression.) - I've noticed that these functions are returning tuples. The specification requires namedtuples.
The fact that the output is based on flattened arrays makes me wonder what will happen for unique_inverse, since this needs to return inverse indices that would reconstruct the array, with all of its nested structure. Using NumPy as an example:
>>> import numpy as np
>>> array = np.array([[4, 3, 4, 2], [5, 2, 4, 4], [3, 3, 3, 2]])
>>> values, inverse_indices = np.unique_inverse(array)
>>> values
array([2, 3, 4, 5])
>>> inverse_indices
array([[2, 1, 2, 0],
[3, 0, 2, 2],
[1, 1, 1, 0]])
>>> values[inverse_indices]
array([[4, 3, 4, 2],
[5, 2, 4, 4],
[3, 3, 3, 2]])
where values[inverse_indices]
reconstructs the original array
. Our arrays can do that kind of slice only if they have regular dimensions (the shape
has no None
values in it). Perhaps ragged.array
should support the multidimensional unique_inverse
case only if all of its dimensions are regular? Other cases should be detected and raise NotImplementedError
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ohrechykha - good start! Please, do follow up on Jim's comments. Please, also run pre-commit run --all
to fix the CI errors locally.
@jpivarski - I'm not sure why the tests with pytest fail in CI. Do we need to update awkward
version? or pin numpy
to 1.x?
This is funny: when doing [err=f"Expected ragged type but got {type(x)}", raise TypeError(err)] in specified lines, mypy says it's an unreachable statement. When I switch this back to [raise TypeError(f"Expected ragged type but got {type(x)}")] ruff says exception must be assigned to variable first... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In all, it's a good suite of tests and a good implementation for the common case—more than zero dimensions and more than length 1. (A question that just occurred to me: does it work for length 0 inputs?)
@ianna, we should check this for CUDA arrays, too. Code like
ragged.array([x])
where x
is a zero-dimensional NumPy array (from ragged.array._impl
when the ragged array's number of dimensions is also zero) would work but it would convert it to a Python object because it's inside of a Python list:
>>> x = np.array(3.14, np.float32)
>>> y = ragged.array([x])
>>> x.dtype, y.dtype
(dtype('float32'), dtype('float64'))
(The above can be a regular test.) If it's a CuPy array, the implicit conversion would also copy the data from the GPU to the CPU in order to make the Python object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ohrechykha - please, check the failing jobs in the CI: it gives some indication how to fix them. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good! My previous comments were all addressed, and my new comments are all inline:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ohrechykha - looks great! All comments were addressed and the tests pass. I'll merge it now.
No description provided.