-
Hello, suppose that I want to do an in-place assignment with an Awkward Array. How would I do this? Is there any current support for this operation? Thanks!!
-
In-place assignment isn't supported as a design choice. There are two corners of "parameter space" we could have chosen:
- allow in-place assignment, which means making defensive copies wherever a buffer might be shared between arrays, or
- make everything immutable.
I chose the latter (after initial experience with an early version that did allow in-place assignment). Defensive copies would be prohibitive for large data structures, such as records with many fields (which are common). The choice to make everything immutable was made for performance (both speed and memory), which might sound surprising, considering that in-place mutations are used in NumPy for performance reasons (both speed and memory). NumPy has the underlying problem, too (buffers shared between arrays), but not as acutely: some operations produce views, some produce copies:

>>> import numpy as np
>>> original = np.array([1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.8, 9.9])
>>> sliced = original[::2]
>>> advanced = original[[0, 2, 4, 6, 8]]
>>> sliced
array([1.1, 3.3, 5.5, 7.7, 9.9])
>>> advanced
array([1.1, 3.3, 5.5, 7.7, 9.9])
>>> original[6] = 999
>>> sliced # this is a view; it has changed
array([ 1.1, 3.3, 5.5, 999. , 9.9])
>>> advanced # this is a copy; it has NOT changed
array([1.1, 3.3, 5.5, 7.7, 9.9])

(I think PyTorch has taken a positive step by making all the view operations follow a different naming convention from the copy ones.) This is a bit of an issue in NumPy because you have to be careful to check for view-vs-copy. It's endemic in Pandas (search for "SettingWithCopyWarning"). But it's harder in Awkward Array because, rather than having one buffer that might be a view or might be a copy, almost all operations give you a tree with buffers attached to all the nodes of that tree, in which some nodes are views and other nodes are new buffers. Which nodes are views and which are new buffers is subject to change (this PR, for example). Therefore, the Awkward Array library itself does not include any operations that change the values of these buffers in place.
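To make that tree concrete, here is a small sketch (the node names and the exact printed form depend on the Awkward Array version, so treat the details as illustrative): a jagged array's layout is a list node whose offsets are one buffer, wrapping a numeric node whose flattened data are another buffer.

>>> import awkward as ak
>>> jagged = ak.Array([[1.1, 2.2], [], [3.3]])
>>> jagged.layout  # prints the node tree: a list-offset node (offsets buffer) wrapping a NumPy node (data buffer)
>>> form, length, buffers = ak.to_buffers(jagged)
>>> list(buffers)  # a dict of named buffers, one or more per node of the tree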
You can do this kind of assignment in place:

>>> import awkward as ak
>>> original = ak.Array([{"x": 1}, {"x": 2}, {"x": 3}])
>>> original["y"] = 10
>>> original
<Array [{x: 1, y: 10}, ... {x: 3, y: 10}] type='3 * {"x": int64, "y": int64}'>

But that's actually not changing any buffers in place: it's creating a new tree structure that includes the new buffer. The other exception is that, while Awkward Array doesn't define any in-place operations, nothing is stopping you from casting an Awkward Array (or part of one) as a NumPy array and changing it in place. This can potentially have long-range consequences, so if you do this, you'll have to be aware of its history. For example, it's fine to change in place an array that you have just created: you know exactly where it's been. In the online documentation, Mutability of Awkward Arrays from NumPy and Mutability of Awkward Arrays converted to NumPy discuss how to do this and what the issues are. Note that you can also cast Awkward Arrays as NumPy arrays in Numba and assign to them (PR #550).
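For example, here is a minimal sketch of the kind of long-range consequence to keep in mind (assuming a contiguous, purely numeric array, for which the conversion from NumPy does not copy the buffer): values can't be assigned through the Awkward Array itself, but the NumPy array it was built from still owns the same buffer, so mutating that shows up in the Awkward Array.

>>> import numpy as np
>>> import awkward as ak
>>> buffer = np.array([1.1, 2.2, 3.3, 4.4])
>>> wrapped = ak.Array(buffer)  # wraps the NumPy buffer without copying (for a simple numeric array)
>>> # wrapped[0] = 999 would be refused: only new fields (string keys) can be assigned this way
>>> buffer[0] = 999             # mutate the NumPy array that the Awkward Array was built from
>>> wrapped                     # the change is visible through the Awkward Array
<Array [999, 2.2, 3.3, 4.4] type='4 * float64'>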
-
I also should have given you a direct answer to your question:

>>> a = ak.Array([1000, 2000, 3000, 4000, 5000, 6000])
>>> a
<Array [1000, 2000, 3000, 4000, 5000, 6000] type='6 * int64'>
>>> np.asarray(a)[a > 4000] = 4000
>>> a
<Array [1000, 2000, 3000, 4000, 4000, 4000] type='6 * int64'>
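And if the goal is just the clipped result rather than mutation itself, the same thing can be done without touching any buffers; a sketch using ak.where, which allocates a new array and leaves the original alone:

>>> import awkward as ak
>>> a = ak.Array([1000, 2000, 3000, 4000, 5000, 6000])
>>> clipped = ak.where(a > 4000, 4000, a)  # new array: values above 4000 replaced by 4000
>>> clipped
<Array [1000, 2000, 3000, 4000, 4000, 4000] type='6 * int64'>
>>> a  # unchanged
<Array [1000, 2000, 3000, 4000, 5000, 6000] type='6 * int64'>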