Descriptors for class
>>> import awkward as ak
>>> ak.__version__
1.2.0
>>> array = ak.repartition([{"x": x, "y": x * 10} for x in range(10)], 2)
>>> array.layout
<IrregularlyPartitionedArray>
<partition start="0" stop="2">
<RecordArray length="2">
<field index="0" key="x">
<NumpyArray format="l" shape="2" data="0 1" at="0x00010f038200"/>
</field>
<field index="1" key="y">
<NumpyArray format="l" shape="2" data="0 10" at="0x00010f03a200"/>
</field>
</RecordArray>
</partition>
<partition start="2" stop="4">
<RecordArray length="2">
<field index="0" key="x">
<NumpyArray format="l" shape="2" data="2 3" at="0x00010f038200"/>
</field>
<field index="1" key="y">
<NumpyArray format="l" shape="2" data="20 30" at="0x00010f03a200"/>
</field>
</RecordArray>
</partition>
</IrregularlyPartitionedArray>
>>> ak.partitions(array)
[2,2]
>>> array.partitions
AttributeError: no field named 'partitions'
According to the docs, this should be a valid repartition? The tests do not cover
>>> one = ak.from_iter([[1.1, 2.2, 3.3], [], [4.4, 5.5]], highlevel=False)
>>> two = ak.from_iter([[6.6], [], [], [], [7.7, 8.8, 9.9]], highlevel=False)
>>> array = ak.partition.IrregularlyPartitionedArray([one, two])
>>> array.layout
AttributeError: no field named 'layout'
According to the docs, I would expect this to work as
Replies: 1 comment 1 reply
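Both of these are problems with mixing high-level and low-level arrays.

In the first example, array is a high-level array whose layout is partitioned. The fact that array is partitioned is not visible from the array level; you'd only know it if you delved into the layout (or used ak.partitions to get the length of each partition). To access the actual partitions, you could do array.layout.partitions, but this would be a low-level view.

In the second example, you created a low-level IrregularlyPartitionedArray, which has no layout because it is a layout. If wrapped in an ak.Array constructor, it would behave like an unpartitioned high-level array.

You might be trying to iterate over partitions. There isn't actually a function for that (other than using the numbers from ak.partitions to make slices in a loop):

>>> import numpy as np
>>> array = ak.repartition(np.arange(100), 10)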
>>> # high-level array
>>> array
<Array [0, 1, 2, 3, 4, ... 95, 96, 97, 98, 99] type='100 * int64'>
>>> # low-level partitions
>>> array.layout.partitions
[
<NumpyArray format="l" shape="10" data="0 1 2 3 4 5 6 7 8 9" at="0x562fa91d9a70"/>,
<NumpyArray format="l" shape="10" data="10 11 12 13 14 15 16 17 18 19" at="0x562fa91d9a70"/>,
<NumpyArray format="l" shape="10" data="20 21 22 23 24 25 26 27 28 29" at="0x562fa91d9a70"/>,
<NumpyArray format="l" shape="10" data="30 31 32 33 34 35 36 37 38 39" at="0x562fa91d9a70"/>,
<NumpyArray format="l" shape="10" data="40 41 42 43 44 45 46 47 48 49" at="0x562fa91d9a70"/>,
<NumpyArray format="l" shape="10" data="50 51 52 53 54 55 56 57 58 59" at="0x562fa91d9a70"/>,
<NumpyArray format="l" shape="10" data="60 61 62 63 64 65 66 67 68 69" at="0x562fa91d9a70"/>,
<NumpyArray format="l" shape="10" data="70 71 72 73 74 75 76 77 78 79" at="0x562fa91d9a70"/>,
<NumpyArray format="l" shape="10" data="80 81 82 83 84 85 86 87 88 89" at="0x562fa91d9a70"/>,
<NumpyArray format="l" shape="10" data="90 91 92 93 94 95 96 97 98 99" at="0x562fa91d9a70"/>
]
>>> # number of entries in each partition
>>> ak.partitions(array)
[10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
>>> # cumbersome way to iterate over partitions
>>> start = 0
>>> for count in ak.partitions(array):
... stop = start + count
... print(repr(array[start:stop]))
... start = stop
...
<Array [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] type='10 * int64'>
<Array [10, 11, 12, 13, 14, ... 16, 17, 18, 19] type='10 * int64'>
<Array [20, 21, 22, 23, 24, ... 26, 27, 28, 29] type='10 * int64'>
<Array [30, 31, 32, 33, 34, ... 36, 37, 38, 39] type='10 * int64'>
<Array [40, 41, 42, 43, 44, ... 46, 47, 48, 49] type='10 * int64'>
<Array [50, 51, 52, 53, 54, ... 56, 57, 58, 59] type='10 * int64'>
<Array [60, 61, 62, 63, 64, ... 66, 67, 68, 69] type='10 * int64'>
<Array [70, 71, 72, 73, 74, ... 76, 77, 78, 79] type='10 * int64'>
<Array [80, 81, 82, 83, 84, ... 86, 87, 88, 89] type='10 * int64'>
<Array [90, 91, 92, 93, 94, ... 96, 97, 98, 99] type='10 * int64'>

The biggest difference that partitions make is that every Awkward operation applies separately to each partition, returning a new partitioned array. The statement in the documentation is that these are not interface-visible differences (in the high-level view), but they can be performance differences.
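If the cumbersome loop above comes up often, it could be wrapped in a small generator. This is only a sketch: iter_partition_slices and iter_partitions are hypothetical helper names, not part of Awkward Array, and the example runs on a plain Python list so that it does not depend on any particular Awkward version. With Awkward 1.x, the assumption is that you would pass the high-level array together with the counts from ak.partitions(array).

```python
from itertools import accumulate


def iter_partition_slices(counts):
    """Yield (start, stop) pairs for consecutive partitions of the given lengths.

    Hypothetical helper, not an Awkward Array function.
    """
    stops = list(accumulate(counts))   # running totals: end of each partition
    starts = [0] + stops[:-1]          # each partition starts where the last one ended
    yield from zip(starts, stops)


def iter_partitions(array, counts):
    """Yield array[start:stop] for each partition (hypothetical helper).

    `array` can be anything that supports slicing; `counts` is the list of
    partition lengths, e.g. what ak.partitions(array) returns in Awkward 1.x.
    """
    for start, stop in iter_partition_slices(counts):
        yield array[start:stop]


# Stand-in data so the sketch runs without Awkward installed.
data = list(range(10))
chunks = list(iter_partitions(data, [2, 2, 2, 2, 2]))
print(chunks)  # [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
```

Slicing the high-level array keeps everything in the high-level view, which is the same thing the explicit start/stop loop does; going through array.layout.partitions would hand back low-level layouts instead.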