Closed
Description
It is great that xarray supports dimensions without coords, but sometimes I think it would be useful to be able to easily opt into autogenerated coords from 0 to n-1. This can be useful to obtain DataArrays for pointwise indexing:
import xarray as xr
ds = xr.Dataset()
# a list of selected indices for each layer
ds['selected'] = (['layer', 'selected-i'], [
[0, 1, 2],
[1, 5, 3],
])
# normally, concatenation would drop layer data
print(xr.concat(ds['selected'], dim='selected-i'))
# <xarray.DataArray 'selected' (selected-i: 6)>
# array([0, 1, 2, 1, 5, 3])
# Dimensions without coordinates: selected-i
# if you generate coords from 0 to n-1 for layer, however, the resulting DataArray
# contains 'layer' indices for use in pointwise indexing
print(xr.concat(ds
.assign_coords(layer=list(range(ds.sizes['layer'])))
['selected'], dim='selected-i'))
# <xarray.DataArray 'selected' (selected-i: 6)>
# array([0, 1, 2, 1, 5, 3])
# Coordinates:
# layer (selected-i) int64 0 0 0 1 1 1
# Dimensions without coordinates: selected-i
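To make the pointwise-indexing use case concrete, here is a minimal sketch. It builds the two index arrays by hand so that they mirror the printed output above (the 'layer' coordinate and the selected values). The `data` array and its `x` dimension are made up for illustration and are not part of the original example.

```python
import numpy as np
import xarray as xr

# Hypothetical data to index into: 2 layers, 6 positions per layer.
data = xr.DataArray(np.arange(12).reshape(2, 6), dims=['layer', 'x'])

# Indexers that mirror the concatenated result shown above:
# the 'layer' coordinate and the selected values, both along 'selected-i'.
layer_idx = xr.DataArray([0, 0, 0, 1, 1, 1], dims='selected-i')
x_idx = xr.DataArray([0, 1, 2, 1, 5, 3], dims='selected-i')

# Because both indexers share the 'selected-i' dimension, xarray performs
# vectorized (pointwise) indexing: one element per (layer, x) pair.
picked = data.isel(layer=layer_idx, x=x_idx)
print(picked.values)  # [ 0  1  2  7 11  9]
```

Without the 'layer' indices, only `x_idx` would be available and there would be no way to tell which layer each selected position belongs to.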
My issue with the above is that layer=list(range(ds.sizes['layer'])) is verbose and fails to be DRY. My thought for such an API is that xarray could maybe have a special constant for auto-assignment, usable in any method that takes input coords:
print(xr.concat(ds
.assign_coords(layer=xr.AUTO)
['selected'], dim='selected-i'))
(Additionally, perhaps xr.AUTO could be a function/class, so that xr.AUTO(start) produces indices starting at start.)
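Until something like this exists, the behavior can be approximated with a small user-side helper. The sketch below is only an illustration: `assign_auto_coords` and its `start` parameter are hypothetical names, not part of xarray, and xr.AUTO itself remains a proposal.

```python
import numpy as np
import xarray as xr


def assign_auto_coords(obj, *dims, start=0):
    """Hypothetical helper (not an xarray API): give each named dimension
    integer coords running from start to start + n - 1."""
    return obj.assign_coords(
        {dim: np.arange(start, start + obj.sizes[dim]) for dim in dims}
    )


ds = xr.Dataset()
ds['selected'] = (['layer', 'selected-i'], [
    [0, 1, 2],
    [1, 5, 3],
])

# Default: coords 0..n-1, the same as layer=list(range(ds.sizes['layer'])).
with_layer = assign_auto_coords(ds, 'layer')
print(with_layer['layer'].values)  # [0 1]

# With an explicit start, analogous to the proposed xr.AUTO(start).
shifted = assign_auto_coords(ds, 'layer', start=1)
print(shifted['layer'].values)  # [1 2]
```

The result can be passed to xr.concat in place of the explicit assign_coords call, so the dimension name is written once and its size is looked up automatically.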