Fit bounding box to coarser resolution #2793

Open
@davidbrochart

When using coarsen, we often need to align the original DataArray with the coarser coordinates. For instance:

import xarray as xr
import numpy as np

da = xr.DataArray(np.arange(4*4).reshape(4, 4), coords=[np.arange(4, 0, -1) + 0.5, np.arange(4) + 0.5], dims=['lat', 'lon'])
# <xarray.DataArray (lat: 4, lon: 4)>
# array([[ 0,  1,  2,  3],
#        [ 4,  5,  6,  7],
#        [ 8,  9, 10, 11],
#        [12, 13, 14, 15]])
# Coordinates:
#   * lat      (lat) float64 4.5 3.5 2.5 1.5
#   * lon      (lon) float64 0.5 1.5 2.5 3.5

da.coarsen(lat=2, lon=2).mean()
# <xarray.DataArray (lat: 2, lon: 2)>
# array([[ 2.5,  4.5],
#        [10.5, 12.5]])
# Coordinates:
#   * lat      (lat) float64 4.0 2.0
#   * lon      (lon) float64 1.0 3.0

But if the coarser coordinates are aligned like:

lat: ... 5 3 1 ...
lon: ... 1 3 5 ...

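Then applying coarsen directly will not work (here on the lat dimension): the windows start at the edge of the array (lat 5.0), whereas the coarser cells start at lat 6.0. The following function extends the original DataArray so that its bounding box is aligned with the coarser coordinates: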

def adjust_bbox(da, dims):
    """Adjust the bounding box of a DataArray to a coarser resolution.

    Args:
        da: the DataArray to adjust.
        dims: a dictionary where the keys are the names of the dimensions to adjust,
            and the values are of the form
            [unsigned_coarse_resolution, signed_original_resolution].
    Returns:
        The DataArray with its bounding box adjusted to the coarser resolution.
    """
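    coords = {}
    for k, v in dims.items():
        every, step = v
        offset = step / 2  # signed half cell width
        # Outer edges of the first and last cells along this dimension.
        dim0 = da[k].values[0] - offset
        dim1 = da[k].values[-1] + offset
        # Snap both edges outward to the nearest multiple of `every`.
        if step < 0: # decreasing coordinate
            dim0 = dim0 + (every - dim0 % every) % every
            dim1 = dim1 - dim1 % every
        else: # increasing coordinate
            dim0 = dim0 - dim0 % every
            dim1 = dim1 + (every - dim1 % every) % every
        # Cell centers to add before and after the original coordinates.
        coord0 = np.arange(dim0+offset, da[k].values[0]-offset, step)
        coord1 = da[k].values
        coord2 = np.arange(da[k].values[-1]+step, dim1, step)
        coord = np.hstack((coord0, coord1, coord2))
        coords[k] = coord
    # The added cells are NaN after reindex; fillna(0) zeroes them
    # (and any NaN already present in da).
    return da.reindex(**coords).fillna(0)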

da = adjust_bbox(da, {'lat': (2, -1), 'lon': (2, 1)})
# <xarray.DataArray (lat: 6, lon: 4)>
# array([[ 0.,  0.,  0.,  0.],
#        [ 0.,  1.,  2.,  3.],
#        [ 4.,  5.,  6.,  7.],
#        [ 8.,  9., 10., 11.],
#        [12., 13., 14., 15.],
#        [ 0.,  0.,  0.,  0.]])
# Coordinates:
#   * lat      (lat) float64 5.5 4.5 3.5 2.5 1.5 0.5
#   * lon      (lon) float64 0.5 1.5 2.5 3.5

da.coarsen(lat=2, lon=2).mean()
# <xarray.DataArray (lat: 3, lon: 2)>
# array([[0.25, 1.25],
#        [6.5 , 8.5 ],
#        [6.25, 7.25]])
# Coordinates:
#   * lat      (lat) float64 5.0 3.0 1.0
#   * lon      (lon) float64 1.0 3.0

Now coarsen returns cells centered on the expected coarser coordinates (lat 5, 3, 1). But adjust_bbox is rather complicated and specific to this use case (it assumes evenly spaced coordinate points, for example). Do you know of a better or more general way of doing it?
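
Part of that complexity is the edge snapping done with the modulo expressions in adjust_bbox. As a rough sketch only, the same snapping can be written with floor and ceil, independent of whether the coordinate increases or decreases. The helper name snap_bounds and its signature are hypothetical (they are not part of xarray), and it assumes evenly spaced cell centers and values where floating-point division by `every` is exact enough:

```python
import math

def snap_bounds(first, last, every, step):
    """Snap the outer cell edges of an evenly spaced axis to multiples of `every`.

    Hypothetical helper: `first` and `last` are the first and last cell
    centers, `step` is the signed spacing between them.
    Returns (dim0, dim1) in the same order as the coordinate.
    """
    half = abs(step) / 2
    # Outer edges of the axis, regardless of direction.
    lo = min(first, last) - half
    hi = max(first, last) + half
    # Move the lower edge down and the upper edge up to the coarse grid.
    lo = math.floor(lo / every) * every
    hi = math.ceil(hi / every) * every
    return (hi, lo) if step < 0 else (lo, hi)

print(snap_bounds(4.5, 1.5, every=2, step=-1))  # (6, 0)
print(snap_bounds(0.5, 3.5, every=2, step=1))   # (0, 4)
```

For the example above this reproduces the dim0 and dim1 values that adjust_bbox computes for lat and lon. It does not remove the assumption of evenly spaced coordinates.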
