Unify regridders #4754

Open
@stephenworsley

📰 Custom Issue

Iris has a collection of regridders that each work slightly differently, but are similar enough that they could benefit from unifying some of their behaviour, perhaps by having them inherit from the same abstract class. This would also benefit regridders built outside of iris, such as those in iris-esmf-regrid. The current set of iris regridding schemes is:

  • Linear
  • Nearest
  • AreaWeighted
  • UnstructuredNearest
  • PointInCell

There is also the deprecated regridder in iris.experimental.regrid_conservative which is functionally replaced by iris-esmf-regrid.


The current set of regridders has the following quirks:

Linear and Nearest

  • The Linear and Nearest regridding schemes create a RectilinearRegridder (from analysis._regrid). This uses _RegularGridInterpolator (from analysis._scipy_interpolate) to calculate the weights and then apply them. _RegularGridInterpolator derives from scipy and is under a scipy copyright, though the code has developed significantly since it was copied from scipy.
  • Source and target grid must be lat/lon DimCoords.
  • Weights are not cached on RectilinearRegridder, though _RegularGridInterpolator does offer weight caching.
  • Weights matrix is a sparse csr_matrix for Linear. For Nearest, the weights take the form of an index rather than a matrix.
  • Loops through 2D slices for calculations (rather than using transposition and reshaping like in iris-esmf-regrid).
  • Has interpolator method.
  • Has a powerful _create_cube method which handles derived coordinates. It even handles derived coordinates covering the grid dimensions, via a regridder function callback.
  • Lazy regridding is supported via map_complete_blocks.
  • Additional keywords: extrapolation_mode
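
The difference between looping over 2D slices and the reshaping approach can be sketched with NumPy. This is only an illustration, not code from iris or iris-esmf-regrid: the function names are hypothetical, a dense array stands in for the sparse weights matrix, and the grid dimensions are assumed to be the last two (in general they would first be moved there by transposition).

```python
import numpy as np


def regrid_by_slices(data, weights, tgt_shape):
    """Apply the weights to each trailing 2D (y, x) slice in a Python loop."""
    lead_shape = data.shape[:-2]
    out = np.empty(lead_shape + tgt_shape)
    for idx in np.ndindex(*lead_shape):
        out[idx] = (weights @ data[idx].ravel()).reshape(tgt_shape)
    return out


def regrid_by_reshape(data, weights, tgt_shape):
    """Flatten the grid dimensions and apply the weights in a single product."""
    lead_shape = data.shape[:-2]
    flat = data.reshape(-1, data.shape[-2] * data.shape[-1])  # (n_slices, n_src)
    result = flat @ weights.T  # (n_slices, n_tgt)
    return result.reshape(lead_shape + tgt_shape)


rng = np.random.default_rng(0)
data = rng.random((3, 4, 5, 6))       # (time, level, y, x)
weights = rng.random((2 * 3, 5 * 6))  # maps a 5x6 source grid to a 2x3 target grid

# Both approaches give the same answer; the second avoids the Python loop.
assert np.allclose(
    regrid_by_slices(data, weights, (2, 3)),
    regrid_by_reshape(data, weights, (2, 3)),
)
```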

AreaWeighted

  • Creates AreaWeightedRegridder (from analysis._area_weighted).
  • Source and target grid must be lat/lon DimCoords.
  • Weights are cached on __init__.
  • Uses a Python for loop for calculations rather than a sparse matrix.
  • Uses RectilinearRegridder._create_cube with a callback equivalent to Linear regridding. This means that the AuxCoords are regridded with a different method from the data.
  • The equivalent regridder function, regrid_area_weighted_rectilinear_src_and_grid, handles scalar grid coords, though the regridder class itself does not.
  • Requires special handling of masked data in calculations.
  • Lazy regridding is supported via map_complete_blocks.
  • Additional keywords: mdtol
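
As a rough illustration of why masked data needs special handling, the following sketch computes a weighted mean for a single target cell and masks the result when too much of the contributing source data is masked. It is not the iris implementation: the helper name weighted_mean_with_mdtol is hypothetical, and mdtol is assumed here to be the largest tolerated weighted fraction of masked source data.

```python
import numpy as np


def weighted_mean_with_mdtol(src, weights, mdtol=1.0):
    """Weighted mean of the source cells overlapping one target cell.

    Hypothetical helper: the result is masked when the weighted fraction
    of masked source data exceeds ``mdtol``, or when no data is valid.
    """
    src = np.ma.asarray(src)
    weights = np.asarray(weights, dtype=float)
    mask = np.ma.getmaskarray(src)
    valid_weight = weights[~mask].sum()
    masked_fraction = 1.0 - valid_weight / weights.sum()
    if valid_weight == 0 or masked_fraction > mdtol:
        return np.ma.masked
    # Renormalise by the weight of the unmasked cells only.
    return float((src.filled(0.0) * weights)[~mask].sum() / valid_weight)


cell_values = np.ma.masked_array([1.0, 3.0, 5.0, 7.0], mask=[False, False, True, False])
overlap_areas = [0.25, 0.25, 0.25, 0.25]

strict = weighted_mean_with_mdtol(cell_values, overlap_areas, mdtol=0.1)
tolerant = weighted_mean_with_mdtol(cell_values, overlap_areas, mdtol=0.5)
```

With a quarter of the contributing area masked, the strict call returns a masked value while the tolerant call averages the three valid cells.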

UnstructuredNearest

  • Creates UnstructuredNearestNeighbourRegridder (from analysis.trajectory) which uses analysis.trajectory.interpolate for calculations.
  • Target grid must be lat/lon DimCoords; source grid must be lat/lon AuxCoords of any dimensionality which map over the same cube dimensions.
  • Weights are not cached.
  • In the current calculation, "weights" are fancy indices.
  • Masks are not currently handled correctly (see #4463, "Regridding with iris.analysis.UnstructuredNearest does not preserve dtype and masks"), but the handling could be made similar to Nearest.
  • Cube creation handled mostly by trajectory.interpolate rather than RectilinearRegridder._create_cube. This handles derived coords but not those which cover the grid dimensions, so a callback is not used.
  • Lazy regridding not supported.

PointInCell

  • Creates CurvilinearRegridder (from analysis._regrid)
  • Target grid must be lat/lon DimCoords; source grid must be 2D lat/lon AuxCoords which map over the same cube dimensions.
  • Unlike other regridders, weights are not cached on __init__; caching happens within __call__.
  • Weights matrix is a sparse csc_matrix (rather than the csr_matrix used in Linear).
  • Loops through 2D slices for calculations (rather than using transposition and reshaping like in iris-esmf-regrid).
  • Cube creation happens during perform, inside the loop over 2D slices. The resulting cube is the result of a merge. Derived coordinates are not handled.
  • Lazy regridding not supported.
  • Additional keywords: weights
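
The two caching behaviours described in these lists can be contrasted with a small sketch. The class names and the compute_weights argument are hypothetical and only illustrate the pattern; they are not taken from iris.

```python
class EagerRegridder:
    """Calculates and caches the weights when the regridder is constructed."""

    def __init__(self, src_grid, tgt_grid, compute_weights):
        self._weights = compute_weights(src_grid, tgt_grid)

    def __call__(self, data):
        # Placeholder for applying the cached weights to the data.
        return self._weights, data


class LazyRegridder:
    """Defers the weights calculation to the first call, then reuses it."""

    def __init__(self, src_grid, tgt_grid, compute_weights):
        self._grids = (src_grid, tgt_grid)
        self._compute_weights = compute_weights
        self._weights = None

    def __call__(self, data):
        if self._weights is None:
            self._weights = self._compute_weights(*self._grids)
        return self._weights, data


calls = []


def compute_weights(src_grid, tgt_grid):
    """Stand-in for an expensive calculation; records each invocation."""
    calls.append((src_grid, tgt_grid))
    return "weights"


eager = EagerRegridder("src", "tgt", compute_weights)  # weights computed here
lazy = LazyRegridder("src", "tgt", compute_weights)    # nothing computed yet
n_after_init = len(calls)
lazy("a")
lazy("b")
n_after_calls = len(calls)
```

Caching on __init__ makes the cost of constructing a regridder predictable, while caching on __call__ means any problem with the grids only surfaces when the regridder is first used.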

Suggestions

Currently, the most sophisticated regridder for metadata handling is RectilinearRegridder. The problem seems to be that other regridders have slightly different use cases so they can't fully take advantage of the infrastructure in RectilinearRegridder. These regridders could be improved by having them derive from a regridder class which is more generic than RectilinearRegridder. This could also help the creation of future regridders (like in iris-esmf-regrid). Some ideas for how this regridder might be structured:

_AbstractRegridder

  • Is an abstract class.
  • Source/target grid Coords can be 1D lat/lon DimCoords over two separate dimensions or lat/lon AuxCoords of any dimensionality (likely just 1D or 2D) over common dimensions.
  • Additional kwargs are stored and passed into relevant methods.
  • Methods for extracting (and checking validity of) grid Coords implemented for DimCoords and AuxCoords with a view that this could be extended for Mesh in a subclass.
  • Store source and target grid.
  • Cache weights on __init__.
  • Methods for checking compatibility of cubes in call and fetching grid dimensions are implemented for DimCoords and AuxCoords with a view that this could be extended for Mesh in a subclass.
  • Methods for calculating and applying weights are probably not implemented here, though the application of weights is likely to be common for some regridders so it may be appropriate to have helper functions for common calculations.
  • More generic version of RectilinearRegridder._create_cube.
  • The new _create_cube method should call on methods for adding the grid Coords with a view that this could be extended for Mesh in a subclass.

Rough proposed common structure of regridders:

The following is copied from a comment in #4807, where it demonstrates a structure to aim for that would make it possible to derive from a common class. The aim here is to bring all regridders closer to this structure, improving their functionality along the way, and then refactor this structure in terms of class inheritance when that becomes possible.

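import functools

def regrid(data, dims, regrid_info):
  # Apply the precomputed regridding information to an array over `dims`.
  ...
  return new_data

def create_cube(data, src, src_dims, tgt_coords, callback):
  # Build the result cube; `callback` regrids coords covering the grid dims.
  ...
  return result_cube

def _prepare(src, tgt):
  # Calculate everything which depends only on the source and target grids.
  ...
  return regrid_info

def _perform(cube, regrid_info):
  # `tgt_coords` is assumed to be recovered in the elided step below.
  ...
  dims = get_dims(cube)
  data = regrid(cube.data, dims, regrid_info)
  callback = functools.partial(regrid, regrid_info=regrid_info)
  return create_cube(data, cube, dims, tgt_coords, callback)

class Regridder:
  def __init__(self, src, tgt):
    ...
    self._regrid_info = _prepare(src, tgt)
  def __call__(self, src):
    ...
    return _perform(src, self._regrid_info)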
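
To make the role of the callback more concrete, here is a small NumPy sketch of a regrid function with the same (data, dims, regrid_info) signature. It is an assumption-laden toy rather than the proposed implementation: regrid_info is taken to be a (dense weights matrix, target shape) pair, and plain arrays stand in for cube data and for a coordinate spanning the grid dimensions.

```python
import functools

import numpy as np


def regrid(data, dims, regrid_info):
    """Regrid ``data`` over the two axes in ``dims`` using a weights matrix."""
    weights, tgt_shape = regrid_info
    # Move the grid axes to the end, flatten them, and apply the weights.
    moved = np.moveaxis(data, dims, (-2, -1))
    lead_shape = moved.shape[:-2]
    flat = moved.reshape(lead_shape + (-1,))
    result = (flat @ weights.T).reshape(lead_shape + tgt_shape)
    # Put the regridded axes back where the source grid axes were.
    return np.moveaxis(result, (-2, -1), dims)


# Average a 2x2 source grid into a single target cell.
regrid_info = (np.full((1, 4), 0.25), (1, 1))
callback = functools.partial(regrid, regrid_info=regrid_info)

data = np.arange(12.0).reshape(3, 2, 2)       # (time, y, x), grid dims (1, 2)
aux = np.array([[10.0, 20.0], [30.0, 40.0]])  # (y, x), grid dims (0, 1)

new_data = regrid(data, (1, 2), regrid_info)
new_aux = callback(aux, (0, 1))
```

Because regrid_info is bound by functools.partial, the same regridding can be reused inside cube creation for any array that spans the grid dimensions, whatever its other dimensions are.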

Sub-tasks/Side-tasks

Aside from the larger task of refactoring all the regridders to derive from an abstract class, there are several smaller tasks which could help to unify the behaviour of regridders. Bear in mind some of these tasks may have redundancy with the larger refactoring task.
