Description
📰 Custom Issue
There is a collection of regridders in iris which each work slightly differently, but which are similar enough that they could benefit from unifying some of their behaviour, perhaps by inheriting from the same abstract class. This would also benefit regridders built outside of iris, such as those in iris-esmf-regrid. The current set of iris regridding schemes is:
Linear
Nearest
AreaWeighted
UnstructuredNearest
PointInCell
There is also the deprecated regridder in iris.experimental.regrid_conservative which is functionally replaced by iris-esmf-regrid.
The current regridders have the following quirks:
Linear and Nearest
- The `Linear` and `Nearest` regridding schemes create a `RectilinearRegridder` (from analysis._regrid). This uses `_RegularGridInterpolator` (from analysis._scipy_interpolate) to calculate weights and then apply them. `_RegularGridInterpolator` derives from scipy and is under a scipy copyright, though the code has developed significantly since it was copied from scipy.
- Source and target grid must be lat/lon DimCoords.
- Weights are not cached on `RectilinearRegridder`, though `_RegularGridInterpolator` does offer weight caching.
- Weights matrix is a sparse csr_matrix for Linear; for Nearest, weights take the form of an index rather than a matrix.
- Loops through 2D slices for calculations (rather than using transposition and reshaping like in iris-esmf-regrid).
- Has `interpolator` method.
- Has powerful `_create_cube` method which handles derived coordinates. It even handles derived coordinates covering the grid dimension via the use of a regridder function callback.
- Lazy regridding is supported via `map_complete_blocks`.
- Additional keywords: `extrapolation_mode`
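To make the difference between the two kinds of "weights" concrete, here is a minimal pure-Python sketch for a single 1D axis. It is not the iris implementation: the helper names (`linear_weights_csr`, `apply_csr`, `nearest_indices`) are hypothetical, the CSR layout is written out by hand rather than taken from scipy, and extrapolation is not handled.

```python
from bisect import bisect_right


def linear_weights_csr(src_points, tgt_points):
    """Build 1D linear-interpolation weights in CSR form (data, indices, indptr).

    Hypothetical helper: src_points must be strictly increasing and every
    target point must lie within [src_points[0], src_points[-1]].
    """
    data, indices, indptr = [], [], [0]
    for x in tgt_points:
        # Index of the source interval [src[i], src[i + 1]] containing x.
        i = min(bisect_right(src_points, x) - 1, len(src_points) - 2)
        frac = (x - src_points[i]) / (src_points[i + 1] - src_points[i])
        data.extend([1.0 - frac, frac])
        indices.extend([i, i + 1])
        indptr.append(len(data))
    return data, indices, indptr


def apply_csr(weights, values):
    """Multiply the CSR weights matrix by a vector of source values."""
    data, indices, indptr = weights
    return [
        sum(data[k] * values[indices[k]] for k in range(indptr[row], indptr[row + 1]))
        for row in range(len(indptr) - 1)
    ]


def nearest_indices(src_points, tgt_points):
    """Nearest-neighbour "weights": one source index per target point."""
    return [
        min(range(len(src_points)), key=lambda i: abs(src_points[i] - x))
        for x in tgt_points
    ]


src = [0.0, 10.0, 20.0]
tgt = [2.5, 10.0, 17.5]
values = [1.0, 3.0, 7.0]

weights = linear_weights_csr(src, tgt)  # depends only on the grids, so reusable
linear_result = apply_csr(weights, values)

index = nearest_indices(src, tgt)  # an index, not a matrix
nearest_result = [values[i] for i in index]
```

In both cases the expensive part depends only on the two grids, which is why caching it on the regridder is attractive.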
AreaWeighted
- Creates `AreaWeightedRegridder` (from analysis._area_weighted).
- Source and target grid must be lat/lon DimCoords.
- Weights are cached on `__init__`.
- Uses python for loop for calculations rather than a sparse matrix.
- Uses `RectilinearRegridder._create_cube` with the callback being equivalent to Linear. This means that a different regridding method is acting on the AuxCoords and the data.
- The regridder function equivalent `regrid_area_weighted_rectilinear_src_and_grid` handles scalar grid coords, though the regridder class itself does not.
- Requires special handling of masked data in calculations.
- Lazy regridding is supported via `map_complete_blocks`.
- Additional keywords: `mdtol`
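The special handling of masked data can be illustrated with a small 1D sketch. This is an assumption-laden toy rather than the iris code: the helpers `overlap_weights` and `regrid_area_weighted` are hypothetical, masked values are represented by `None`, and the `mdtol` argument only loosely mirrors the keyword of the same name (a target cell is masked when the weighted fraction of missing source data exceeds it).

```python
def overlap_weights(src_bounds, tgt_bounds):
    """Fraction of each target cell covered by each source cell (1D).

    Hypothetical helper: bounds are (lower, upper) pairs in the same units.
    Returns one dict per target cell mapping source index -> weight.
    """
    weights = []
    for t_lo, t_hi in tgt_bounds:
        row = {}
        for j, (s_lo, s_hi) in enumerate(src_bounds):
            overlap = min(t_hi, s_hi) - max(t_lo, s_lo)
            if overlap > 0:
                row[j] = overlap / (t_hi - t_lo)
        weights.append(row)
    return weights


def regrid_area_weighted(values, weights, mdtol=1.0):
    """Apply cached weights; None marks a masked source value.

    A target cell is masked (None) when the weighted fraction of missing
    source data exceeds mdtol, or when no valid data contributes at all.
    """
    result = []
    for row in weights:
        total = sum(row.values())
        valid = {j: w for j, w in row.items() if values[j] is not None}
        valid_total = sum(valid.values())
        missing_fraction = 1.0 - valid_total / total if total else 1.0
        if valid_total == 0 or missing_fraction > mdtol:
            result.append(None)
        else:
            # Renormalise so masked cells do not drag the mean towards zero.
            result.append(sum(w * values[j] for j, w in valid.items()) / valid_total)
    return result


src_bounds = [(0, 1), (1, 2), (2, 3), (3, 4)]
tgt_bounds = [(0, 2), (2, 4)]
values = [1.0, 3.0, None, 5.0]

weights = overlap_weights(src_bounds, tgt_bounds)  # computed once, then reused
tolerant = regrid_area_weighted(values, weights, mdtol=1.0)
strict = regrid_area_weighted(values, weights, mdtol=0.0)
```

With `mdtol=1.0` the second target cell is filled from its one valid source cell; with `mdtol=0.0` the same cell is masked because half of its contributing area is missing.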
UnstructuredNearest
- Creates `UnstructuredNearestNeighbourRegridder` (from analysis.trajectory) which uses `analysis.trajectory.interpolate` for calculations.
- Target grid must be lat/lon DimCoords; source grid must be lat/lon AuxCoords of any dimensionality which map over the same cube dimensions.
- Weights are not cached.
- In the current calculation, "weights" are fancy indices.
- Mask handling is not currently proper (see #4463: Regridding with `iris.analysis.UnstructuredNearest` does not preserve `dtype` and masks), but could be made similar to `Nearest`.
- Cube creation handled mostly by `trajectory.interpolate` rather than `RectilinearRegridder._create_cube`. This handles derived coords but not those which cover the grid dimensions, so a callback is not used.
- Lazy regridding not supported.
PointInCell
- Creates `CurvilinearRegridder` (from analysis._regrid).
- Target grid must be lat/lon DimCoords; source grid must be 2D lat/lon AuxCoords which map over the same cube dimensions.
- Weights aren't cached on `__init__` but are cached on `__call__`.
- Weights matrix is a sparse csc_matrix (rather than the csr_matrix used in Linear).
- Loops through 2D slices for calculations (rather than using transposition and reshaping like in iris-esmf-regrid).
- Cube creation happens during perform inside loop over 2D slices. The resulting cube is a result of merging. Derived coordinates are not handled.
- Unlike other regridders, caching happens within the `__call__` function.
- Lazy regridding not supported.
- Additional keywords: `weights`
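The two caching behaviours described in this issue (on `__init__` versus on `__call__`) can be contrasted with a toy sketch. Everything here is hypothetical, including the class names and the `toy_weights` stand-in for an expensive weights calculation; it only illustrates when the calculation runs.

```python
class InitCachingRegridder:
    """Weights are computed once, when the regridder is built."""

    def __init__(self, src_grid, tgt_grid, compute_weights):
        self.weights = compute_weights(src_grid, tgt_grid)

    def __call__(self, data):
        return [sum(w * data[j] for j, w in row.items()) for row in self.weights]


class CallCachingRegridder:
    """Weights are computed on the first call and reused afterwards."""

    def __init__(self, src_grid, tgt_grid, compute_weights):
        self._grids = (src_grid, tgt_grid)
        self._compute_weights = compute_weights
        self.weights = None  # nothing is calculated until the first call

    def __call__(self, data):
        if self.weights is None:
            self.weights = self._compute_weights(*self._grids)
        return [sum(w * data[j] for j, w in row.items()) for row in self.weights]


calls = []


def toy_weights(src_grid, tgt_grid):
    """Hypothetical stand-in for an expensive weights calculation."""
    calls.append((src_grid, tgt_grid))
    # Each target cell is the mean of two neighbouring source cells.
    return [{0: 0.5, 1: 0.5}, {2: 0.5, 3: 0.5}]


eager = InitCachingRegridder("src", "tgt", toy_weights)
calls_after_eager_init = len(calls)

lazy = CallCachingRegridder("src", "tgt", toy_weights)
calls_after_lazy_init = len(calls)

first = lazy([1.0, 3.0, 5.0, 7.0])
second = lazy([2.0, 2.0, 4.0, 4.0])
calls_after_lazy_calls = len(calls)
```

One practical difference is that the eager version surfaces grid problems when the regridder is built, whereas the lazy version defers both the cost and any errors to the first call.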
Suggestions
Currently, the most sophisticated regridder for metadata handling is `RectilinearRegridder`. The problem seems to be that other regridders have slightly different use cases so they can't fully take advantage of the infrastructure in `RectilinearRegridder`. These regridders could be improved by having them derive from a regridder class which is more generic than `RectilinearRegridder`. This could also help the creation of future regridders (like in iris-esmf-regrid). Some ideas for how this regridder might be structured:
_AbstractRegridder
- Is an abstract class.
- Source/target grid Coords can be 1D lat/lon DimCoords over two separate dimensions or lat/lon AuxCoords of any dimensionality (likely just 1D or 2D) over common dimensions.
- Additional kwargs are stored and passed into relevant methods.
- Methods for extracting (and checking validity of) grid Coords are implemented for `DimCoords` and `AuxCoords` with a view that this could be extended for `Mesh` in a subclass.
- Store source and target grid.
- Cache weights on `__init__`.
- Methods for checking compatibility of cubes in call and fetching grid dimensions are implemented for `DimCoords` and `AuxCoords` with a view that this could be extended for `Mesh` in a subclass.
- Methods for calculating and applying weights are probably not implemented here, though the application of weights is likely to be common for some regridders so it may be appropriate to have helper functions for common calculations.
- More generic version of `RectilinearRegridder._create_cube`.
- The new `_create_cube` method should call on methods for adding the grid Coords with a view that this could be extended for `Mesh` in a subclass.
Rough proposed common structure of regridders:
The following is copied from a comment in #4807, where it demonstrates a structure to aim for which would allow regridders to derive from a common class. The aim here is to bring all regridders closer to this structure, improving their functionality along the way, and then refactor this structure in terms of class inheritance when that becomes possible.
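```python
import functools


def regrid(data, dims, regrid_info):
    # Apply the precomputed regridding information to an array.
    ...
    return new_data


def create_cube(data, src, src_dims, tgt_coords, callback):
    # Build the result cube; the callback regrids coords covering the grid dims.
    ...
    return result_cube


def _prepare(src, tgt):
    # Calculate the information needed to regrid from src to tgt.
    ...
    return regrid_info


def _perform(cube, regrid_info):
    ...  # elided steps; get_dims and tgt_coords are not defined in this outline
    dims = get_dims(cube)
    data = regrid(cube.data, dims, regrid_info)
    callback = functools.partial(regrid, regrid_info=regrid_info)
    return create_cube(data, cube, dims, tgt_coords, callback)


class Regridder:
    def __init__(self, src, tgt):
        ...
        self._regrid_info = _prepare(src, tgt)

    def __call__(self, src):
        ...
        return _perform(src, self._regrid_info)
```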
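To show how the pieces of this outline could fit together, here is a runnable toy version. It is a sketch under heavy assumptions: a "cube" is a plain dict with hypothetical keys `x`, `data`, and `aux`, the grid is 1D, the method is nearest neighbour, and the target coordinates are carried inside `regrid_info`, which is one possible choice and not something the outline specifies.

```python
import functools


def regrid(data, dims, regrid_info):
    """Nearest-neighbour regrid of a 1D list along its only dimension."""
    # dims is unused in this 1D toy; it is kept to match the outline.
    return [data[i] for i in regrid_info["indices"]]


def create_cube(data, src, src_dims, tgt_coords, callback):
    """Build the result, regridding any coordinate that covers the grid dim."""
    aux = {name: callback(values, src_dims) for name, values in src["aux"].items()}
    return {"data": data, "x": list(tgt_coords), "aux": aux}


def _prepare(src, tgt):
    """Compute everything that depends only on the two grids."""
    indices = [
        min(range(len(src["x"])), key=lambda i: abs(src["x"][i] - x))
        for x in tgt["x"]
    ]
    return {"indices": indices, "tgt_coords": tgt["x"]}


def _perform(cube, regrid_info):
    dims = (0,)
    data = regrid(cube["data"], dims, regrid_info)
    # The same regridding function is reused for coordinates via a callback.
    callback = functools.partial(regrid, regrid_info=regrid_info)
    return create_cube(data, cube, dims, regrid_info["tgt_coords"], callback)


class Regridder:
    def __init__(self, src, tgt):
        self._regrid_info = _prepare(src, tgt)

    def __call__(self, src):
        return _perform(src, self._regrid_info)


src = {
    "x": [0.0, 10.0, 20.0],
    "data": [1.0, 2.0, 3.0],
    "aux": {"surface_altitude": [100.0, 200.0, 300.0]},
}
tgt = {"x": [1.0, 19.0]}

regridder = Regridder(src, tgt)
result = regridder(src)
```

Because the callback is built from the same `regrid` function and `regrid_info` as the data calculation, the coordinates are regridded with the same method as the data.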
Sub-tasks/Side-tasks
Aside from the larger task of refactoring all the regridders to derive from an abstract class, there are several smaller tasks which could help to unify the behaviour of regridders. Bear in mind some of these tasks may have redundancy with the larger refactoring task.
- Rewrite `AreaWeightedRegridder` to use sparse matrices (similar to iris-esmf-regrid, assuming this improves performance). See #5365: Performance improvements to AreaWeighted with sparse matrices.
- Decide between `csc_matrix` and `csr_matrix` for consistency, comparing performance.
- Rewrite the calculations in `CurvilinearRegridder` and `RectilinearRegridder` to work on whole cubes rather than slices. See #4807: (Regridder unification) improve curvilinear regridding, generalise _create_cube.
- Fix `UnstructuredNearest` mask handling. See #5062: Fix handling of data in "nearest" trajectory interpolate.
- Rewrite the way callbacks work in `_create_cube` so that they can be consistent with the regridder they are being used in. See #4807: (Regridder unification) improve curvilinear regridding, generalise _create_cube.
- Replace `_RegularGridInterpolator` with direct access to weights matrix.
- Make calculation lazy where possible. This may have to be done via a different method for nearest neighbour regridders which seem to work by indexing, so there may be a more appropriate method than `map_complete_blocks`.
- Make the handling of scalar coordinates consistent.