diff --git a/doc/duckarrays.rst b/doc/duckarrays.rst
new file mode 100644
index 00000000000..ba13d5160ae
--- /dev/null
+++ b/doc/duckarrays.rst
@@ -0,0 +1,65 @@
+.. currentmodule:: xarray
+
+Working with numpy-like arrays
+==============================
+
+.. warning::
+
+    This feature should be considered experimental. Please report any bug you may find on
+    xarray’s GitHub repository.
+
+Numpy-like arrays (:term:`duck array`) extend the :py:class:`numpy.ndarray` with
+additional features, like propagating physical units or a different layout in memory.
+
+:py:class:`DataArray` and :py:class:`Dataset` objects can wrap these duck arrays, as
+long as they satisfy certain conditions (see :ref:`internals.duck_arrays`).
+
+.. note::
+
+    For ``dask`` support see :ref:`dask`.
+
+
+Missing features
+----------------
+Most of the API does support :term:`duck array` objects, but there are a few areas where
+the code will still cast to ``numpy`` arrays:
+
+- dimension coordinates, and thus all indexing operations:
+
+  * :py:meth:`Dataset.sel` and :py:meth:`DataArray.sel`
+  * :py:meth:`Dataset.loc` and :py:meth:`DataArray.loc`
+  * :py:meth:`Dataset.drop_sel` and :py:meth:`DataArray.drop_sel`
+  * :py:meth:`Dataset.reindex`, :py:meth:`Dataset.reindex_like`,
+    :py:meth:`DataArray.reindex` and :py:meth:`DataArray.reindex_like`: duck arrays in
+    data variables and non-dimension coordinates won't be cast
+
+- functions and methods that depend on external libraries or features of ``numpy`` not
+  covered by ``__array_function__`` / ``__array_ufunc__``:
+
+  * :py:meth:`Dataset.ffill` and :py:meth:`DataArray.ffill` (uses ``bottleneck``)
+  * :py:meth:`Dataset.bfill` and :py:meth:`DataArray.bfill` (uses ``bottleneck``)
+  * :py:meth:`Dataset.interp`, :py:meth:`Dataset.interp_like`,
+    :py:meth:`DataArray.interp` and :py:meth:`DataArray.interp_like` (uses ``scipy``):
+    duck arrays in data variables and non-dimension coordinates will be cast in
+    addition to not supporting duck arrays in dimension coordinates
+  * :py:meth:`Dataset.rolling_exp` and :py:meth:`DataArray.rolling_exp` (uses
+    ``numbagg``)
+  * :py:meth:`Dataset.rolling` and :py:meth:`DataArray.rolling` (uses internal functions
+    of ``numpy``)
+  * :py:meth:`Dataset.interpolate_na` and :py:meth:`DataArray.interpolate_na` (uses
+    :py:class:`numpy.vectorize`)
+  * :py:func:`apply_ufunc` with ``vectorize=True`` (uses :py:class:`numpy.vectorize`)
+
+- incompatibilities between different :term:`duck array` libraries:
+
+  * :py:meth:`Dataset.chunk` and :py:meth:`DataArray.chunk`: this fails if the data was
+    not already chunked and the :term:`duck array` (e.g. a ``pint`` quantity) should
+    wrap the new ``dask`` array; changing the chunk sizes works.
+
+
+Extensions using duck arrays
+----------------------------
+Here's a list of libraries extending ``xarray`` to make working with wrapped duck arrays
+easier:
+
+- `pint-xarray `_
diff --git a/doc/index.rst b/doc/index.rst
index e3cbb331285..ee44d0ad4d9 100644
--- a/doc/index.rst
+++ b/doc/index.rst
@@ -60,6 +60,7 @@ Documentation
 * :doc:`io`
 * :doc:`dask`
 * :doc:`plotting`
+* :doc:`duckarrays`
 
 .. toctree::
    :maxdepth: 1
@@ -80,6 +81,7 @@ Documentation
    io
    dask
    plotting
+   duckarrays
 
 **Help & reference**
 
diff --git a/doc/internals.rst b/doc/internals.rst
index aa9e1dedc68..b1678f00bdd 100644
--- a/doc/internals.rst
+++ b/doc/internals.rst
@@ -42,21 +42,24 @@ xarray objects via the (readonly) :py:attr:`Dataset.variables
 ` and :py:attr:`DataArray.variable
 ` attributes.
 
-Duck arrays
------------
+
+.. _internals.duck_arrays:
+
+Integrating with duck arrays
+----------------------------
 
 .. warning::
 
     This is a experimental feature.
 
-xarray can wrap custom `duck array`_ objects as long as they define numpy's
+xarray can wrap custom :term:`duck array` objects as long as they define numpy's
 ``shape``, ``dtype`` and ``ndim`` properties and the ``__array__``,
 ``__array_ufunc__`` and ``__array_function__`` methods.
 
 In certain situations (e.g. when printing the collapsed preview of
-variables of a ``Dataset``), xarray will display the repr of a `duck array`_
+variables of a ``Dataset``), xarray will display the repr of a :term:`duck array`
 in a single line, truncating it to a certain number of characters. If that
-would drop too much information, the `duck array`_ may define a
+would drop too much information, the :term:`duck array` may define a
 ``_repr_inline_`` method that takes ``max_width`` (number of characters) as an
 argument:
 
@@ -71,8 +74,6 @@ argument:
 
         ...
 
-.. _duck array: https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html
-
 Extending xarray
 ----------------
 
diff --git a/doc/terminology.rst b/doc/terminology.rst
index a85837bafbc..3cfc211593f 100644
--- a/doc/terminology.rst
+++ b/doc/terminology.rst
@@ -104,3 +104,11 @@ complete examples, please consult the relevant documentation.*
         one, it has 0 dimensions. That means that, e.g., :py:class:`int`,
         :py:class:`float`, and :py:class:`str` objects are "scalar" while
         :py:class:`list` or :py:class:`tuple` are not.
+
+    duck array
+        `Duck arrays`__ are array implementations that behave
+        like numpy arrays. They have to define the ``shape``, ``dtype`` and
+        ``ndim`` properties. For integration with ``xarray``, the ``__array__``,
+        ``__array_ufunc__`` and ``__array_function__`` protocols are also required.
+
+        __ https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html
diff --git a/doc/whats-new.rst b/doc/whats-new.rst
index 864d57f0e04..e28f66b6afd 100644
--- a/doc/whats-new.rst
+++ b/doc/whats-new.rst
@@ -61,6 +61,8 @@ Bug fixes
 Documentation
 ~~~~~~~~~~~~~
 
+- document the API not supported with duck arrays (:pull:`4530`).
+  By `Justus Magin `_.
 - Update the docstring of :py:class:`DataArray` and :py:class:`Dataset`.
   (:pull:`4532`);