Commit a219215

document the duck array integration status (#4530)
* document the missing features of duck array integration
* add a list of extension libraries
* some rewording
* include in the toctree
* rename the label
* change the heading
* properly reference numpy.vectorize
* rewrite a few headings
* add apply_ufunc with vectorize=True to the unsupported features
* update whats-new.rst
* move the definition of a duck array to the terminology page
* use a less technical heading and rewrite the introduction
* fix a broken link
* reword the warning
* mention that dask is handled differently
* also note that chunk does not work with some duck arrays, i.e. those which, like pint, are higher in the type hierarchy than dask
* add pint as an example for duck arrays for which chunk fails
* rename a link label
* remove the indirection
* use the double underscore syntax instead
1 parent 9c02c61 commit a219215

5 files changed: +85 -7 lines changed

doc/duckarrays.rst

+65
@@ -0,0 +1,65 @@
+.. currentmodule:: xarray
+
+Working with numpy-like arrays
+==============================
+
+.. warning::
+
+   This feature should be considered experimental. Please report any bug you may find on
+   xarray’s github repository.
+
+Numpy-like arrays (:term:`duck array`) extend the :py:class:`numpy.ndarray` with
+additional features, like propagating physical units or a different layout in memory.
+
+:py:class:`DataArray` and :py:class:`Dataset` objects can wrap these duck arrays, as
+long as they satisfy certain conditions (see :ref:`internals.duck_arrays`).
+
+.. note::
+
+   For ``dask`` support see :ref:`dask`.
+
+
+Missing features
+----------------
+Most of the API does support :term:`duck array` objects, but there are a few areas where
+the code will still cast to ``numpy`` arrays:
+
+- dimension coordinates, and thus all indexing operations:
+
+  * :py:meth:`Dataset.sel` and :py:meth:`DataArray.sel`
+  * :py:meth:`Dataset.loc` and :py:meth:`DataArray.loc`
+  * :py:meth:`Dataset.drop_sel` and :py:meth:`DataArray.drop_sel`
+  * :py:meth:`Dataset.reindex`, :py:meth:`Dataset.reindex_like`,
+    :py:meth:`DataArray.reindex` and :py:meth:`DataArray.reindex_like`: duck arrays in
+    data variables and non-dimension coordinates won't be cast
+
+- functions and methods that depend on external libraries or features of ``numpy`` not
+  covered by ``__array_function__`` / ``__array_ufunc__``:
+
+  * :py:meth:`Dataset.ffill` and :py:meth:`DataArray.ffill` (uses ``bottleneck``)
+  * :py:meth:`Dataset.bfill` and :py:meth:`DataArray.bfill` (uses ``bottleneck``)
+  * :py:meth:`Dataset.interp`, :py:meth:`Dataset.interp_like`,
+    :py:meth:`DataArray.interp` and :py:meth:`DataArray.interp_like` (uses ``scipy``):
+    duck arrays in data variables and non-dimension coordinates will be cast in
+    addition to not supporting duck arrays in dimension coordinates
+  * :py:meth:`Dataset.rolling_exp` and :py:meth:`DataArray.rolling_exp` (uses
+    ``numbagg``)
+  * :py:meth:`Dataset.rolling` and :py:meth:`DataArray.rolling` (uses internal functions
+    of ``numpy``)
+  * :py:meth:`Dataset.interpolate_na` and :py:meth:`DataArray.interpolate_na` (uses
+    :py:class:`numpy.vectorize`)
+  * :py:func:`apply_ufunc` with ``vectorize=True`` (uses :py:class:`numpy.vectorize`)
+
+- incompatibilities between different :term:`duck array` libraries:
+
+  * :py:meth:`Dataset.chunk` and :py:meth:`DataArray.chunk`: this fails if the data was
+    not already chunked and the :term:`duck array` (e.g. a ``pint`` quantity) should
+    wrap the new ``dask`` array; changing the chunk sizes works.
+
+
+Extensions using duck arrays
+----------------------------
+Here's a list of libraries extending ``xarray`` to make working with wrapped duck arrays
+easier:
+
+- `pint-xarray <https://github.com/xarray-contrib/pint-xarray>`_
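
To illustrate what "cast to ``numpy`` arrays" means for the ``numpy.vectorize``-based entries in the "Missing features" list above, here is a minimal sketch. ``TracingArray`` is a hypothetical wrapper invented for this example; it is not part of xarray or numpy and only implements ``__array__``, so it is far simpler than a real duck array. The sketch only shows that ``numpy.vectorize`` converts its input and hands back a plain ``numpy.ndarray``:

```python
import numpy as np


class TracingArray:
    """Hypothetical wrapper that counts how often it is cast to numpy."""

    def __init__(self, data):
        self._data = np.asarray(data)
        self.casts = 0

    @property
    def shape(self):
        return self._data.shape

    @property
    def dtype(self):
        return self._data.dtype

    @property
    def ndim(self):
        return self._data.ndim

    def __array__(self, dtype=None, copy=None):
        # Called by numpy whenever it needs a real ndarray.
        self.casts += 1
        return self._data if dtype is None else self._data.astype(dtype)


wrapped = TracingArray([1.0, 2.0, 3.0])

# numpy.vectorize converts its arguments before looping over them,
# so the wrapper type is lost in the result.
doubled = np.vectorize(lambda x: 2 * x)(wrapped)

print(type(doubled).__name__, wrapped.casts >= 1)
```

The result is an ordinary ``ndarray``, so anything the wrapper was carrying (units, memory layout, and so on) would be gone at this point.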

doc/index.rst

+2
@@ -60,6 +60,7 @@ Documentation
 * :doc:`io`
 * :doc:`dask`
 * :doc:`plotting`
+* :doc:`duckarrays`

 .. toctree::
    :maxdepth: 1
@@ -80,6 +81,7 @@ Documentation
    io
    dask
    plotting
+   duckarrays

 **Help & reference**

doc/internals.rst

+8-7
@@ -42,21 +42,24 @@ xarray objects via the (readonly) :py:attr:`Dataset.variables
 <xarray.Dataset.variables>` and
 :py:attr:`DataArray.variable <xarray.DataArray.variable>` attributes.

-Duck arrays
------------
+
+.. _internals.duck_arrays:
+
+Integrating with duck arrays
+----------------------------

 .. warning::

    This is an experimental feature.

-xarray can wrap custom `duck array`_ objects as long as they define numpy's
+xarray can wrap custom :term:`duck array` objects as long as they define numpy's
 ``shape``, ``dtype`` and ``ndim`` properties and the ``__array__``,
 ``__array_ufunc__`` and ``__array_function__`` methods.

 In certain situations (e.g. when printing the collapsed preview of
-variables of a ``Dataset``), xarray will display the repr of a `duck array`_
+variables of a ``Dataset``), xarray will display the repr of a :term:`duck array`
 in a single line, truncating it to a certain number of characters. If that
-would drop too much information, the `duck array`_ may define a
+would drop too much information, the :term:`duck array` may define a
 ``_repr_inline_`` method that takes ``max_width`` (number of characters) as an
 argument:

@@ -71,8 +74,6 @@ argument:

     ...

-.. _duck array: https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html
-

 Extending xarray
 ----------------
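
The ``_repr_inline_`` hook described in the hunk above can be sketched in plain Python. The ``Readings`` class, its ``[degC]`` prefix, and the truncation policy are all assumptions made up for this illustration; the diff only states that the method receives ``max_width`` (a number of characters) and should produce a single-line representation. The class deliberately omits the array protocols, so it is not a usable duck array on its own:

```python
class Readings:
    """Hypothetical container used only to illustrate ``_repr_inline_``."""

    def __init__(self, values):
        self.values = list(values)

    def _repr_inline_(self, max_width):
        """Format to a single line with at most max_width characters."""
        text = "[degC] " + " ".join(f"{v:g}" for v in self.values)
        if len(text) <= max_width:
            return text
        if max_width <= 3:
            # No room for an ellipsis; just cut the text.
            return text[:max_width]
        # Keep the unit prefix visible and mark the cut with an ellipsis.
        return text[: max_width - 3] + "..."


short = Readings([1.5, 2, 3])._repr_inline_(max_width=40)
clipped = Readings(range(100))._repr_inline_(max_width=20)
print(short)
print(clipped)
```

Putting the most important information (here, the unit) at the front means it survives truncation, which is the situation the paragraph in the diff is concerned with.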

doc/terminology.rst

+8
@@ -104,3 +104,11 @@ complete examples, please consult the relevant documentation.*
         one, it has 0 dimensions. That means that, e.g., :py:class:`int`,
         :py:class:`float`, and :py:class:`str` objects are "scalar" while
         :py:class:`list` or :py:class:`tuple` are not.
+
+    duck array
+        `Duck arrays`__ are array implementations that behave
+        like numpy arrays. They have to define the ``shape``, ``dtype`` and
+        ``ndim`` properties. For integration with ``xarray``, the ``__array__``,
+        ``__array_ufunc__`` and ``__array_function__`` protocols are also required.
+
+        __ https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html
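
The requirements named in the glossary entry above can be turned into a quick presence check. This is only a sketch: ``missing_duck_array_attributes`` and ``ShapeOnly`` are hypothetical names introduced here, not xarray's own detection logic, and the check only tests that the attributes exist, not that they behave correctly:

```python
REQUIRED_PROPERTIES = ("shape", "dtype", "ndim")
REQUIRED_PROTOCOLS = ("__array__", "__array_ufunc__", "__array_function__")


def missing_duck_array_attributes(obj):
    """Return the names from the glossary entry that ``obj`` does not define."""
    return [
        name
        for name in REQUIRED_PROPERTIES + REQUIRED_PROTOCOLS
        if not hasattr(obj, name)
    ]


class ShapeOnly:
    """Hypothetical incomplete array type: properties but no protocols."""

    shape = (2, 3)
    dtype = "float64"
    ndim = 2


missing = missing_duck_array_attributes(ShapeOnly())
print(missing)
```

An empty list would mean the object defines everything the glossary entry lists; here the three protocol methods are reported as missing.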

doc/whats-new.rst

+2
@@ -89,6 +89,8 @@ Bug fixes

 Documentation
 ~~~~~~~~~~~~~
+- document the API not supported with duck arrays (:pull:`4530`).
+  By `Justus Magin <https://github.com/keewis>`_.

 - Update the docstring of :py:class:`DataArray` and :py:class:`Dataset`.
   (:pull:`4532`);
