Commit 80e4c87

committed
doc update
1 parent 6cc1144 commit 80e4c87

3 files changed: +41 -3 lines changed

doc/api.rst

+2
@@ -109,6 +109,7 @@ Computation
 Dataset.apply
 Dataset.reduce
 Dataset.groupby
+Dataset.groupby_bins
 Dataset.resample
 Dataset.diff
 
@@ -245,6 +246,7 @@ Computation
 
 DataArray.reduce
 DataArray.groupby
+DataArray.groupby_bins
 DataArray.rolling
 DataArray.resample
 DataArray.get_axis_num

doc/groupby.rst

+35
@@ -64,6 +64,33 @@ You can also iterate over over groups in ``(label, group)`` pairs:
 Just like in pandas, creating a GroupBy object is cheap: it does not actually
 split the data until you access particular values.
 
+Binning
+~~~~~~~
+
+Sometimes you don't want to use all the unique values to determine the groups
+but instead want to "bin" the data into coarser groups. You could always create
+a customized coordinate, but xarray facilitates this via the
+:py:meth:`~xarray.Dataset.groupby_bins` method.
+
+.. ipython:: python
+
+    x_bins = [0,25,50]
+    ds.groupby_bins('x', x_bins).groups
+
+The binning is implemented via `pandas.cut`__, whose documentation details how
+the bins are assigned. As seen in the example above, by default, the bins are
+labeled with strings using set notation to precisely identify the bin limits. To
+override this behavior, you can specify the bin labels explicitly. Here we
+choose `float` labels which identify the bin centers:
+
+.. ipython:: python
+
+    x_bin_labels = [12.5,37.5]
+    ds.groupby_bins('x', x_bins, labels=x_bin_labels).groups
+
+__ http://pandas.pydata.org/pandas-docs/version/0.17.1/generated/pandas.cut.html
+
+
 Apply
 ~~~~~
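The binning paragraph added in this hunk delegates to `pandas.cut`, so a rough sketch of that call on its own may help when reviewing the documented behavior. The sample values in `x` are made up for illustration and are not the `ds` dataset from the docs. Note that recent pandas releases label the default bins with `Interval` objects rather than strings.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values; the docs' `ds` dataset is not reproduced here.
x = np.array([5, 10, 25, 30, 50])
x_bins = [0, 25, 50]

# Default labels: one category per bin, (0, 25] and (25, 50].
default_bins = pd.cut(x, x_bins)

# Explicit labels: each bin is named by its center instead.
centered_bins = pd.cut(x, x_bins, labels=[12.5, 37.5])

print(default_bins.codes.tolist())
print(list(centered_bins))
```

With the default `right=True`, the value 25 falls in the first bin, which is why it is labeled 12.5 rather than 37.5.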

@@ -170,3 +197,11 @@ __ http://cfconventions.org/cf-conventions/v1.6.0/cf-conventions.html#_two_dimen
     da
     da.groupby('lon').sum()
     da.groupby('lon').apply(lambda x: x - x.mean(), shortcut=False)
+
+Because multidimensional groups have the ability to generate a very large
+number of bins, coarse-binning via :py:meth:`~xarray.Dataset.groupby_bins`
+may be desirable:
+
+.. ipython:: python
+
+    da.groupby_bins('lon', [0,45,50]).sum()
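
The `da` used in this hunk is defined earlier in `doc/groupby.rst` and is not visible in the diff. A self-contained sketch of the same call, assuming a small array with a two-dimensional `lon` coordinate (the values and the `ny`/`nx` dimension names below are assumptions for illustration), could look like this:

```python
import xarray as xr

# Assumed stand-in for the docs' `da`: a 2x2 array whose `lon` and `lat`
# coordinates are themselves two-dimensional.
da = xr.DataArray(
    [[0, 1], [2, 3]],
    coords={
        "lon": (["ny", "nx"], [[30, 40], [40, 50]]),
        "lat": (["ny", "nx"], [[10, 10], [20, 20]]),
    },
    dims=["ny", "nx"],
)

# Two coarse bins, (0, 45] and (45, 50], instead of one group per unique lon.
binned = da.groupby_bins("lon", [0, 45, 50]).sum()
print(binned.values.tolist())
```

Grouping on `lon` directly would produce three groups here (30, 40, 50); the two bin edges reduce that to two.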

xarray/core/common.py

+4-3
@@ -345,8 +345,9 @@ def groupby(self, group, squeeze=True):
 
     def groupby_bins(self, group, bins, right=True, labels=None, precision=3,
                      include_lowest=False, squeeze=True):
-        """Returns a GroupBy object for performing grouped operations. Rather
-        than using all unique values of `group`, the values are discretized
+        """Returns a GroupBy object for performing grouped operations.
+
+        Rather than using all unique values of `group`, the values are discretized
         first by applying `pandas.cut` [1]_ to `group`.
 
         Parameters
@@ -361,7 +362,7 @@ def groupby_bins(self, group, bins, right=True, labels=None, precision=3,
             sequence it defines the bin edges allowing for non-uniform bin
             width. No extension of the range of x is done in this case.
         right : boolean, optional
-            I ndicates whether the bins include the rightmost edge or not. If
+            Indicates whether the bins include the rightmost edge or not. If
             right == True (the default), then the bins [1,2,3,4] indicate
             (1,2], (2,3], (3,4].
         labels : array or boolean, default None
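
The `right` and `include_lowest` parameters in this docstring are passed through to `pandas.cut`, so their effect can be checked against pandas directly. The following is a minimal sketch using the docstring's own edges `[1,2,3,4]`; the variable names are illustrative only.

```python
import pandas as pd

values = [1, 2, 3, 4]
edges = [1, 2, 3, 4]

# right=True (default): bins are (1, 2], (2, 3], (3, 4]; the value 1 is unbinned.
right_closed = pd.cut(values, edges, right=True)

# right=False: bins are [1, 2), [2, 3), [3, 4); the value 4 is unbinned.
left_closed = pd.cut(values, edges, right=False)

# include_lowest=True closes the first bin on the left so that 1 is kept.
with_lowest = pd.cut(values, edges, right=True, include_lowest=True)

# A code of -1 marks a value that fell outside every bin (NaN in the result).
print(right_closed.codes.tolist())
print(left_closed.codes.tolist())
print(with_lowest.codes.tolist())
```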
