Skip to content

Do not convert subclasses of ndarray unless required #1118

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from
1 change: 1 addition & 0 deletions ci/requirements-py27-cdat+pynio.yml
Original file line number Diff line number Diff line change
Expand Up @@ -14,3 +14,4 @@ dependencies:
- cyordereddict
- pip:
- coveralls
- quantities
1 change: 1 addition & 0 deletions ci/requirements-py27-netcdf4-dev.yml
Original file line number Diff line number Diff line change
Expand Up @@ -12,4 +12,5 @@ dependencies:
- coveralls
- pytest-cov
- h5netcdf
- quantities
- git+https://github.com/Unidata/netcdf4-python.git
1 change: 1 addition & 0 deletions ci/requirements-py27-pydap.yml
Original file line number Diff line number Diff line change
Expand Up @@ -12,3 +12,4 @@ dependencies:
- coveralls
- pytest-cov
- pydap
- quantities
1 change: 1 addition & 0 deletions ci/requirements-py33.yml
Original file line number Diff line number Diff line change
Expand Up @@ -6,3 +6,4 @@ dependencies:
- pip:
- coveralls
- pytest-cov
- quantities
1 change: 1 addition & 0 deletions ci/requirements-py34.yml
Original file line number Diff line number Diff line change
Expand Up @@ -7,3 +7,4 @@ dependencies:
- pip:
- coveralls
- pytest-cov
- quantities
1 change: 1 addition & 0 deletions ci/requirements-py35-dask-dev.yml
Original file line number Diff line number Diff line change
Expand Up @@ -10,4 +10,5 @@ dependencies:
- pip:
- coveralls
- pytest-cov
- quantities
- git+https://github.com/blaze/dask.git
1 change: 1 addition & 0 deletions ci/requirements-py35-pandas-dev.yml
Original file line number Diff line number Diff line change
Expand Up @@ -11,4 +11,5 @@ dependencies:
- coveralls
- pytest-cov
- dask
- quantities
- git+https://github.com/pydata/pandas.git
1 change: 1 addition & 0 deletions ci/requirements-py35.yml
Original file line number Diff line number Diff line change
Expand Up @@ -15,3 +15,4 @@ dependencies:
- coveralls
- pytest-cov
- h5netcdf
- quantities
15 changes: 14 additions & 1 deletion xarray/core/variable.py
Original file line number Diff line number Diff line change
Expand Up @@ -193,6 +193,19 @@ def _as_array_or_item(data):
data = np.timedelta64(data, 'ns')
return data

def _as_any_array_or_item(data):
"""Return the given values as a numpy array subclass instance, or as an
individual item if it's a 0d datetime64 or timedelta64 array.

The same caveats as for :py:meth:`_as_array_or_item` apply.
"""
data = np.asanyarray(data)
if data.ndim == 0:
if data.dtype.kind == 'M':
data = np.datetime64(data, 'ns')
elif data.dtype.kind == 'm':
data = np.timedelta64(data, 'ns')
return data

class Variable(common.AbstractArray, common.SharedMethodsMixin,
utils.NdimSizeLenMixin):
Expand Down Expand Up @@ -267,7 +280,7 @@ def data(self):
if isinstance(self._data, dask_array_type):
return self._data
else:
return self.values
return _as_any_array_or_item(self._data_cached())

@data.setter
def data(self, data):
Expand Down
106 changes: 106 additions & 0 deletions xarray/test/test_quantities.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,106 @@
""" test_quantities: Test storing and using instances of
:py:class:`quantities.Quantity` inside :py:class`xarray.DataArray`.

It can be considered a stand-in for other :py:class:`numpy.ndarray`
subclasses, particularly other units implementations such as
astropy's. As preservation of subclasses is not a guaranteed feature of
`xarray`, some operations will discard subclasses. This test
also serves as documnetation which operations do preserve subclasses
and which don't.
"""

import numpy as np
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a docstring explaining that this module uses quantities as a concrete example, but really it's just checking for subclass compatibility.

Also, I might call it something more genericl ike test_numpy_subclasses.py.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I was rather thinking of this test-file as specialised towards physical units handling with quantities (i.e. as a regression-test database geared towards #525). I'd rather make a separate test-case for general subclasses, maybe with a dummy subclass.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, test_units_subclass.py also seems reasonable.

import pandas as pd

from xarray import (align, broadcast, Dataset, DataArray, Variable)

from xarray.test import (TestCase, unittest)

try:
import quantities as pq

has_quantities = True
except ImportError:
has_quantities = False


def requires_quantities(test):
return (
test if has_quantities else
unittest.skip('requires python-quantities')(test)
)


@requires_quantities
class TestWithQuantities(TestCase):
def setUp(self):
self.x = np.arange(10) * pq.A
self.y = np.arange(20)
self.xp = np.arange(10) * pq.J
self.v = np.arange(10 * 20).reshape(10, 20) * pq.V
self.da = DataArray(self.v, dims=['x', 'y'],
coords=dict(x=self.x, y=self.y, xp=(['x'], self.xp)))

def assertEqualWUnits(self, a, b):
# DataArray's are supposed to preserve Quantity instances
# but they (currently?) do not expose their behaviour.
# We thus need to extract the contained subarray via .data
if isinstance(a, DataArray):
a = a.data
if isinstance(b, DataArray):
b = b.data
self.assertIsNotNone(getattr(a, 'units', None))
self.assertIsNotNone(getattr(b, 'units', None))
self.assertEqual(a.units, b.units)
np.testing.assert_allclose(a.magnitude, b.magnitude)

def test_units_in_data_and_coords(self):
da = self.da
self.assertEqualWUnits(da.xp.data, self.xp)
self.assertEqualWUnits(da.data, self.v)

def test_arithmetics(self):
x = self.x
y = self.y
v = self.v
da = self.da

f = np.arange(10 * 20).reshape(10, 20) * pq.A
g = DataArray(f, dims=['x', 'y'], coords=dict(x=x, y=y))
self.assertEqualWUnits(da * g, v * f)

# swapped dimension order
f = np.arange(20 * 10).reshape(20, 10) * pq.V
g = DataArray(f, dims=['y', 'x'], coords=dict(x=x, y=y))
self.assertEqualWUnits(da + g, v + f.T)

# broadcasting
f = np.arange(10) * pq.m
g = DataArray(f, dims=['x'], coords=dict(x=x))
self.assertEqualWUnits(da / g, v / f[:,None])

def test_unit_checking(self):
da = self.da
f = np.arange(10 * 20).reshape(10, 20) * pq.A
g = DataArray(f, dims=['x', 'y'], coords=dict(x=self.x, y=self.y))
with self.assertRaisesRegexp(ValueError,
'Unable to convert between units'):
da + g

@unittest.expectedFailure
def test_units_in_indexes(self):
""" Test if units survive through xarray indexes.

Indexes are borrowed from Pandas, and Pandas does not support units.
Therefore, we currently don't intend to support units on indexes either.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usually we use comments instead of docstrings on test methods

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? I do not see how docstrings could be worse than comments under any circumstance... On the other hand, docstrings allow for introspection from unit-testing tools. But then, I don't really care.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, not important, but I'm really surprised by this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the convention for NumPy. To be honest I'm not sure why, but I recall being told so at one point by @charris.

I see now that this isn't followed for pandas, so I guess we can break it here. I don't feel too strongly either way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember well, the test docstrings were changing the way nose printed out the test names at runtime, which was a bit annoying when not all tests had a consequent doctstring. I don't know about pytest though

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pytest doesn't seem to with either -v or -vv

"""
da = self.da
self.assertEqualWUnits(da.x, self.x)

@unittest.expectedFailure
def test_sel(self):
self.assertEqualWUnits(self.da.sel(y=self.y[0]), self.v[:, 0])

@unittest.expectedFailure
def test_mean(self):
self.assertEqualWUnits(self.da.mean('x'), self.v.mean(0))