Return a scalar instead of DataArray when the return value is a scalar #987
Comments
I agree that this can be annoying. The downside in making this switch is that we would lose xarray specific fields like
Also, strictly from a simplicity point of view for xarray, it's nice for every function to return fixed types. NumPy solved this problem by creating its own scalar types (e.g.,
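To make the comparison concrete, here is a minimal sketch of what each library hands back from a full reduction. The array contents and the name `temperature` are made up for illustration.

```python
import numpy as np
import xarray as xr

arr = np.array([1.0, 2.0, 3.0])

# A NumPy reduction returns one of NumPy's own scalar types (np.float64 here).
m = arr.mean()
print(type(m))
print(m.shape, m.ndim)  # the scalar still exposes array-like attributes

# The xarray equivalent is a 0-dimensional DataArray, which can still carry
# xarray-level information such as a name.
da = xr.DataArray(arr, dims=["x"], name="temperature")
r = da.mean()
print(type(r).__name__, r.ndim, r.name)
```

In both cases the return type is the same no matter how many dimensions were reduced, which is the "fixed types" property described above.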
I see - thanks a lot for the quick response. I knew there was a good reason for this. I wonder if it is reasonable to return a scalar when there is neither I think this might be reasonable because I only get into this issue when I'm doing an array-wide operation and I know I'm going to get an aggregate scalar and forget to use |
This is a bad path to go down :). Now your code might suddenly break when you add a metadata field! In principle, we could pick some subset of operations for which to always do this and others for which to never do this (e.g., aggregating out all dimensions, but not indexing out all dimensions), but I think this inconsistency would be even more surprising. It's pretty easy to see how this could lead to bugs, too. At least now you know you always need to type |
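A rough sketch of the failure mode described here, using a hypothetical helper `maybe_scalar` (not part of xarray) that unwraps a 0-d result only when no metadata is attached. The `units` attribute is invented for the example.

```python
import numpy as np
import xarray as xr


def maybe_scalar(da):
    """Hypothetical rule: unwrap 0-d results only when no metadata is attached."""
    if da.ndim == 0 and not da.attrs and len(da.coords) == 0:
        return da.item()
    return da


plain = xr.DataArray(np.arange(4.0), dims=["x"])
tagged = xr.DataArray(np.arange(4.0), dims=["x"], attrs={"units": "K"})

# Same operation, same data, but the return type depends on the metadata.
a = maybe_scalar(plain.max(keep_attrs=True))
b = maybe_scalar(tagged.max(keep_attrs=True))
print(type(a).__name__)  # float
print(type(b).__name__)  # DataArray
```

Downstream code written against `a` would stop working as soon as someone added an attribute upstream, which is the kind of surprise a fixed return type avoids.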
@joonro, I think there's a strong case to be made about returning a It might be more prudent to add this attribute whenever we apply these operations to a I can whip up a working example/pull request if people think this is a direction to go. I'd probably build a decorator which handles inspection of the operator name and arguments and uses that to add the cell_methods attribute, that way people can add the same functionality to homegrown methods/operators. |
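The decorator idea could look roughly like the sketch below. This is only an illustration under assumptions: the name `with_cell_methods`, the `(da, dim)` signature, and the `"dim: method"` string layout (borrowed from the CF conventions) are choices made for the example, not an existing xarray feature.

```python
import functools

import numpy as np
import xarray as xr


def with_cell_methods(func):
    """Hypothetical decorator: record the reduction in a cell_methods attribute."""

    @functools.wraps(func)
    def wrapper(da, dim):
        result = func(da, dim)
        entry = f"{dim}: {func.__name__}"
        previous = da.attrs.get("cell_methods")
        # Append to any history already recorded on the input.
        result.attrs["cell_methods"] = f"{previous} {entry}" if previous else entry
        return result

    return wrapper


@with_cell_methods
def mean(da, dim):
    return da.mean(dim=dim)


da = xr.DataArray(np.ones((3, 4)), dims=["time", "x"])
out = mean(mean(da, "time"), "x")
print(out.attrs["cell_methods"])  # time: mean x: mean
```

Because the bookkeeping lives in the decorator, the same wrapper could be applied to homegrown reductions that follow the same call signature.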
Thanks a lot for the discussions. I agree it is very important to be consistent and explicit. Another thing was that sometimes Currently I do not have a good idea about how to improve this - I will report back if one occurs to me. Thanks again!
Can you give an example of how you need to use .values in xarray?
Sure. My actual usage is usually much more complicated, but basically, with

    import numpy as np
    import xarray as xr
    X = xr.DataArray(np.random.normal(size=(10, 10)),
                     coords=[range(10), range(10)])

if I want to choose only values larger than 0 from X, it seems I cannot do

    X.loc[:, :, :, 'variable'].values[X.loc[:, :, :, 'variable'].values > 0] = Y.loc[:, :, :, 'variable'].values[Y.loc[:, :, :, 'variable'].values > 0]

Maybe I'm mistaken and there is a way to do this more nicely, but I haven't been able to figure it out. Thank you!
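For selecting or replacing values by a condition without reaching for `.values`, one possible approach in reasonably recent xarray versions is `DataArray.where` and `xr.where`. The sketch below is an assumption about what fits this use case, not the answer given in the thread; the dimension names `row` and `col` and the second array `Y` are made up for illustration.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)
coords = {"row": np.arange(10), "col": np.arange(10)}
X = xr.DataArray(rng.normal(size=(10, 10)), dims=["row", "col"], coords=coords)
Y = xr.DataArray(rng.normal(size=(10, 10)), dims=["row", "col"], coords=coords)

# Keep the original shape; entries that fail the condition become NaN.
positive = X.where(X > 0)

# Take Y wherever Y is positive and keep X elsewhere (returns a new array).
merged = xr.where(Y > 0, Y, X)
```

Both calls return new DataArrays with labels intact, so the result can still be indexed by coordinate afterwards.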
@joonro Yes, this does get messy. We'll eventually support indexing like In the meantime, you can still break things up onto multiple lines by saving temporary variables:
Using abbreviations like |
@shoyer I think I saw Btw, I must say that not only is xarray so useful for much of my research, but the devs' responses on the issues have also been superb. Definitely one of the most pleasant experiences I have had with developers. Thank you.
Thanks @joonro, you are very kind! I'm going to close this issue since I think we resolved the original question.
Hi,
I'm not sure how the devs will feel about this, but I wanted to ask because I run into this issue frequently.
Currently many methods such as .min(), .max(), and .mean() return a DataArray even when the return value is a scalar. This makes a lot of other things break down, and I have to use test.min().values or float(test.min()).
I think it would be great if these methods returned a scalar when the return value is a scalar.
Thank you!
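As a small illustration of the behavior in question, the sketch below reduces a made-up array named `test` and shows a few ways to get at the underlying value. The data and dimension names are assumptions for the example.

```python
import numpy as np
import xarray as xr

test = xr.DataArray(np.array([[3.0, 1.0], [2.0, 5.0]]), dims=["x", "y"])

lowest = test.min()        # 0-dimensional DataArray, not a bare number
as_float = float(lowest)   # explicit conversion to a Python float
as_item = lowest.item()    # Python scalar extracted from the array
as_values = lowest.values  # 0-dimensional numpy.ndarray

print(type(lowest).__name__, lowest.ndim)
print(as_float, as_item, as_values.shape)
```

`float(...)` and `.item()` give a plain Python number, while `.values` stays in NumPy and keeps the dtype.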