-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable resampling on PeriodIndex #2687
Conversation
Let me know any thoughts / alternative approaches here. To emphasize - the issue with coercing the labels to a |
group = DataArray(dim_coord, coords=dim_coord.coords, | ||
dims=dim_coord.dims, name=RESAMPLE_DIM) | ||
group = IndexVariable( | ||
data=self.indexes[dim], dims=[dim], name=RESAMPLE_DIM) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could alternatively fix this by replace dim_coord
in the data
argument with dim_coord.to_index()
, e.g., compare:
ipdb> DataArray(dim_coord, coords=dim_coord.coords, dims=dim_coord.dims, name=RESAMPLE_DIM).to_index()
Index([2000-01-01 00:00, 2000-01-01 06:00, 2000-01-01 12:00, 2000-01-01 18:00,
2000-01-02 00:00, 2000-01-02 06:00, 2000-01-02 12:00, 2000-01-02 18:00,
2000-01-03 00:00, 2000-01-03 06:00],
dtype='object', name='time')
ipdb> DataArray(dim_coord.to_index(), coords=dim_coord.coords, dims=dim_coord.dims, name=RESAMPLE_DIM).to_index()
PeriodIndex(['2000-01-01 00:00', '2000-01-01 06:00', '2000-01-01 12:00',
'2000-01-01 18:00', '2000-01-02 00:00', '2000-01-02 06:00',
'2000-01-02 12:00', '2000-01-02 18:00', '2000-01-03 00:00',
'2000-01-03 06:00'],
dtype='period[6H]', name='time', freq='6H')
We go to some effort to preserve pandas.Index classes when passed into DataArray/Variable but apparently don't do it consistently everywhere. So there's also probably a lower level fix somewhere (perhaps in xarray.core.variable.as_compatible_data
?) to ensure that this happens more consistently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! I'll have a look
@@ -1789,12 +1789,17 @@ class IndexVariable(Variable): | |||
unless another name is given. | |||
""" | |||
|
|||
def __init__(self, dims, data, attrs=None, encoding=None, fastpath=False): | |||
def __init__(self, dims, data, attrs=None, encoding=None, fastpath=False, name=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather not add an explicit name
parameter to IndexVariable
.
If anything, I would think about removing the name
property. Long term, I'd like to rename IndexVariable
-> ImmutableVariable
and move the pandas adapter logic into class that we can pass into the data
argument of Variable
.
This is so old that probably it wouldn't even be a good place to start from. Sorry I never pushed it through. I'll close. |
This allows resampling with
PeriodIndex
objects by keeping thegroup
as an index rather than coercing to a DataArray (which coerces any non-native types to objects)I'm still getting one failure around the name of the IndexVariable still being
__resample_dim__
after resample, but wanted to socialize the approach of allowing aname
argument toIndexVariable
- is this reasonable?whats-new.rst
for all changes andapi.rst
for new API