xarray open kwargs #48

Open
wants to merge 2 commits into
base: maintenance-1.5.x

@larsbuntemeyer
Contributor

larsbuntemeyer commented Dec 21, 2022

This PR adds the xarray_open_kwargs keyword, which lets you control the keyword arguments passed to xarray.open_dataset when returnXDataset=True. This allows me, e.g., to control parallel reads and chunking.

Use case

I have a non-standard calendar and want to select a date and get the result back as an xarray dataset:

import numpy as np
import pandas as pd
import xarray as xr
from cdo import Cdo

np.random.seed(0)
temperature = 15 + 8 * np.random.randn(1, 2, 2)
precipitation = 10 * np.random.rand(1, 2, 2)
lon = [[-99.83, -99.32], [-99.79, -99.23]]
lat = [[42.25, 42.21], [42.63, 42.59]]
time = xr.cftime_range(start="2000-02-30", periods=1, calendar="360_day")


ds = xr.Dataset(
    data_vars=dict(
        temperature=(["time", "y", "x"], temperature),
        precipitation=(["time", "y", "x"], precipitation),
    ),
    coords=dict(
        lon=(["x", "y"], lon),
        lat=(["x", "y"], lat),
        time=time,
    ),
    attrs=dict(description="Weather related data."),
)

cdo = Cdo(xarray_open_kwargs={'use_cftime':True}, logging=True, debug=False)
cdo.seldate("2000-02-30T00:00:00", input=ds, returnXDataset=True)

This only works with use_cftime=True.

@larsbuntemeyer changed the base branch from master to maintenance-1.5.x December 21, 2022 19:39
@larsbuntemeyer marked this pull request as ready for review December 21, 2022 19:41
@Try2Code
Owner

Hi @larsbuntemeyer!

Thanks for the input. I am not an xarray power user, so this is quite valuable for me 👍

But I am not sure this is done in the best way: on the one hand, opening files is done internally, so it's a bit hard to change this behaviour. On the other hand, the whole operation depends very much on the input.
I wonder if this would be better done in an operator call than in the constructor call of the cdo object. What you want to change might differ a lot from one file to the next, right?

@larsbuntemeyer
Contributor Author

@Try2Code Thanks for considering this! Sorry for the late response. You are right, this could also go along with the returnX* arguments, so that it can be set more dynamically depending on the input variable. I'll look into this.

@Try2Code
Owner

I think this option is better put into an operator call than into the constructor. That way users can change the options for each dataset they create.
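
To illustrate the per-call idea, here is a minimal sketch in plain Python. It is not the cdo API: `OperatorRunner`, its `run` method, and `fake_open` are hypothetical names made up for this example, and `fake_open` only stands in for a function such as xarray.open_dataset. Constructor-level kwargs act as defaults, and kwargs given on an individual call override them.

```python
from typing import Any, Callable, Dict, Optional


class OperatorRunner:
    """Hypothetical stand-in for a Cdo-like wrapper (not the real cdo API)."""

    def __init__(
        self,
        opener: Callable[..., Any],
        open_kwargs: Optional[Dict[str, Any]] = None,
    ) -> None:
        self._opener = opener
        # Defaults set once in the constructor, as this PR proposes.
        self._default_open_kwargs = dict(open_kwargs or {})

    def run(self, result_path: str, open_kwargs: Optional[Dict[str, Any]] = None) -> Any:
        # Per-call kwargs take precedence over the constructor defaults.
        merged = {**self._default_open_kwargs, **(open_kwargs or {})}
        return self._opener(result_path, **merged)


def fake_open(path: str, **kwargs: Any) -> Dict[str, Any]:
    """Placeholder opener that just records what it was called with."""
    return {"path": path, "kwargs": kwargs}


runner = OperatorRunner(fake_open, open_kwargs={"use_cftime": True})

a = runner.run("a.nc")                                      # defaults only
b = runner.run("b.nc", open_kwargs={"chunks": {"time": 1}})  # adds a per-call option
c = runner.run("c.nc", open_kwargs={"use_cftime": False})    # overrides a default
```

With this layout the constructor keyword keeps working as a convenience, while each operator call can still adjust how its own result is opened.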

@Try2Code
Owner

I am in the middle of a re-design. Will keep this PR open to keep the feature in mind.
