
Order of dimensions #504

Open
farheen2022 opened this issue Mar 27, 2023 · 6 comments

Comments

@farheen2022

Could you kindly tell me how to change the name and order of the dimensions of a NetCDF file? I am using the CHIRPS dataset and my latitude dimension is named `latitude`; I am unable to change it to `lat`, and that is raising an error. I have tried using `ncpdq` in the Conda prompt to correct the order of dimensions, but that raises an error related to the size of the internal memory.

@bradleyswilson

You can rename a netCDF dimension with xarray's `rename()` method, e.g. `dataset.rename({'latitude': 'lat'})`.

If you need to change the ordering of dimensions, you can use `transpose()`, e.g. `data["prcp"].transpose("lat", "lon", "time")`.
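
Putting those two calls together, a minimal sketch (the file path and the `prcp` variable name here are placeholders for whatever your CHIRPS file actually uses):

```python
import xarray as xr

# open the input file; "chirps.nc" is a placeholder path
ds = xr.open_dataset("chirps.nc")

# rename() returns a new Dataset, so the result must be assigned back
ds = ds.rename({"latitude": "lat", "longitude": "lon"})

# reorder the dimensions of the precipitation variable to (lat, lon, time)
ds["prcp"] = ds["prcp"].transpose("lat", "lon", "time")
```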

@monocongo
Owner

As @bradleyswilson alludes to above, you can leverage xarray for this and then write the resulting `xarray.Dataset` object to file. Then use that new NetCDF file as input to this package's main processing script.
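
For example, continuing the sketch above (the output file name is arbitrary):

```python
# write the renamed/reordered Dataset to a new NetCDF file, then point the
# package's main processing script at this file as its input
ds.to_netcdf("chirps_lat_lon_time.nc")
ds.close()
```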

@farheen2022
Author

I am now able to change the order of the dimensions of the input file and save it. The problem was arising because the file was too big, almost 7 GB (I was using the CHIRPS rainfall dataset). I checked with the CRU rainfall dataset and I am able to change my input file. Thank you @bradleyswilson @monocongo
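
For files in that size range, one possible workaround (a sketch, assuming dask is installed; the chunk size, path, and `precip` variable name are illustrative rather than taken from the CHIRPS file) is to let xarray operate lazily on chunks so the full array never has to fit in memory:

```python
import xarray as xr

# open lazily with dask chunks; rename and transpose are lazy as well,
# so peak memory stays bounded by the chunk size rather than the file size
ds = xr.open_dataset("chirps.nc", chunks={"time": 120})
ds = ds.rename({"latitude": "lat", "longitude": "lon"})
ds["precip"] = ds["precip"].transpose("lat", "lon", "time")

# to_netcdf() streams the dask-backed data to disk chunk by chunk
ds.to_netcdf("chirps_lat_lon_time.nc")
```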

@maxxpower007

To do an automatic conversion, I usually add these lines after every update:

In `_compute_write_index` (`main.py`), after this line:

```python
dataset = xr.open_mfdataset(list(set(files)), chunks=chunks)
```

add this:

```python
if 'latitude' in dataset.coords:
    # rename() returns a new Dataset, so assign the result back
    dataset = dataset.rename({'latitude': 'lat', 'longitude': 'lon'})
if 'bnds' in dataset.dims:
    dataset = dataset.drop_vars('time_bnds')
keys = list(dataset.keys())
for key in keys:
    # put each variable's dimensions into (lat, lon[, time]) order
    if 'time' in dataset.coords:
        dataset[key] = dataset[key].transpose("lat", "lon", "time")
    else:
        dataset[key] = dataset[key].transpose("lat", "lon")
# reorder the dataset so the coordinates come first
if 'time' in dataset.coords:
    dataset = dataset[['lat', 'lon', 'time', *keys]]
else:
    dataset = dataset[['lat', 'lon', *keys]]
```

And in `_prepare_file` (`main.py`), after this line:

```python
ds = xr.open_dataset(netcdf_file)
```

add this:

```python
if 'latitude' in ds.coords:
    # rename() returns a new Dataset, so assign the result back
    ds = ds.rename({'latitude': 'lat', 'longitude': 'lon'})
if 'bnds' in ds.dims:
    ds = ds.drop_vars('time_bnds')
keys = list(ds.keys())
for key in keys:
    # put each variable's dimensions into (lat, lon[, time]) order
    if 'time' in ds.coords:
        ds[key] = ds[key].transpose("lat", "lon", "time")
    else:
        ds[key] = ds[key].transpose("lat", "lon")
# reorder the dataset so the coordinates come first
if 'time' in ds.coords:
    ds = ds[['lat', 'lon', 'time', *keys]]
else:
    ds = ds[['lat', 'lon', *keys]]
```

@maxxpower007

One more that is unrelated:

I usually have to change `pet` in `indices.py`.

From:

```python
if (latitude_degrees is not None) and not np.isnan(latitude_degrees) and (-90.0 < latitude_degrees < 90.0):
```

To:

```python
if (latitude_degrees is not None) and not np.isnan(latitude_degrees) and (-90.0 <= latitude_degrees <= 90.0):
```

@monocongo
Owner

Thanks for helping, @maxxpower007! The common fixes you outlined above might be useful for all users -- maybe we should roll these into the main processing script? One limitation, for now, is that there are no proper tests for the main processing script, so it is harder to be sure we've not broken something if we add code willy-nilly.
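
One possible shape for that, sketched here as a standalone helper (the function name and the assumption that only the latitude/longitude naming, `time_bnds`, and dimension order need normalizing are mine, not part of the package):

```python
import xarray as xr


def _normalize_dims(ds: xr.Dataset) -> xr.Dataset:
    """Rename latitude/longitude, drop bounds variables, and order dimensions
    as (lat, lon[, time]) so downstream code sees a consistent layout."""
    if "latitude" in ds.coords:
        ds = ds.rename({"latitude": "lat", "longitude": "lon"})
    if "bnds" in ds.dims:
        ds = ds.drop_vars("time_bnds", errors="ignore")
    dims = ("lat", "lon", "time") if "time" in ds.coords else ("lat", "lon")
    for name in ds.data_vars:
        ds[name] = ds[name].transpose(*dims, missing_dims="ignore")
    return ds
```

Both `_compute_write_index` and `_prepare_file` could then call the helper right after opening their dataset, which would keep the two code paths from drifting apart.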
