Rechunk to an existing store #148

jbeezley · 2023-11-29T18:21:51Z

I have a data pipeline where data arrives incrementally. Whenever new data lands in the source store, the pipeline performs a naive rechunking into a zarr store. Rechunker has a much better algorithm that I would like to use, but it cannot target an existing store.

This problem seems related to #8. For my use case, however, a simpler implementation would be to optionally skip the call at https://github.com/pangeo-data/rechunker/blob/master/rechunker/api.py#L599 and open the dataset instead.

I would be willing to implement this via an optional kwarg, but I wanted to check if such a change would be accepted or if there are any issues with it that I'm not considering. Clearly, there could be problems if the dimensions/variables of the destination are not compatible. I could check that after opening or just let the exceptions from zarr pass through. Thoughts?
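
To make the "check that after opening" option concrete, here is a minimal sketch of what such a compatibility check might look like. It is not rechunker or zarr code: `ArrayMeta` and `find_incompatibilities` are hypothetical names, and the metadata is modeled as plain Python values standing in for what would be read from the opened destination arrays. Strict shape equality is also an assumption; a store that grows along one dimension would need that check relaxed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayMeta:
    """Hypothetical stand-in for metadata read from an opened array."""
    shape: tuple
    chunks: tuple
    dtype: str


def find_incompatibilities(expected, existing):
    """Compare expected target metadata against an existing destination.

    Both arguments map variable names to ArrayMeta. Returns a list of
    human-readable problems; an empty list means the destination matches.
    """
    problems = []
    for name, want in expected.items():
        have = existing.get(name)
        if have is None:
            problems.append(f"{name}: missing from destination")
            continue
        # Assumption: every field must match exactly.
        for field in ("shape", "chunks", "dtype"):
            want_value = getattr(want, field)
            have_value = getattr(have, field)
            if want_value != have_value:
                problems.append(
                    f"{name}: {field} is {have_value!r}, expected {want_value!r}"
                )
    return problems
```

The alternative mentioned above, letting the exceptions from zarr pass through, needs no extra code but would surface a mismatch only once a write fails.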
