Xarray serialization warning when saving dataset #853

Open
tomwhite opened this issue May 6, 2022 · 0 comments
Labels: bug (Something isn't working), upstream (Used when our build breaks due to upstream changes)

tomwhite (Collaborator) commented May 6, 2022

From #785:

import sgkit as sg
import sgkit.io.vcf as sgvcf
sgvcf.vcf_to_zarr("sgkit/tests/io/vcf/data/sample.vcf.gz", "sample.vcf.gz.zarr")
ds = sg.load_dataset("sample.vcf.gz.zarr")
sg.save_dataset(ds, "sample2.vcf.gz.zarr", mode="w")

prints the warning:

SerializationWarning: variable None has data in the form of a dask array with dtype=object, which means it is being loaded into memory to determine a data type that can be safely stored on disk. To avoid this, coerce this variable to a fixed-size dtype with astype() before saving it.

This is tracked upstream in pydata/xarray#5769. #643 in this repository is related too.
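The warning text itself suggests coercing the variable to a fixed-size dtype with astype() before saving. As a rough sketch of what that coercion looks like at the NumPy level (the helper name `to_fixed_width_str` and the sample values are hypothetical, and this has not been checked against sgkit's own handling of string variables):

```python
import numpy as np


def to_fixed_width_str(values):
    """Cast an object-dtype array of strings to a fixed-width unicode dtype.

    Hypothetical helper for illustration only; not part of sgkit or xarray.
    """
    arr = np.asarray(values, dtype=object)
    # The longest string determines the width of the fixed-size dtype.
    max_len = max((len(str(v)) for v in arr.ravel()), default=1)
    max_len = max(max_len, 1)
    return arr.astype(f"U{max_len}")


# Made-up sample data standing in for an object-dtype string variable.
alleles = np.array(["A", "AC", "GTC"], dtype=object)
fixed = to_fixed_width_str(alleles)
print(fixed.dtype.kind, fixed.dtype.itemsize)
```

Whether applying the same kind of cast to each object-dtype variable in the dataset before calling save_dataset is enough to silence the warning here is an assumption, and computing the maximum length on a dask-backed variable would itself need a pass over the data.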

tomwhite added the bug and upstream labels on May 6, 2022