Pandas Integer Type Doesn't Convert in Dataset #9742

Open
5 tasks done
edwardreed81 opened this issue Nov 7, 2024 · 2 comments

What happened?

I converted a pandas DataFrame containing a column of dtype pandas.Int64Dtype() into an xarray Dataset. The data variable is not converted to an xarray-compatible type; it stays wrapped in a PandasExtensionArray:

Data variables:
    0        (dim_0) Int64 27B <class 'xarray.core.extension_array.PandasExte...

Additionally, this causes an exception if the Dataset is pickled and subsequently loaded:

RecursionError: maximum recursion depth exceeded

What did you expect to happen?

The data variable should end up with dtype int64, and pickling and then unpickling the Dataset should work without error.

Minimal Complete Verifiable Example

import pandas as pd
import xarray as xr
import pickle

df = pd.DataFrame([1, 2, 3], dtype=pd.Int64Dtype())
ds = xr.Dataset(df)
dsdump = pickle.dumps(ds)
pickle.loads(dsdump)

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

---------------------------------------------------------------------------
RecursionError                            Traceback (most recent call last)
Cell In[1], line 8
      6 ds = xr.Dataset(df)
      7 dsdump = pickle.dumps(ds)
----> 8 pickle.loads(dsdump)

File ~/metis-dev/.venv/lib/python3.12/site-packages/xarray/core/extension_array.py:112, in PandasExtensionArray.__getattr__(self, attr)
    111 def __getattr__(self, attr: str) -> object:
--> 112     return getattr(self.array, attr)

File ~/metis-dev/.venv/lib/python3.12/site-packages/xarray/core/extension_array.py:112, in PandasExtensionArray.__getattr__(self, attr)
    111 def __getattr__(self, attr: str) -> object:
--> 112     return getattr(self.array, attr)

    [... skipping similar frames: PandasExtensionArray.__getattr__ at line 112 (2974 times)]

File ~/metis-dev/.venv/lib/python3.12/site-packages/xarray/core/extension_array.py:112, in PandasExtensionArray.__getattr__(self, attr)
    111 def __getattr__(self, attr: str) -> object:
--> 112     return getattr(self.array, attr)

RecursionError: maximum recursion depth exceeded
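
The repeated frame in the traceback is consistent with a general Python pitfall: a `__getattr__` that forwards to an instance attribute will recurse if that attribute has not been set yet, which is the state of an object during unpickling (pickle creates the instance without calling `__init__` and then looks up `__setstate__` on it). The sketch below reproduces that mechanism with plain Python only. `NaiveWrapper` and `GuardedWrapper` are hypothetical stand-ins, not xarray classes, and the guard shown is one common way to avoid the loop, not a confirmed fix for xarray.

```python
import pickle


class NaiveWrapper:
    """Hypothetical stand-in that forwards unknown attributes to self.array."""

    def __init__(self, array):
        self.array = array

    def __getattr__(self, attr):
        # __getattr__ runs only when normal lookup fails. On a freshly
        # unpickled instance "array" is not set yet, so self.array lands
        # back in this method and the lookup never terminates.
        return getattr(self.array, attr)


class GuardedWrapper:
    """Same idea, but refuses to forward the lookup of 'array' itself."""

    def __init__(self, array):
        self.array = array

    def __getattr__(self, attr):
        if attr == "array":
            # Assumption: raising here lets pickle treat the instance as
            # having no __setstate__ and restore __dict__ directly.
            raise AttributeError(attr)
        return getattr(self.array, attr)


def roundtrip(obj):
    """Pickle and immediately unpickle an object."""
    return pickle.loads(pickle.dumps(obj))


def naive_roundtrip_recurses():
    """Return True if unpickling NaiveWrapper raises RecursionError."""
    try:
        roundtrip(NaiveWrapper([1, 2, 3]))
    except RecursionError:
        return True
    return False
```

As in the report, `pickle.dumps` succeeds for the naive class and only `pickle.loads` fails, because the missing attribute is only a problem while the instance is being rebuilt.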

Anything else we need to know?

Xarray 2024.9.0 does not exhibit this behavior.
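
A possible interim workaround, sketched here under assumptions, is to cast the nullable column to a NumPy dtype on the pandas side before building the Dataset. The snippet only exercises pandas; that `xr.Dataset(plain)` then stores a regular int64 variable and pickles cleanly on 2024.10.0 is an expectation, not something verified in this report. The cast raises if the column contains missing values, so it only applies to data without `pd.NA`.

```python
import pandas as pd

# Same frame as in the example above: one nullable-integer column named 0.
df = pd.DataFrame([1, 2, 3], dtype=pd.Int64Dtype())

# Cast the pandas extension dtype "Int64" to the NumPy dtype "int64".
# This fails with an error if any value is pd.NA.
plain = df.astype("int64")

# Assumption (not run here): xr.Dataset(plain) would then hold NumPy-backed
# int64 data instead of a PandasExtensionArray.
```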

Environment

INSTALLED VERSIONS

commit: None
python: 3.12.7 (main, Oct 1 2024, 11:15:50) [GCC 14.2.1 20240910]
python-bits: 64
OS: Linux
OS-release: 6.6.32-1-lts
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2024.10.0
pandas: 2.2.3
numpy: 1.26.4
scipy: 1.14.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
zarr: 2.18.3
cftime: None
nc_time_axis: None
iris: None
bottleneck: 1.4.2
dask: 2024.10.0
distributed: None
matplotlib: 3.9.2
cartopy: None
seaborn: None
numbagg: None
fsspec: 2024.10.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 75.3.0
pip: 23.3.1
conda: None
pytest: None
mypy: None
IPython: 8.29.0
sphinx: None

edwardreed81 added the "bug" and "needs triage" labels Nov 7, 2024

welcome bot commented Nov 7, 2024

Thanks for opening your first issue here at xarray! Be sure to follow the issue template!
If you have an idea for a solution, we would really welcome a Pull Request with proposed changes.
See the Contributing Guide for more.
It may take us a while to respond here, but we really value your contribution. Contributors like you help make xarray better.
Thank you!

@max-sixty
Collaborator

Confirmed as a bug

max-sixty removed the "needs triage" label Nov 7, 2024