Polars dies on working code when POLARS_PANIC_ON_ERR=1 is set #20228

Open
2 tasks done
douglas-raillard-arm opened this issue Dec 9, 2024 · 2 comments
Labels
bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars


@douglas-raillard-arm
Contributor

douglas-raillard-arm commented Dec 9, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import pandas as pd
print(pd.__version__)

import polars as pl
pl.show_versions()

df = pd.DataFrame(dict(a=["a"]), index=[1], dtype='category')
print(df)
pl_df = pl.from_pandas(df, include_index=True)
print(pl_df)

Log output

2.2.3
--------Version info---------
Polars:              1.17.1
Index type:          UInt32
Platform:            Linux-6.8.0-49-generic-x86_64-with-glibc2.39
Python:              3.12.3 (main, Nov  6 2024, 18:32:19) [GCC 13.2.0]
LTS CPU:             False

----Optional dependencies----
adbc_driver_manager  <not installed>
altair               <not installed>
boto3                1.35.48
cloudpickle          <not installed>
connectorx           <not installed>
deltalake            <not installed>
fastexcel            <not installed>
fsspec               <not installed>
gevent               <not installed>
google.auth          <not installed>
great_tables         <not installed>
matplotlib           <not installed>
nest_asyncio         1.6.0
numpy                2.1.2
openpyxl             <not installed>
pandas               2.2.3
pyarrow              17.0.0
pydantic             <not installed>
pyiceberg            <not installed>
sqlalchemy           <not installed>
torch                <not installed>
xlsx2csv             <not installed>
xlsxwriter           <not installed>
   a
1  a
thread '<unnamed>' panicked at /home/runner/work/polars/polars/crates/polars-error/src/lib.rs:45:37:
unexpected value while building Series of type Int64; found value of type String: "None"
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: <polars_error::ErrString as core::convert::From<T>>::from::panic_cold_display
   3: <polars_error::ErrString as core::convert::From<T>>::from
   4: polars_core::series::any_value::invalid_value_error
   5: polars_core::series::any_value::<impl polars_core::series::Series>::from_any_values_and_dtype
   6: polars_python::series::construction::<impl polars_python::series::PySeries>::new_from_any_values_and_dtype
   7: polars_python::series::construction::<impl polars_python::series::PySeries>::__pymethod_new_from_any_values_and_dtype__
   8: pyo3::impl_::trampoline::trampoline
   9: polars_python::series::construction::_::__INVENTORY::trampoline
  10: <unknown>
  11: _PyObject_MakeTpCall
  12: _PyEval_EvalFrameDefault
  13: _PyObject_Call_Prepend
  14: <unknown>
  15: <unknown>
  16: _PyObject_MakeTpCall
  17: _PyEval_EvalFrameDefault
  18: <unknown>
  19: _PyEval_EvalFrameDefault
  20: PyEval_EvalCode
  21: <unknown>
  22: <unknown>
  23: _PyRun_SimpleFileObject
  24: _PyRun_AnyFileObject
  25: Py_RunMain
  26: Py_BytesMain
  27: __libc_start_call_main
             at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
  28: __libc_start_main_impl
             at ./csu/../csu/libc-start.c:360:3
  29: _start
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Traceback (most recent call last):
  File "/home/dourai01/Work/projects/lisa/.lisa-venv-3.12/lib/python3.12/site-packages/polars/_utils/construction/series.py", line 319, in _construct_series_with_fallbacks
    return constructor(name, values, strict)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'str' object cannot be interpreted as an integer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/dourai01/Work/projects/lisa/testcat.py", line 10, in <module>
    pl_df = pl.from_pandas(df, include_index=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dourai01/Work/projects/lisa/.lisa-venv-3.12/lib/python3.12/site-packages/polars/convert/general.py", line 587, in from_pandas
    pandas_to_pydf(
  File "/home/dourai01/Work/projects/lisa/.lisa-venv-3.12/lib/python3.12/site-packages/polars/_utils/construction/dataframe.py", line 1125, in pandas_to_pydf
    return arrow_to_pydf(
           ^^^^^^^^^^^^^^
  File "/home/dourai01/Work/projects/lisa/.lisa-venv-3.12/lib/python3.12/site-packages/polars/_utils/construction/dataframe.py", line 1238, in arrow_to_pydf
    df = df[names]
         ~~^^^^^^^
  File "/home/dourai01/Work/projects/lisa/.lisa-venv-3.12/lib/python3.12/site-packages/polars/dataframe/frame.py", line 1376, in __getitem__
    return get_df_item_by_key(self, key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dourai01/Work/projects/lisa/.lisa-venv-3.12/lib/python3.12/site-packages/polars/_utils/getitem.py", line 167, in get_df_item_by_key
    return _select_rows(df, key)  # type: ignore[arg-type]
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/dourai01/Work/projects/lisa/.lisa-venv-3.12/lib/python3.12/site-packages/polars/_utils/getitem.py", line 308, in _select_rows
    s = pl.Series("", key, dtype=Int64)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dourai01/Work/projects/lisa/.lisa-venv-3.12/lib/python3.12/site-packages/polars/series/series.py", line 289, in __init__
    self._s = sequence_to_pyseries(
              ^^^^^^^^^^^^^^^^^^^^^
  File "/home/dourai01/Work/projects/lisa/.lisa-venv-3.12/lib/python3.12/site-packages/polars/_utils/construction/series.py", line 144, in sequence_to_pyseries
    pyseries = _construct_series_with_fallbacks(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/dourai01/Work/projects/lisa/.lisa-venv-3.12/lib/python3.12/site-packages/polars/_utils/construction/series.py", line 334, in _construct_series_with_fallbacks
    return PySeries.new_from_any_values_and_dtype(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pyo3_runtime.PanicException: unexpected value while building Series of type Int64; found value of type String: "None"

Issue description

When pl.from_pandas() is called on a pandas DataFrame with a categorical column while POLARS_PANIC_ON_ERR=1 is set, Polars panics and raises a pyo3_runtime.PanicException. When the environment variable is set to 0, everything works as expected.
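
To compare the two settings side by side, one option is to run the reproducer in a fresh interpreter for each value of the variable. The sketch below is only an illustration: the helper name `run_with_panic_flag` and the `REPRO` string are made up for this example, and it assumes pandas and Polars are installed in the interpreter that runs it.

```python
import os
import subprocess
import sys


def run_with_panic_flag(code: str, flag: str) -> subprocess.CompletedProcess:
    """Run `code` in a fresh interpreter with POLARS_PANIC_ON_ERR set to `flag`.

    Hypothetical helper: a new process is used so the variable is in place
    before Polars is imported.
    """
    env = dict(os.environ, POLARS_PANIC_ON_ERR=flag)
    return subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
    )


# Same reproducer as in the report above, minus the version printing.
REPRO = """
import pandas as pd
import polars as pl
df = pd.DataFrame(dict(a=["a"]), index=[1], dtype="category")
print(pl.from_pandas(df, include_index=True))
"""

if __name__ == "__main__":
    for flag in ("0", "1"):
        result = run_with_panic_flag(REPRO, flag)
        print(f"POLARS_PANIC_ON_ERR={flag}: exit code {result.returncode}")
```

On an affected version, the run with the flag set to 1 would be expected to exit with a non-zero code while the run with 0 succeeds.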

Expected behavior

Code that works under normal conditions should keep working when POLARS_PANIC_ON_ERR=1 is set. Otherwise the variable is much less useful as a debugging tool, because it diverts attention away from the real problem being investigated.

On top of that, it prevents setting the variable globally, e.g. in a CI system, where it could otherwise gather useful information about intermittent problems.

Installed versions

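--------Version info---------
Polars:              1.17.1
Index type:          UInt32
Platform:            Linux-6.8.0-49-generic-x86_64-with-glibc2.39
Python:              3.12.3 (main, Nov  6 2024, 18:32:19) [GCC 13.2.0]
LTS CPU:             False

----Optional dependencies----
adbc_driver_manager  <not installed>
altair               <not installed>
boto3                1.35.48
cloudpickle          <not installed>
connectorx           <not installed>
deltalake            <not installed>
fastexcel            <not installed>
fsspec               <not installed>
gevent               <not installed>
google.auth          <not installed>
great_tables         <not installed>
matplotlib           <not installed>
nest_asyncio         1.6.0
numpy                2.1.2
openpyxl             <not installed>
pandas               2.2.3
pyarrow              17.0.0
pydantic             <not installed>
pyiceberg            <not installed>
sqlalchemy           <not installed>
torch                <not installed>
xlsx2csv             <not installed>
xlsxwriter           <not installed>
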
@douglas-raillard-arm douglas-raillard-arm added bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars labels Dec 9, 2024
@douglas-raillard-arm
Contributor Author

The issue reported here also seems to have the same cause: #18896 (comment)

thread 'polars-0' panicked at /home/runner/work/polars/polars/crates/polars-error/src/lib.rs:45:37:
unsupported output type for dictionary packing: Duration(Nanosecond)

@cmdlineluser
Contributor

The example now runs for me on main. (I think thanks to #20248)

It seems there was previously a try/except in the parsing code, with a fallback in the except branch.

Setting the variable then prevented the fallback from running.
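
To illustrate the mechanism described above, here is a minimal pure-Python sketch. It is not the actual Polars code: `SimulatedPanic`, `strict_int_series`, `select`, and the `panic_on_err` parameter are all hypothetical names invented for this example. The one assumption borrowed from the real stack is that a Rust panic surfaces in Python as an exception deriving from BaseException, which is how PyO3 documents its PanicException, so an ordinary `except` clause for a specific error type does not catch it.

```python
class SimulatedPanic(BaseException):
    """Hypothetical stand-in for pyo3_runtime.PanicException."""


def strict_int_series(values, panic_on_err=False):
    """Build a list of ints, failing on anything that is not an int.

    With panic_on_err=True the failure is raised as SimulatedPanic
    instead of an ordinary, catchable TypeError.
    """
    for v in values:
        if not isinstance(v, int):
            msg = f"unexpected value while building Int64 data: {v!r}"
            if panic_on_err:
                raise SimulatedPanic(msg)
            raise TypeError(msg)
    return list(values)


def select(key, panic_on_err=False):
    """Interpret key as row indices first, falling back to column names."""
    try:
        return ("rows", strict_int_series(key, panic_on_err))
    except TypeError:
        # Fallback branch: only reached if the error is a normal exception.
        return ("columns", list(key))
```

With the flag off, `select(["a"])` takes the fallback and returns the key as column names. With `panic_on_err=True`, the same call raises `SimulatedPanic` before the `except` branch can run, which mirrors the failure mode described in this comment.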
