arr.min()/max() panics when one of the values is None and the number of rows is larger than 2 #14359

Closed
2 tasks done
etiennebacher opened this issue Feb 8, 2024 · 4 comments
Labels: A-dtype-list/array, bug, P-medium, python

Comments

@etiennebacher
Contributor

etiennebacher commented Feb 8, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import os
os.environ['POLARS_VERBOSE']='1'
import polars as pl

### Works (2 rows, 1 None value)
pl.DataFrame(
    data={"a": [[1, 2], [3, None]]},
    schema={"a": pl.Array(pl.Int64, 2)},
).select(pl.col("a").arr.max())

### Works (3 rows, 0 None values)
pl.DataFrame(
    data={"a": [[1, 2], [3, 4], [5, 6]]},
    schema={"a": pl.Array(pl.Int64, 2)},
).select(pl.col("a").arr.max())

### Panics (3 rows, 1 None value)
pl.DataFrame(
    data={"a": [[1, 2], [3, 4], [None, 5]]},
    schema={"a": pl.Array(pl.Int64, 2)},
).select(pl.col("a").arr.max())

Log output

thread '<unnamed>' panicked at crates\polars-arrow\src\bitmap\immutable.rs:156:24:
range end index 2 out of range for slice of length 1
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "C:\Python311\Lib\site-packages\polars\dataframe\frame.py", line 8142, in select
    return self.lazy().select(*exprs, **named_exprs).collect(_eager=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\polars\lazyframe\frame.py", line 1940, in collect
    return wrap_df(ldf.collect())
                   ^^^^^^^^^^^^^
pyo3_runtime.PanicException: range end index 2 out of range for slice of length 1

Issue description

arr.max() and arr.min() panic if the data has more than 2 rows and one of the array values is None.

Expected behavior

┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 2   │
│ 4   │
│ 5   │
└─────┘
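
For reference, the expected values above can be reproduced without Polars. This is only a minimal sketch of the semantics the report assumes (None values are skipped when taking the row-wise maximum); `row_max` is a hypothetical helper written for illustration, not part of the Polars API.

```python
from typing import Optional, Sequence


def row_max(row: Sequence[Optional[int]]) -> Optional[int]:
    """Return the maximum of the non-None values in one row, or None if every value is None."""
    values = [v for v in row if v is not None]
    return max(values) if values else None


# Same data as the panicking case in the reproducible example.
rows = [[1, 2], [3, 4], [None, 5]]
expected = [row_max(r) for r in rows]
print(expected)  # [2, 4, 5]
```

The same helper gives 3 for the row `[3, None]` from the first (working) case.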

Installed versions

--------Version info---------
Polars:               0.20.7
Index type:           UInt32
Platform:             Windows-10-10.0.19044-SP0
Python:               3.11.0 (main, Oct 24 2022, 18:26:48) [MSC v.1933 64 bit (AMD64)]

----Optional dependencies----
adbc_driver_manager:  <not installed>
cloudpickle:          <not installed>
connectorx:           <not installed>
deltalake:            <not installed>
fsspec:               2023.6.0
gevent:               <not installed>
hvplot:               <not installed>
matplotlib:           3.7.1
numpy:                1.24.3
openpyxl:             <not installed>
pandas:               2.0.3
pyarrow:              12.0.1
pydantic:             <not installed>
pyiceberg:            <not installed>
pyxlsb:               <not installed>
sqlalchemy:           <not installed>
xlsx2csv:             <not installed>
xlsxwriter:           <not installed>
@etiennebacher etiennebacher added bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars labels Feb 8, 2024
@etiennebacher etiennebacher changed the title arr.max() panicks when one of the values if None and the number of rows is larger than 2 arr.min()/max() panicks when one of the values if None and the number of rows is larger than 2 Feb 8, 2024
@deanm0000 deanm0000 added P-medium Priority: medium A-dtype-list/array Area: list/array data type and removed needs triage Awaiting prioritization by a maintainer labels Feb 9, 2024
@github-project-automation github-project-automation bot moved this to Ready in Backlog Feb 9, 2024
@deanm0000
Collaborator

@reswqa I think you recently fixed one like this.

@reswqa
Collaborator

reswqa commented Feb 9, 2024

Yes, I will take a look soon.

@reswqa reswqa self-assigned this Feb 9, 2024
@eitsupi
Contributor

eitsupi commented Feb 10, 2024

ref: pola-rs/r-polars#790

It seems this is a bug in rustc that will be fixed in 1.77.0 (rust-lang/rust#119352).

@etiennebacher
Contributor Author

This is solved as of 0.20.21 (probably in #15654)
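
A quick way to check this on a given install is to rerun the panicking case and compare against the expected output. This is a hedged sketch: `arr_max_with_none` is a hypothetical helper name, the import guard is only there so the snippet does not fail where Polars is missing, and on an affected version the call would still panic rather than return.

```python
try:
    import polars as pl
except ImportError:  # Polars is optional for this sketch
    pl = None


def arr_max_with_none():
    """Run the panicking case from the report; return None if Polars is not installed."""
    if pl is None:
        return None
    df = pl.DataFrame(
        data={"a": [[1, 2], [3, 4], [None, 5]]},
        schema={"a": pl.Array(pl.Int64, 2)},
    )
    # Row-wise maximum of the fixed-width array column, as a plain Python list.
    return df.select(pl.col("a").arr.max()).to_series().to_list()


result = arr_max_with_none()
print(result)
```

On a fixed version the printed list should match the expected behavior from the report, i.e. 2, 4 and 5.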
