Diagonal concat when lazyframe other than first item contains list dtype for a column not in first item causes incompatible types error #18911

Closed
jaredcolerosenberg opened this issue Sep 25, 2024 · 1 comment · Fixed by #18916
Labels
accepted Ready for implementation bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars

jaredcolerosenberg commented Sep 25, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl
a = pl.LazyFrame({"a": [[1]], "b": [1]})
b = pl.LazyFrame({"c": [1, 2]})
df = pl.concat([b, a], how="diagonal", rechunk=True).collect()

Log output

UNION: union is run in parallel
Traceback (most recent call last):
  File "/home/jared/bla/bla-python/bla-entity/bug_report.py", line 42, in <module>
    df = pl.concat([b, a], how="diagonal", rechunk=True).collect()
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jared/bla/bla-python/bla-entity/venv/lib/python3.12/site-packages/polars/lazyframe/frame.py", line 2033, in collect
    return wrap_df(ldf.collect(callback))
                   ^^^^^^^^^^^^^^^^^^^^^
polars.exceptions.SchemaError: type Int64 is incompatible with expected type Null

Issue description

  • Swapping the order of a and b (passing [a, b] instead of [b, a]) produces a correct diagonal concat.
  • Using eager DataFrames instead of LazyFrames produces the expected result.
  • The rechunk setting makes no difference: both rechunk=True and rechunk=False raise the exception.

Expected behavior

Expected the frames to be concatenated vertically, with nulls filled in for columns that are missing from either frame.

Installed versions

--------Version info---------
Polars:              1.8.1
Index type:          UInt32
Platform:            Linux-6.8.0-40-generic-x86_64-with-glibc2.35
Python:              3.12.2 (main, May  7 2024, 12:21:38) [GCC 11.4.0]

----Optional dependencies----
adbc_driver_manager  <not installed>
altair               <not installed>
cloudpickle          <not installed>
connectorx           <not installed>
deltalake            0.20.0
fastexcel            <not installed>
fsspec               <not installed>
gevent               <not installed>
great_tables         <not installed>
matplotlib           <not installed>
nest_asyncio         <not installed>
numpy                2.1.1
openpyxl             <not installed>
pandas               <not installed>
pyarrow              17.0.0
pydantic             2.9.2
pyiceberg            <not installed>
sqlalchemy           <not installed>
torch                <not installed>
xlsx2csv             <not installed>
xlsxwriter           <not installed>

@jaredcolerosenberg jaredcolerosenberg added bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars labels Sep 25, 2024
@cmdlineluser
Contributor

Even diagonal_relaxed produces the same error.

The example works for me in 1.7.1, so this appears to be a regression introduced in 1.8.0.
