`rows_by_key` works with pl.Array #18813

gab23r · 2024-09-18T12:13:35Z

Description

I have this dataframe:

import polars as pl
df = (
    pl.DataFrame([
        pl.Series(name = "a", values = [[0,0], [0,1]], dtype=pl.Array(pl.Int64, 2)),
        pl.Series(name = "b", values = ["a", "b"])
    ])
)
# shape: (2, 2)
# ┌───────────────┬─────┐
# │ a             ┆ b   │
# │ ---           ┆ --- │
# │ array[i64, 2] ┆ str │
# ╞═══════════════╪═════╡
# │ [0, 0]        ┆ a   │
# │ [0, 1]        ┆ b   │
# └───────────────┴─────┘

And I want to transform this dataframe into this dictionary:
{(0,0): "a", (0, 1): "b"}

The way to do it would be to use df.rows_by_key("a"), but it fails with TypeError: unhashable type: 'list'
This is because a pl.Array value becomes a Python list, which cannot be a dictionary key. It would be great if pl.Array values were converted to tuples instead.
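
To illustrate the failure outside of polars, here is a minimal pure-Python sketch. The `rows` literal is a hypothetical stand-in for what the row conversion hands back, assuming (as described above) that each Array value arrives as a Python list:

```python
# Hypothetical stand-in for the converted rows: the Array column "a"
# has become a Python list, the string column "b" a plain str.
rows = [([0, 0], "a"), ([0, 1], "b")]

error_message = None
try:
    # Lists are mutable and therefore unhashable, so this raises TypeError.
    broken = {key: value for key, value in rows}
except TypeError as exc:
    error_message = str(exc)

# Converting each key to a tuple makes it hashable and usable as a dict key.
by_key = {tuple(key): value for key, value in rows}
print(by_key)
```

This is only a sketch of the requested behaviour, not how polars builds the dictionary internally.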

I think it makes a lot of sense overall, and this way we could translate any polars dataframe to builtin Python types:

pl.List -> list
pl.Struct -> dict
pl.Array -> tuple

gab23r · 2024-11-06T14:01:47Z

I just realized that although rows_by_key doesn't work with pl.Array, it accepts a list of columns as the key, so this workaround works fine and is even better:

df.select(pl.col("a").arr.to_struct().struct.unnest(), "b").rows_by_key(("field_0", "field_1"), unique=True)
# {(0, 0): 'a', (0, 1): 'b'}

gab23r added the enhancement New feature or an improvement of an existing feature label Sep 18, 2024

gab23r closed this as completed Nov 6, 2024
