I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of Polars.
Reproducible example
import polars as pl

# PROBLEM:
# Aggregations should return scalars, not Series of varying size (see the problems and explanation below)
pl.Series(['a', 'b'], dtype=pl.Utf8).str.concat()  # -> Series (1,) ["a-b"]
pl.Series([], dtype=pl.Utf8).str.concat()  # -> Series (0,) []

# other aggregation (max)
pl.Series([1, 2], dtype=pl.UInt32).max()  # -> 2
pl.Series([], dtype=pl.UInt32).max()  # -> None
Log output
No response
Issue description
str.concat should behave like other vertical aggregation functions and always return a single value.
Otherwise it leads to confusing and problematic behaviour, such as that reported in #12030.
Example Problem from linked issue:
df = pl.DataFrame(
    {
        "id": ["1", "1", "2"],
        "text": ["a", "b", "c"],  # df1
        # "text": ["a", "b", None],  # df2: None instead of "c"
    }
)
df.group_by("id").agg(
    list=pl.col("text").drop_nulls(),
    concat=pl.col("text").drop_nulls().str.concat(),
)
# df1: expected behaviour
┌─────┬────────────┬────────┐
│ id  ┆ list       ┆ concat │
│ --- ┆ ---        ┆ ---    │
│ str ┆ list[str]  ┆ str    │
╞═════╪════════════╪════════╡
│ 1   ┆ ["a", "b"] ┆ a-b    │
│ 2   ┆ ["c"]      ┆ c      │
└─────┴────────────┴────────┘
# df2: whhhyyy? =)
┌─────┬────────────┬───────────┐
│ id  ┆ list       ┆ concat    │
│ --- ┆ ---        ┆ ---       │
│ str ┆ list[str]  ┆ list[str] │  # >>>>> expected "str"
╞═════╪════════════╪═══════════╡
│ 1   ┆ ["a", "b"] ┆ ["a-b"]   │  # >>>>> expected "a-b", as above
│ 2   ┆ []         ┆ []        │  # >>>>> expected "" because `str.concat` on an empty list should be "", not a shape (0,) Series
└─────┴────────────┴───────────┘
Doing the same where the aggregation always produces a single value works fine:
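The scalar-per-group behaviour can be sketched in plain Python, without Polars. This is only an illustration of the semantics requested in this issue, not how Polars implements `str.concat`; `concat_agg` and `group_concat` are hypothetical helper names, and the `-` delimiter is taken from the outputs shown above.

```python
from collections import defaultdict


def concat_agg(values, delimiter="-"):
    """Join the non-null strings; always returns one str ("" for empty input)."""
    return delimiter.join(v for v in values if v is not None)


def group_concat(rows):
    """Group (id, text) pairs by id and reduce each group to a single string."""
    groups = defaultdict(list)
    for key, text in rows:
        groups[key].append(text)
    return {key: concat_agg(texts) for key, texts in groups.items()}


# Same data as df1 and df2 in the example above, as (id, text) pairs.
df1 = [("1", "a"), ("1", "b"), ("2", "c")]
df2 = [("1", "a"), ("1", "b"), ("2", None)]

result1 = group_concat(df1)
result2 = group_concat(df2)
print(result1)  # {'1': 'a-b', '2': 'c'}
print(result2)  # {'1': 'a-b', '2': ''}
```

Because `concat_agg` returns a scalar for every group, including the group whose only value is null, the result keeps one string per `id` in both cases instead of switching to a list.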
Expected behavior
str.concat should always return a single value.
Installed versions