Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(python,rust): str.concat on empty list #12066

Merged
merged 3 commits into from
Oct 28, 2023

Conversation

Julian-J-S
Copy link
Contributor

aggregation methods should always return a single value, otherwise it can lead to unexpected behavior and bugs (e.g. #12053)

current behavior:

  • empty list: return Series of shape (0,) >>>>> this is the bug
  • otherwise: return Series of shape (1,) with the concatenated/joined string

PR:

pl.Series([], dtype=pl.Utf8).str.concat()

# OLD
shape: (0,)
Series: '' [str]
[
]

# NEW
shape: (1,)
Series: '' [str]
[
	""
]

Note: this does not address the issue of skipping nulls (#10814) but only fixes the bug for empty lists

@github-actions github-actions bot added fix Bug fix python Related to Python Polars rust Related to Rust Polars labels Oct 27, 2023
@reswqa
Copy link
Collaborator

reswqa commented Oct 27, 2023

I personally +1 for this change and it keep the behavior before #11488.

I will take a look at the remaining issue of skipping nulls.

@Julian-J-S
Copy link
Contributor Author

I will take a look at the remaining issue of skipping nulls.

If you work on this you should probably look into here #6919
I wrote a small summary of the problem and ritchie also said:

I would accept a PR with "ignore" and "raise" as null_behavior arguments.

So I think the goal is to either "ignore" or "raise" if nulls occur (current problem is that they appear as null in text form :D)

@reswqa
Copy link
Collaborator

reswqa commented Oct 27, 2023

So I think the goal is to either "ignore" or "raise" if nulls occur (current problem is that they appear as null in text form :D)

I'm a little hesitant to add the raise option, for the same reasons mentioned in #10814 (comment). And I also support ignore and propogate null(which means ["foo", null, "bar"].str.concat() == null). Of course, I'm open to discussion :)

@orlp
Copy link
Collaborator

orlp commented Oct 27, 2023

@reswqa I would greenlight implementation with ignore and propagate, with ignore as the default. We can always add a raise option later if we do change our mind.

@reswqa
Copy link
Collaborator

reswqa commented Oct 27, 2023

@orlp Ah, make sense. I will draft this recently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix Bug fix python Related to Python Polars rust Related to Rust Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants