Infer large_string type as pyarrow_numpy strings #54826

Merged 3 commits on Sep 2, 2023

phofl (Member) commented Aug 28, 2023

This covers the case where we don't overflow.

phofl added the Strings (String extension data type and string data) and Arrow (pyarrow functionality) labels Aug 28, 2023
phofl changed the title from "Infer large_string as pyarrow_numpy strings" to "Infer large_string type as pyarrow_numpy strings" Aug 28, 2023
phofl added this to the 2.1.1 milestone Sep 1, 2023
jorisvandenbossche (Member) commented

Looks good to me. We should still handle the case of an array that is too large and would need to be split before it can be cast to "string", but that's fine for a follow-up. Can you open an issue for that?
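
For context on the follow-up case: Arrow's `string` type stores 32-bit signed offsets, so one array can hold at most `2**31 - 1` bytes of string data, while `large_string` uses 64-bit offsets. The sketch below is a pure-Python illustration of the kind of splitting that would be needed, not the pandas implementation. The helper name `split_for_int32_offsets` and its `max_bytes` parameter are hypothetical and exist only for this example.

```python
INT32_MAX = 2**31 - 1  # largest offset a 32-bit Arrow "string" array can store


def split_for_int32_offsets(values, max_bytes=INT32_MAX):
    """Group strings into chunks whose UTF-8 payload fits within max_bytes.

    Hypothetical helper for illustration only; not part of pandas or pyarrow.
    """
    chunks, current, current_bytes = [], [], 0
    for value in values:
        # Missing values contribute no bytes to the data buffer.
        size = 0 if value is None else len(value.encode("utf-8"))
        if size > max_bytes:
            raise ValueError("a single value exceeds the offset limit")
        # Start a new chunk when adding this value would pass the limit.
        if current and current_bytes + size > max_bytes:
            chunks.append(current)
            current, current_bytes = [], 0
        current.append(value)
        current_bytes += size
    if current:
        chunks.append(current)
    return chunks
```

With a small artificial limit, `split_for_int32_offsets(["ab", "cd", "e"], max_bytes=4)` yields two chunks, `["ab", "cd"]` and `["e"]`. Each chunk could then be cast to `string` on its own, assuming the result is allowed to be chunked.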

phofl (Member, Author) commented Sep 2, 2023

I was planning on keeping the other issue open.

phofl merged commit 1539526 into pandas-dev:main Sep 2, 2023
phofl deleted the 54798 branch September 2, 2023 18:15
meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Sep 2, 2023
phofl added a commit that referenced this pull request Sep 2, 2023
…w_numpy strings) (#54969)

Backport PR #54826: Infer large_string type as pyarrow_numpy strings

Co-authored-by: Patrick Hoefler <[email protected]>
mroeschke pushed a commit to mroeschke/pandas that referenced this pull request Sep 11, 2023