Fix pickle and unpickling for all objects #17980

Merged
merged 8 commits into from
Feb 14, 2025

Conversation

galipremsagar
Contributor

@galipremsagar galipremsagar commented Feb 11, 2025

Description

Fixes: #15459

This PR fixes hangs when pickling and unpickling by registering custom pickling and unpickling methods for all proxied types.
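
For readers unfamiliar with the mechanism, registering a custom pickling method for a proxy type can be sketched with the standard library's copyreg module. This is only an illustration under stated assumptions, not the code in this PR: `Proxy`, `_reduce_proxy`, and `_rebuild_proxy` are hypothetical names, and a plain list stands in for the wrapped object.

```python
import copyreg
import pickle


class Proxy:
    """Hypothetical proxy that forwards attribute access to a wrapped object."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        # Only reached when normal lookup fails. Refuse dunder names and
        # "_wrapped" itself so a half-built instance cannot recurse forever.
        if name.startswith("__") or name == "_wrapped":
            raise AttributeError(name)
        return getattr(self._wrapped, name)


def _rebuild_proxy(wrapped):
    """Hypothetical unpickling hook: re-wrap the restored object."""
    return Proxy(wrapped)


def _reduce_proxy(proxy):
    """Hypothetical pickling hook: tell pickle how to rebuild the proxy."""
    return _rebuild_proxy, (proxy._wrapped,)


# Register the reducer in copyreg's global dispatch table, so pickle uses it
# for Proxy instances instead of the default object reduction.
copyreg.pickle(Proxy, _reduce_proxy)

restored = pickle.loads(pickle.dumps(Proxy([1, 2, 3])))
```

Because the reducer is looked up by exact type, a sketch like this would need one registration per proxied type, which matches the "for all proxied types" wording above.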

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@galipremsagar galipremsagar added bug Something isn't working non-breaking Non-breaking change labels Feb 11, 2025
@galipremsagar galipremsagar self-assigned this Feb 11, 2025

copy-pr-bot bot commented Feb 11, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@github-actions github-actions bot added Python Affects Python cuDF API. cudf.pandas Issues specific to cudf.pandas labels Feb 11, 2025
@galipremsagar
Contributor Author

/okay to test

@galipremsagar
Contributor Author

/okay to test

@galipremsagar galipremsagar added the 3 - Ready for Review Ready for review by team label Feb 12, 2025
@galipremsagar galipremsagar marked this pull request as ready for review February 12, 2025 23:45
@galipremsagar galipremsagar requested a review from a team as a code owner February 12, 2025 23:45
@galipremsagar galipremsagar requested a review from vyasr February 13, 2025 00:17
Contributor

@vyasr vyasr left a comment

So IIUC, this works because when we fall back to pd.read_pickle, pandas calls the pickle methods on its own types under the hood, so we're intercepting them at the lower level of the copyreg module? That seems generally fine in the pickle.load/pickle.dump case, but does it make us reliant on the internals of pd.read_pickle working in a certain way and calling the pickle methods?

python/cudf/cudf/pandas/_wrappers/pandas.py
@galipremsagar
Contributor Author

That seems generally fine in the pickle.load/pickle.dump case, but does it make us reliant on the internals of pd.read_pickle working in a certain way and calling the pickle methods?

We should generally be fine because pd.read_pickle and to_pickle are currently thin wrappers. If their implementations evolve and become more complex, we will need to bring those implementations in-house. As of now, I don't see a need for that.
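
To make the trade-off discussed above concrete, here is a small standard-library sketch. It is not pandas or cudf.pandas code: `Proxy`, `thin_to_pickle`, and `PrivateTablePickler` are hypothetical names used only for illustration. A thin wrapper that simply calls pickle.dump picks up a reducer registered through copyreg, while a pickler that carries its own dispatch_table does not consult copyreg's global table.

```python
import copyreg
import io
import pickle

calls = []  # records each time the registered reducer runs


class Proxy:
    """Hypothetical proxy type used only for this illustration."""

    def __init__(self, wrapped):
        self._wrapped = wrapped


def _reduce_proxy(proxy):
    calls.append("reduce")
    return Proxy, (proxy._wrapped,)


# Register the reducer globally through copyreg.
copyreg.pickle(Proxy, _reduce_proxy)


def thin_to_pickle(obj, buffer):
    """Hypothetical thin wrapper: delegates straight to pickle.dump."""
    pickle.dump(obj, buffer)


class PrivateTablePickler(pickle.Pickler):
    # A pickler with its own dispatch_table uses that table instead of the
    # global one managed by copyreg, so the registered reducer is skipped.
    dispatch_table = {}


thin_to_pickle(Proxy([1, 2]), io.BytesIO())
calls_after_thin = len(calls)  # the copyreg reducer ran once

PrivateTablePickler(io.BytesIO()).dump(Proxy([1, 2]))
calls_after_private = len(calls)  # unchanged: the reducer was bypassed
```

The second case is the kind of implementation change that would break the interception, which is presumably when bringing the implementations in-house would become necessary.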

Contributor

@vyasr vyasr left a comment

Thanks for the explanation; that makes sense, Prem. LGTM.

@galipremsagar galipremsagar added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team labels Feb 14, 2025
@galipremsagar
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit c3d6b4c into rapidsai:branch-25.04 Feb 14, 2025
108 of 111 checks passed
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge bug Something isn't working cudf.pandas Issues specific to cudf.pandas non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[BUG] Unpickling objects with pd.read_pickle() doesn't work with cudf.pandas enabled
2 participants