Allow dependency numpy to be >= 2.0.0 #4882

Open
lorenzwalthert opened this issue Oct 2, 2024 · 11 comments
Labels: component: utility apis, dependencies

Comments

@lorenzwalthert

lorenzwalthert commented Oct 2, 2024

Describe the feature you'd like
In June 2024, numpy 2.0.0 was released. sagemaker-python-sdk depends on numpy>=1.9.0,<2.0. This creates a dependency hell for me, as I have dependencies in my python package that depend on numpy >= 2.0.0.

You could either enforce numpy >= 2.0.0 and make new releases of the package incompatible with numpy < 2.0.0 or keep supporting the currently supported numpy versions, but also add those >= 2.0.0. I.e. depending on whether or not there are breaking changes with numpy >= 2.0.0 in your code base, establish different code paths depending on the installed version of numpy.
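The second option, separate code paths per installed NumPy major version, could look roughly like the sketch below. It is not taken from the SDK: the helper names are hypothetical, and `trapz`/`trapezoid` serves only as one well-known example of an API that differs between NumPy 1.x and 2.x.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the major component of a version string such as '1.26.4'."""
    return int(version.split(".")[0])


def installed_numpy_major():
    """Major version of the installed numpy, or None if numpy is absent."""
    try:
        return major_version(metadata.version("numpy"))
    except metadata.PackageNotFoundError:
        return None


def integrator_name(numpy_major: int) -> str:
    """Pick the trapezoidal-rule function name for a NumPy major version.

    np.trapezoid was added in NumPy 2.0, where np.trapz was deprecated.
    """
    return "trapezoid" if numpy_major >= 2 else "trapz"
```

A caller would then resolve the function once, for example with `getattr(np, integrator_name(installed_numpy_major()))`, and keep the rest of the code version-agnostic.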

How would this feature be used? Please describe.

Ensuring I can resolve dependencies in Python packages that have both this SDK as well as other dependencies with a requirement for numpy >= 2.0.

Describe alternatives you've considered

I don't think there is an alternative, as in the long run, this problem will get worse as more and more other packages depend on numpy >= 2.0.0. I am surprised no one has opened an issue until now.

Additional context
I am also opening a support case with AWS Premium Support.

@seberg

seberg commented Oct 8, 2024

Is there anything holding up removal of the pin? From a quick scan, I would think it is basically a few tiny replacements for things like np.NaN which ruff check path/to/code/ --select NPY201 --fix will just do.
There are a few uses of np.int which might be fishy if (and only if!) this code ever runs on Windows (otherwise, it may be fishy, but there is no change in NumPy 2).
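To illustrate the kind of "tiny replacement" meant here, the sketch below rewrites a handful of aliases that were removed in NumPy 2.0. It is only an illustration and not how ruff implements NPY201: the table is deliberately incomplete, the function name is hypothetical, and a regex cannot tell code from strings or comments, so the ruff command above remains the better tool.

```python
import re

# A few aliases removed in NumPy 2.0 and their replacements (not exhaustive).
REMOVED_ALIASES = {
    "NaN": "nan",
    "Inf": "inf",
    "infty": "inf",
    "float_": "float64",
    "complex_": "complex128",
    "unicode_": "str_",
    "string_": "bytes_",
}

# Match `np.<alias>` or `numpy.<alias>` as whole identifiers only.
_PATTERN = re.compile(
    r"\b(np|numpy)\.(" + "|".join(map(re.escape, REMOVED_ALIASES)) + r")\b"
)


def rewrite_removed_aliases(source: str) -> str:
    """Replace removed NumPy aliases in `source` with their NumPy 2 names."""
    return _PATTERN.sub(
        lambda m: f"{m.group(1)}.{REMOVED_ALIASES[m.group(2)]}", source
    )
```

Names that already use the surviving spelling, such as `np.nan`, are left untouched.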

@jakirkham

@ellisonbg do you know who we should talk to about NumPy 2 support here?

@victoriarouton

Any status on this? When could we expect this support to be added?

@radoshi

radoshi commented Dec 10, 2024

Any updates here? We're in the same situation.

@aaravind100

Hi @nargokul, I see this pr #4955 is merged. Would this be part of the next release?

@wickeat

wickeat commented Dec 16, 2024

Seems like pr #4963 reverted the change for numpy. Was it intentional? @nargokul

@amorisot

same

@lorenzwalthert
Author

lorenzwalthert commented Dec 30, 2024

@knikure and @zhaoqizqwang, you reviewed the PR. numpy compatibility is a big issue for many people here. Any idea?

@wickeat

wickeat commented Jan 10, 2025

@nargokul It seems the numpy update PR was actively closed, while another PR is currently open. Is there any blocker for the update?

@mufaddal-rohawala
Member

There are compatibility challenges in upgrading Amazon SageMaker Python SDK with NumPy 2.0 since pandas library currently lacks support for NumPy 2.0. We will incorporate the latest NumPy version once pandas provides the necessary support. Your patience is appreciated as we await this external dependency.

The update from Pandas has a caveat mentioned below in https://pandas.pydata.org/pandas-docs/stable/whatsnew/v2.2.2.html

One major caveat is that arrays created with numpy 2.0’s new StringDtype will convert to object dtyped arrays upon Series/DataFrame creation. Full support for numpy 2.0’s StringDtype is expected to land in pandas 3.0.

This causes regressions in SageMaker PySDK functionality, and hence we will need to wait for pandas 3.0 to make this update.

Please reach out to our support team if you have any further inquiries.

@seberg

seberg commented Jan 23, 2025

> since pandas library currently lacks support for NumPy 2.0.

Pandas has had reasonable support in newer versions (2.2.2 and later) since the first release of NumPy 2. There might be older pandas versions which lack a numpy<2 pin, making it possible to accidentally install incompatible versions.

But unless you have another dependency that still enforces pandas<2.2.2, that should not stop you from updating.
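As a rough sketch of that rule, the check below treats a pandas/NumPy pair as acceptable when NumPy is older than 2 or pandas is at least 2.2.2. The 2.2.2 threshold comes from this thread and the pandas release notes linked above. The function names are hypothetical, and assuming every pandas version works with NumPy 1.x is a simplification.

```python
import re


def release_tuple(version: str) -> tuple:
    """Leading numeric release segment, e.g. '2.0.0rc1' -> (2, 0, 0)."""
    match = re.match(r"\d+(\.\d+)*", version)
    if match is None:
        raise ValueError(f"not a version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def pandas_supports_numpy(pandas_version: str, numpy_version: str) -> bool:
    """True if the pandas/NumPy pair is expected to work together."""
    if release_tuple(numpy_version) < (2,):
        # Simplification: assume any pandas works with NumPy 1.x.
        return True
    # NumPy 2 needs pandas 2.2.2 or newer.
    return release_tuple(pandas_version) >= (2, 2, 2)
```

A check like this could run at import time or in CI to catch the accidental installs mentioned above before they surface as obscure runtime errors.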

> The update from Pandas has a caveat mentioned below [StringDType not supported]

Sorry, but there seems to be a misunderstanding here: The new StringDType in NumPy should not be relevant. It is new and simply doesn't affect existing code.
(If this somehow is a problem, it would be a problem in a third-party dependency, i.e. one that immediately started using StringDType, and I don't think you should worry about that.)

> Please reach out to our support team if you have any further inquiries.

@mufaddal-rohawala it would be helpful if there was an issue to discuss the exact problem. If this seems like a NumPy/pandas-related difficulty, NumPy and pandas maintainers are certainly here to help. Please do reach out to me (or pandas maintainers, e.g. with an issue there).

nargokul self-assigned this Feb 3, 2025
nargokul added the dependencies and component: utility apis labels Feb 3, 2025
10 participants