
Phi3-mini-INT4 performance speedup with numpy 2.0 or above #899

Closed
bkaruman opened this issue Sep 18, 2024 · 6 comments

Comments

@bkaruman

Hi!

I am observing significant speedup in tokenization and prompt processing throughput with numpyv2 on Intel CPUs.
Can someone help understand how GenAI makes use of numpy during tokenization/prompt phase?
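A quick way to confirm which numpy version a given Python environment actually resolves when comparing runs (a minimal check, not specific to onnxruntime-genai):

```python
import numpy as np

# Report the numpy version in use; major version 2+ means the numpy 2.x line.
major = int(np.__version__.split(".")[0])
print(f"numpy {np.__version__} ({'2.x line' if major >= 2 else '1.x line'})")
```

Running this in each environment before benchmarking rules out the case where the interpreter silently picks up a different numpy than the one you think you installed.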

natke commented Sep 23, 2024

Hi @bkaruman, thanks for bringing this to our attention. Which Intel CPU are you running on?

@bkaruman (Author)

> Hi @bkaruman, thanks for bringing this to our attention. Which Intel CPU are you running on?

Intel Core Ultra Meteor Lake


natke commented Sep 23, 2024

Thanks. And the only difference is you upgraded to 0.4.0?

How long is the prompt you are running? (Or range of lengths)

@bkaruman (Author)

> Thanks. And the only difference is you upgraded to 0.4.0?

Yes, here is the environment I use: ONNX Runtime 1.19.0 + ONNX Runtime GenAI 0.4.0 + numpy 2.0.0. I see a significant performance difference between numpy 2.0.0 and numpy 1.24.4.

> How long is the prompt you are running? (Or range of lengths)

I am running with prompt lengths of 1024 and 2048, and generation lengths of 1024 and 2048.


natke commented Sep 23, 2024

To clarify, you use exactly the same configuration and the only difference is the numpy version?

Can you quantify the speedup?
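One way to quantify the speedup is a simple wall-clock harness around the tokenization step. The `encode` function below is a hypothetical stand-in (it just converts bytes to an int64 array); a real measurement would call the onnxruntime-genai tokenizer in its place, once per numpy environment:

```python
import time
import numpy as np

def encode(prompt: str) -> np.ndarray:
    # Hypothetical stand-in for a real tokenizer call; returns fake token ids.
    return np.frombuffer(prompt.encode("utf-8"), dtype=np.uint8).astype(np.int64)

def tokens_per_second(prompt: str, repeats: int = 100) -> float:
    # Time `repeats` tokenization passes and report aggregate throughput.
    start = time.perf_counter()
    n_tokens = 0
    for _ in range(repeats):
        n_tokens += encode(prompt).size
    elapsed = time.perf_counter() - start
    return n_tokens / elapsed

prompt = "hello world " * 170  # roughly a 2 KB prompt
print(f"{tokens_per_second(prompt):,.0f} tokens/sec")
```

Running the same harness under numpy 1.24.4 and numpy 2.0.0, with everything else pinned, would isolate the numpy contribution to the reported difference.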


natke commented Oct 7, 2024

Hi @bkaruman, we can't reproduce the speedup with numpy 2. Closing this issue for now. Please re-open if you continue to observe this behavior.

natke closed this as completed Oct 7, 2024