Phi3-mini-INT4 performance speedup with numpy 2.0 or above #899
Comments
Hi @bkaruman, thanks for bringing this to our attention. Which Intel CPU are you running on?
Intel Core Ultra (Meteor Lake)
Thanks. And the only difference is that you upgraded to 0.4.0? How long is the prompt you are running (or what range of lengths)?
To clarify: you use exactly the same configuration, and the only difference is the numpy version? Can you quantify the speedup?
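One way to quantify this is to time tokenization and prefill separately under each numpy version. The following is a minimal sketch, assuming the onnxruntime-genai 0.4.x Python API (where `params.input_ids` and `generator.compute_logits()` are still available) and a hypothetical local model path:

```python
import time

import numpy as np
import onnxruntime_genai as og

# Hypothetical local path to the INT4 Phi-3-mini model directory (assumption).
model = og.Model("./phi3-mini-4k-instruct-int4")
tokenizer = og.Tokenizer(model)

prompt = "Explain the difference between a list and a tuple in Python. " * 16

# Time tokenization in isolation.
t0 = time.perf_counter()
input_tokens = tokenizer.encode(prompt)
t1 = time.perf_counter()

params = og.GeneratorParams(model)
params.set_search_options(max_length=len(input_tokens) + 64)
params.input_ids = input_tokens  # 0.4.x-style input; newer releases use append_tokens()
generator = og.Generator(model, params)

# Time the prefill (prompt-processing) step: the first compute_logits() call
# runs the model over the entire prompt.
t2 = time.perf_counter()
generator.compute_logits()
t3 = time.perf_counter()

print(f"numpy {np.__version__}")
print(f"tokenize: {len(input_tokens)} tokens in {(t1 - t0) * 1e3:.2f} ms")
print(f"prefill:  {len(input_tokens) / (t3 - t2):.1f} tokens/s")
```

Running the same script in two environments that differ only in the numpy version would isolate numpy's contribution to the measured throughput.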
Hi @bkaruman, we can't reproduce the speedup with numpy 2. Closing this issue for now. Please re-open if you continue to observe this behavior.
Hi!
I am observing a significant speedup in tokenization and prompt-processing throughput with numpy 2 on Intel CPUs.
Can someone help me understand how GenAI uses numpy during the tokenization/prompt phase?
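For context, one place numpy enters the picture is at the Python/native boundary of the onnxruntime-genai bindings: token ids and runtime outputs cross it as numpy arrays, while the tokenizer and ONNX Runtime execution themselves run in native code. A minimal sketch of those touchpoints, assuming the 0.4.x Python API, a hypothetical model path, and that `get_output()` is available in your build:

```python
import onnxruntime_genai as og

# Hypothetical model path (assumption).
model = og.Model("./phi3-mini-4k-instruct-int4")
tokenizer = og.Tokenizer(model)

# encode() hands the token ids back to Python as an array; this is one
# boundary where numpy is involved.
input_tokens = tokenizer.encode("Hello, world!")
print(type(input_tokens), getattr(input_tokens, "dtype", None))

params = og.GeneratorParams(model)
params.input_ids = input_tokens  # ids are passed to the native runtime here
generator = og.Generator(model, params)
generator.compute_logits()

# get_output() exposes runtime tensors (e.g. the logits) back to Python as
# numpy arrays; another boundary where the numpy version could matter.
logits = generator.get_output("logits")
print(type(logits), logits.shape)
```

If numpy 2 changes the measured throughput, the array conversion and copying at these boundaries is the first place worth profiling, since the model execution itself does not depend on numpy.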