Phi3-mini-INT4 performance speedup with numpy 2.0 or above #899
Comments
Hi @bkaruman, thanks for bringing this to our attention. Which Intel CPU are you running on?
Intel Core Ultra (Meteor Lake)
Thanks. And the only difference is that you upgraded to 0.4.0? How long is the prompt you are running (or what range of lengths)?
To clarify: you use exactly the same configuration, and the only difference is the numpy version? Can you quantify the speedup?
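One way to quantify this is to time tokenization and prefill separately under each numpy version. The following is a minimal sketch, assuming the onnxruntime-genai 0.4.x Python API (where `params.input_ids` and `generator.compute_logits()` are still available) and a hypothetical local model path:

```python
import time

import numpy as np
import onnxruntime_genai as og

# Hypothetical local path to the INT4 Phi-3-mini model directory (assumption).
model = og.Model("./phi3-mini-4k-instruct-int4")
tokenizer = og.Tokenizer(model)

prompt = "Explain the difference between a list and a tuple in Python. " * 16

# Time tokenization in isolation.
t0 = time.perf_counter()
input_tokens = tokenizer.encode(prompt)
t1 = time.perf_counter()

params = og.GeneratorParams(model)
params.set_search_options(max_length=len(input_tokens) + 64)
params.input_ids = input_tokens  # 0.4.x-style input; newer releases use append_tokens()
generator = og.Generator(model, params)

# Time the prefill (prompt-processing) step: the first compute_logits() call
# runs the model over the entire prompt.
t2 = time.perf_counter()
generator.compute_logits()
t3 = time.perf_counter()

print(f"numpy {np.__version__}")
print(f"tokenize: {len(input_tokens)} tokens in {(t1 - t0) * 1e3:.2f} ms")
print(f"prefill:  {len(input_tokens) / (t3 - t2):.1f} tokens/s")
```

Running the same script in two environments that differ only in the numpy version would isolate numpy's contribution to the measured throughput.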
Hi @bkaruman, we can't reproduce the speedup with numpy 2. Closing this issue for now. Please re-open if you continue to observe this behavior.
Hi!
I am observing a significant speedup in tokenization and prompt-processing throughput with numpy 2 on Intel CPUs.
Can someone help me understand how GenAI uses numpy during the tokenization/prompt phase?
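For context, one place numpy enters the picture is at the Python/native boundary of the onnxruntime-genai bindings: token ids and runtime outputs cross it as numpy arrays, while the tokenizer and ONNX Runtime execution themselves run in native code. A minimal sketch of those touchpoints, assuming the 0.4.x Python API, a hypothetical model path, and that `get_output()` is available in your build:

```python
import onnxruntime_genai as og

# Hypothetical model path (assumption).
model = og.Model("./phi3-mini-4k-instruct-int4")
tokenizer = og.Tokenizer(model)

# encode() hands the token ids back to Python as an array; this is one
# boundary where numpy is involved.
input_tokens = tokenizer.encode("Hello, world!")
print(type(input_tokens), getattr(input_tokens, "dtype", None))

params = og.GeneratorParams(model)
params.input_ids = input_tokens  # ids are passed to the native runtime here
generator = og.Generator(model, params)
generator.compute_logits()

# get_output() exposes runtime tensors (e.g. the logits) back to Python as
# numpy arrays; another boundary where the numpy version could matter.
logits = generator.get_output("logits")
print(type(logits), logits.shape)
```

If numpy 2 changes the measured throughput, the array conversion and copying at these boundaries is the first place worth profiling, since the model execution itself does not depend on numpy.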