ipex backend enhancements #272
yao-matrix
commented
Sep 23, 2024
- Add a feature-extraction task mapping for the ipex backend, to support benchmarking embedding models.
- Change the examples' no_weight to false. no_weight allocates weight buffers and randomly initializes them, which ruins performance in NUMA cases: a 2x perf drop in the decoding phase.
@IlyasMoutawwakil, please help review, thanks.
Hi! Are you sure about this? See optimum-benchmark/optimum_benchmark/backends/transformers_utils.py, lines 190 to 205 in 01e4e59.
It'll use the fastest one of them: optimum-benchmark/optimum_benchmark/backends/transformers_utils.py, lines 207 to 208 in 01e4e59.
How does that ruin performance?
…mb3dding models benchmark 2. change examples' no_weight to false, no_weight will allocate weight buffers and random initialize them, which will ruin performance in numa cases, 2x perf drop for decoding phases Signed-off-by: Yao, Matrix <[email protected]>
It seems the random init functions are mostly single-threaded (e.g. here), and the NUMA memory allocation strategy roughly follows an "allocate on first write" policy. So in the random initialization case, the weight memory is allocated near the core that executes the random initialization logic, which, for example, may all run in NUMA domain 0. When the model's forward computation happens, it may spread across NUMA domains 0 and 1, and the compute in NUMA domain 1 must fetch the data remotely from NUMA domain 0 (since the data was already placed there during random initialization), which incurs the cost of far memory access.
Data evidence, using a GCP c4-standard-96 instance to run
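The first-touch effect described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not a measurement and not optimum-benchmark code: the two-node layout, the page count, and the helper names `first_touch_placement` and `remote_access_fraction` are all hypothetical, and real page placement depends on the OS and the allocator.

```python
# Toy model of NUMA "first-touch" placement (illustrative assumptions only).
# A page is assumed to land on the node of whichever thread writes it first.

NUM_PAGES = 1000   # hypothetical number of weight pages
NUM_NODES = 2      # hypothetical two-socket machine


def first_touch_placement(num_pages, init_node_of_page):
    """Return the node each page lands on, given which node initializes it."""
    return [init_node_of_page(page) for page in range(num_pages)]


def remote_access_fraction(placement, compute_node_of_page):
    """Fraction of pages read by a node other than the one holding them."""
    remote = sum(
        1 for page, node in enumerate(placement)
        if compute_node_of_page(page) != node
    )
    return remote / len(placement)


def compute_node(page):
    # Assumed forward pass: node 0 reads the first half of the pages,
    # node 1 reads the second half.
    return page * NUM_NODES // NUM_PAGES


# Case 1: single-threaded random init running entirely on node 0.
single = first_touch_placement(NUM_PAGES, lambda page: 0)

# Case 2: each page is first written by the node that later reads it.
matched = first_touch_placement(NUM_PAGES, compute_node)

single_thread_remote = remote_access_fraction(single, compute_node)
matched_remote = remote_access_fraction(matched, compute_node)

print(f"single-threaded init: {single_thread_remote:.0%} remote pages")
print(f"matched init:         {matched_remote:.0%} remote pages")
```

In this model, single-threaded initialization leaves half of the pages remote to the node that reads them, while initialization matched to the compute layout leaves none. The model says nothing about the actual latency cost of a remote access.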
Very interesting behavior! Thanks for investigating it.