This reverts commit 6ff00a8.
The above commit causes the Llama 3.1 8B fp16 model to generate NaN logits
for prefill/decode.
Issue: #19506
Signed-off-by: archana-ramalingam <[email protected]>
Yes, the revert #19508 fixed it. I left the issue open because @pashu123 wanted to investigate why the reverted patch #19335 generated NaN logits in the first place.
What happened?
While running iree-run-module, Llama 3.1 8B fp16 prefill/decode generates NaN logits with a vmfb compiled after this commit: 6ff00a8
Steps to reproduce your issue
cd iree
git checkout 6ff00a8a008d06b604d4ca4e0ae6e601ae810b4f
git submodule update --init
cmake -G Ninja -B ../iree-build/ -S . \
  -DCMAKE_BUILD_TYPE=RelWithDebInfo \
  -DIREE_ENABLE_RUNTIME_TRACING=ON \
  -DIREE_ENABLE_ASSERTIONS=ON \
  -DIREE_ENABLE_SPLIT_DWARF=ON \
  -DIREE_ENABLE_THIN_ARCHIVES=ON \
  -DCMAKE_C_COMPILER=clang \
  -DCMAKE_CXX_COMPILER=clang++ \
  -DIREE_ENABLE_LLD=ON \
  -DIREE_BUILD_PYTHON_BINDINGS=ON \
  -DPython3_EXECUTABLE="$(which python)" \
  -DIREE_HAL_DRIVER_HIP=ON \
  -DIREE_HIP_TEST_TARGET_CHIP=gfx942 \
  -DIREE_TARGET_BACKEND_ROCM=ON \
  && cmake --build ../iree-build/
/data/llama3.1/8b/
cd iree
../iree-build/tools/iree-compile llama8b_f16_decomposed.mlir --iree-hip-target=gfx942 --iree-hal-target-backends=rocm -o=llama8b_f16_decomposed.vmfb
../iree-build/tools/iree-run-module \
  --hip_use_streams=true \
  --device_allocator=caching \
  --module=llama8b_f16_decomposed.vmfb \
  --parameters=model=8b_f16.irpa \
  --device=hip://0 \
  --function=prefill_bs4 \
  --input=@~/tmp/prefill_args_bs4_128_stride_32/tokens.npy \
  --input=@~/tmp/prefill_args_bs4_128_stride_32/seq_lens.npy \
  --input=@~/tmp/prefill_args_bs4_128_stride_32/seq_block_ids.npy \
  --input=@~/tmp/prefill_args_bs4_128_stride_32/cs_f16.npy
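To confirm the symptom without reading the printed tensor by eye, one option is to save the prefill output to a .npy file and count the non-finite entries with NumPy. The sketch below is not part of the original report: the helper name summarize_logits and the file name prefill_logits.npy are hypothetical, and it assumes the logits were written out as a NumPy array (for example through an output-to-file option of iree-run-module, if your build provides one).

```python
import numpy as np


def summarize_logits(logits: np.ndarray) -> dict:
    """Count NaN and Inf entries in a logits array (hypothetical helper)."""
    nan_count = int(np.isnan(logits).sum())
    inf_count = int(np.isinf(logits).sum())
    return {
        "shape": tuple(logits.shape),
        "dtype": str(logits.dtype),
        "nan_count": nan_count,
        "inf_count": inf_count,
        # True only when every element is a finite number.
        "all_finite": nan_count == 0 and inf_count == 0,
    }


# Usage (assumes the output was saved as prefill_logits.npy; name is hypothetical):
#   logits = np.load("prefill_logits.npy")
#   print(summarize_logits(logits))
```

Running the same check on a vmfb compiled before and after 6ff00a8 gives a quick pass/fail signal for the regression, and the separate Inf count can hint at whether an fp16 overflow precedes the NaNs.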
What component(s) does this issue relate to?
Compiler
Version information
commit SHA: 6ff00a8a008d06b604d4ca4e0ae6e601ae810b4f
Additional context
No response