[CK_TILE] Implement fp8 quant tests/examples for layernorm and rmsnorm #1814

ruanjm · 2025-01-15T04:24:50Z

The compile option --offload-compress is added because the code object is too large. I'm not sure whether this flag affects performance.
The Y scale base for fp8 is set to 240, which differs from the int8 base (127). This base value, kMaxY, is hard-coded in the host code because static_cast on the host side cannot convert fp8 to float correctly.
The output of check_err() for fp8 is improved.
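
To illustrate how the scale base enters the computation, here is a minimal host-side sketch. It assumes the usual dynamic-quant convention that the per-row scale is max(|y|) / kMaxY, so that the largest magnitude in each row maps onto the base. The function names and the sample row are hypothetical and are not taken from the CK_TILE sources; only the two base values (127 and 240) come from the description above.

```python
# Hypothetical sketch of per-row dynamic-quant scaling; not the CK_TILE code.
# Only the two base values below come from the PR description.
K_MAX_Y = {"int8": 127.0, "fp8": 240.0}


def row_scale(row, dtype):
    """Return the per-row Y scale, assumed here to be max(|y|) / kMaxY."""
    amax = max(abs(v) for v in row)
    return amax / K_MAX_Y[dtype]


def scale_row(row, dtype):
    """Divide a row by its scale so the largest magnitude lands on kMaxY.

    The result is still float; the final cast to int8/fp8 is left out.
    """
    s = row_scale(row, dtype)
    if s == 0.0:
        # An all-zero row has no meaningful scale; keep it as zeros.
        return [0.0 for _ in row], s
    return [v / s for v in row], s


row = [0.5, -2.0, 1.25]  # made-up sample values
scaled_fp8, s_fp8 = scale_row(row, "fp8")
scaled_int8, s_int8 = scale_row(row, "int8")
```

With the base hard-coded as a float on the host, the reference computation never has to convert an fp8 value to float, which is the conversion the description reports as unreliable.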

ruanjm requested review from junliume, illsilin, carlushuang, qianfengz, aosewski, poyenc, geyyer, bartekxk, andriy-ca and afagaj as code owners January 15, 2025 04:24

Implement fp8 quant for layernorm and rmsnorm

d7d27a6

ruanjm force-pushed the amd/dev/jruan/norm_fp8 branch from 054a2ee to d7d27a6 Compare January 15, 2025 04:32

ruanjm changed the title ~~Implement fp8 quant tests/examples for layernorm and rmsnorm~~ [CK_TILE] Implement fp8 quant tests/examples for layernorm and rmsnorm Jan 16, 2025
