Add the ability to quantize to FP8. This will clearly require additional issues to be opened: flags for the C++/Python API, test cases, updates to our migraphx-driver, new kernels, an FP8 library, etc.
This first issue is successful when it yields an itemized list of the issues created to support FP8.
Figure out how to implement FP8 dtype inside MIGraphX.
Need more thoughts on this one.
Add backend library support: rocBLAS, CK, MIOpen, MLIR, and HIP kernels (may need to define hip_fp8).
MLIR
CK
rocBLAS
MIOpen
JIT Kernels
Write unit tests for each of those backends.
Update parsing for FP8 models. This needs to take into account both QAT models with QDQ pairs surrounding only convs/gemms and entire FP8 models.
The driver needs to support --fp8.
Handle models that have already been quantized (FP32 -> FP16 -> FP8).
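To make the dtype and QDQ items above more concrete, here is a rough reference sketch in Python. It is not MIGraphX code. It assumes the E4M3 FNUZ layout (1 sign bit, 4 exponent bits, 3 mantissa bits, exponent bias 8, a single NaN at 0x80, no infinities and no negative zero). The names `decode_e4m3fnuz`, `encode_e4m3fnuz` and `quantize_dequantize` are made up for illustration. The encoder is a brute-force nearest-value search that saturates out-of-range inputs, and it does not guarantee round-to-nearest-even on exact ties, so it is only useful as a sanity check for unit tests, not as a kernel.

```python
import math

E4M3FNUZ_BIAS = 8    # exponent bias (assumption: E4M3 FNUZ layout)
E4M3FNUZ_NAN = 0x80  # the only NaN pattern; FNUZ has no -0 and no infinities


def decode_e4m3fnuz(byte: int) -> float:
    """Decode one 8-bit E4M3 FNUZ pattern into a Python float."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in [0, 255]")
    if byte == E4M3FNUZ_NAN:
        return math.nan
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0xF
    mantissa = byte & 0x7
    if exponent == 0:
        # Subnormal: no implicit leading one, fixed exponent of 1 - bias.
        return sign * (mantissa / 8.0) * 2.0 ** (1 - E4M3FNUZ_BIAS)
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - E4M3FNUZ_BIAS)


# Every finite representable value paired with its bit pattern.
_TABLE = [(decode_e4m3fnuz(b), b) for b in range(256) if b != E4M3FNUZ_NAN]
FP8_MAX = max(value for value, _ in _TABLE)


def encode_e4m3fnuz(x: float) -> int:
    """Brute-force reference encoder: nearest representable value, saturating."""
    if math.isnan(x):
        return E4M3FNUZ_NAN
    x = max(-FP8_MAX, min(FP8_MAX, x))  # clamp, since there is no infinity
    return min(_TABLE, key=lambda entry: abs(entry[0] - x))[1]


def quantize_dequantize(values, scale):
    """Emulate a Q/DQ pair: divide by scale, round to FP8, multiply back."""
    return [decode_e4m3fnuz(encode_e4m3fnuz(v / scale)) * scale for v in values]
```

A per-tensor scale for the QDQ case could be picked as `max(abs(v) for v in values) / FP8_MAX`, so that the largest magnitude maps onto the largest finite FP8 value. That choice is also just an assumption here, not something decided in this issue.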
Closing this issue, as all the tasks are already completed for the FNUZ type. The same kind of tasks is required for OCP FP8. @CharlieL7 you can open an issue for OCP FP8 following this template.