[DEV][FP8] Improve e4m3 decoding #43

LeiWang1999 · 2024-05-21T11:53:28Z

This pull request refines the type conversions used when decoding FP8 e4m3 values and loosens the tolerances in the FP8 matmul test. The changes aim to make the decoding more efficient and accurate.

Here are the key changes:

Type conversion refinement:

python/bitblas/quantization/quantization.py: In the function _tir_u8_to_f8_e4m3_to_f16, the shift operations now use uint16 instead of int16, and s_f16 and e_f16 are now computed with bitwise operations.
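
For readers unfamiliar with the format, here is a minimal pure-Python sketch of decoding one e4m3 byte into a float16 value with bitwise operations. It is not the code from this PR: decode_e4m3 is a hypothetical helper, and it assumes the "fn" flavor of e4m3 (exponent bias 7, no infinities, 0x7F and 0xFF are NaN). Normal values are handled purely with shifts and ORs; subnormals and NaN are special-cased for clarity.

```python
import math
import struct


def decode_e4m3(byte: int) -> float:
    """Decode one e4m3 byte (assumed 'fn' variant) via a float16 bit pattern.

    Hypothetical helper for illustration only; not the BitBLAS implementation.
    """
    sign = (byte >> 7) << 15      # move the sign bit to bit 15 of the f16 word
    exp = (byte >> 3) & 0xF       # 4-bit exponent, bias 7
    mant = byte & 0x7             # 3-bit mantissa

    if (byte & 0x7F) == 0x7F:     # all exponent and mantissa bits set: NaN
        return math.nan
    if exp == 0:                  # zero or subnormal: mant * 2^-9
        magnitude = mant * 2.0 ** -9
        return -magnitude if sign else magnitude

    # Normal value: rebias the exponent from 7 to 15 (add 8) and place the
    # 3 mantissa bits at the top of float16's 10-bit mantissa field.
    bits = sign | ((exp + 8) << 10) | (mant << 7)
    # Reinterpret the 16-bit pattern as an IEEE half-precision float.
    return struct.unpack("<e", struct.pack("<H", bits))[0]
```

For example, decode_e4m3(0x38) gives 1.0 and decode_e4m3(0x7E) gives 448.0, the largest finite e4m3 value under the stated assumption. Keeping every intermediate value unsigned avoids any question about how a shifted sign bit is interpreted, which is one reason an unsigned 16-bit type is a natural fit for this kind of bit manipulation.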

Precision adjustment:

testing/python/operators/test_general_matmul_fp8.py: In the function map_torch_type, the relative and absolute tolerances passed to torch.testing.assert_close have been loosened from 1e-2 to 1e-1, so the test accepts the larger rounding error of FP8 results.
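
The PR does not state why the tolerances were loosened. One plausible reason, sketched below, is that e4m3 carries only 3 mantissa bits, so rounding a single value to the nearest representable number can already introduce a relative error of several percent. The helper within_tolerance is hypothetical; it mirrors the closeness criterion documented for torch.testing.assert_close, |actual - expected| <= atol + rtol * |expected|, without depending on PyTorch.

```python
def within_tolerance(actual: float, expected: float, rtol: float, atol: float) -> bool:
    """Hypothetical stand-in for the documented assert_close criterion."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


# Around 1.0 the representable e4m3 neighbours are 1.0 and 1.125,
# so a reference value of 1.06 rounds to 1.0 (an error of about 5.7%).
expected = 1.06
actual = 1.0

old_ok = within_tolerance(actual, expected, rtol=1e-2, atol=1e-2)  # old setting
new_ok = within_tolerance(actual, expected, rtol=1e-1, atol=1e-1)  # new setting
```

Under these assumptions old_ok is False and new_ok is True: the error of 0.06 exceeds the old bound of roughly 0.02 but sits well inside the new bound of roughly 0.2.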

improve e4m3 decoding.

75d2f3d

LeiWang1999 merged commit c570a76 into microsoft:main May 21, 2024
3 checks passed
