Question about dequantization #643
Unanswered
HaoWeiWang asked this question in Q&A
Replies: 1 comment
It's to center the dequantized range around zero before scaling. You mask two 4-bit values out of an 8-bit field, which gives you w_a << 4 and w_b. The former is implicitly divided by 16 (1024/64 = 16) instead of shifting the bitfield right by four places. The 1024 comes from how FP16 values are normalized: FP16 1024.0 is 0x6400, and at that exponent one mantissa step equals exactly 1, so OR-ing a 4-bit value into the mantissa yields 1024 + w_b directly (or 1024 + 16·w_a = 16·(64 + w_a) for the high nibble, which is where the 64 comes from).
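As a concrete illustration, here is a minimal sketch of that bit trick in CUDA. This is not the repository's actual kernel: the function name `dq_pair`, the demo harness, and the omission of the zero-point and scale are all assumptions made for brevity; only the OR-into-0x6400 technique itself is what the reply above describes.

```cuda
#include <cstdio>
#include <cstdint>
#include <cuda_fp16.h>

// Minimal sketch (not the repo's actual kernel). fp16(1024.0) is 0x6400:
// at exponent 2^10 one mantissa step equals 1, so OR-ing a small integer
// n into the mantissa yields fp16(1024 + n) with no int-to-float convert.
// Requires a GPU with native fp16 arithmetic (sm_53+).
__device__ __forceinline__ void dq_pair(uint8_t q, half &a, half &b)
{
    // Low nibble w_b: 0x6400 | w_b == fp16(1024 + w_b)
    half hb = __ushort_as_half((uint16_t)(0x6400 | (q & 0x0f)));
    // High nibble stays shifted left by 4 inside the mantissa:
    // 0x6400 | (w_a << 4) == fp16(1024 + 16*w_a) == fp16(16 * (64 + w_a))
    half ha = __ushort_as_half((uint16_t)(0x6400 | (q & 0xf0)));

    // Undo the bias: subtract 1024 for w_b; for w_a, multiply by 1/16
    // (instead of shifting the bitfield) and subtract 64. A real kernel
    // would also fold the quantization zero-point and scale in here.
    b = __hsub(hb, __float2half(1024.0f));
    a = __hsub(__hmul(ha, __float2half(0.0625f)), __float2half(64.0f));
}

__global__ void demo(uint8_t q, float *out)
{
    half a, b;
    dq_pair(q, a, b);
    out[0] = __half2float(a);  // high nibble value
    out[1] = __half2float(b);  // low nibble value
}

int main()
{
    float *out;
    cudaMallocManaged(&out, 2 * sizeof(float));
    demo<<<1, 1>>>(0x5C, out);  // packs w_a = 5, w_b = 12
    cudaDeviceSynchronize();
    printf("w_a = %g, w_b = %g\n", out[0], out[1]);  // expect 5 and 12
    cudaFree(out);
    return 0;
}
```

All the intermediate values (1036, 1104, 69) are exactly representable in fp16, so the bias removal is lossless. Centering around zero then comes from subtracting the quantization zero-point, which a real kernel would typically fuse into the 1024/64 constants so the whole dequant stays a couple of fp16 ops per weight.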
Original question:
OS: Linux
GPU Library: CUDA 12.x
Python version: 3.12
Pytorch version: 2.1.1
Model: No response
Describe the bug: I would like to ask why, when dequantizing, we need to subtract 1024 or 64 and then add it back.
Reproduction steps: Empty
Expected behavior: Empty
Logs: No response
Additional context: No response