Question about dequantization #643
Unanswered
HaoWeiWang asked this question in Q&A
Replies: 1 comment
It's to center the dequantized range around zero before scaling. You mask two 4-bit values out of an 8-bit field, which gives you w_a << 4 and w_b. The former is implicitly divided by 16 (1024/64 = 16) instead of shifting the bitfield right by four places. The 1024 comes from how FP16 values are normalized: FP16 1024.0 is 0x6400, and at that exponent one mantissa step equals exactly 1, so OR-ing a 4-bit value into the mantissa yields 1024 + w_b directly (or 1024 + 16·w_a = 16·(64 + w_a) for the high nibble, which is where the 64 comes from).
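As a concrete illustration, here is a minimal sketch of that bit trick in CUDA. This is not the repository's actual kernel: the function name `dq_pair`, the demo harness, and the omission of the zero-point and scale are all assumptions made for brevity; only the OR-into-0x6400 technique itself is what the reply above describes.

```cuda
#include <cstdio>
#include <cstdint>
#include <cuda_fp16.h>

// Minimal sketch (not the repo's actual kernel). fp16(1024.0) is 0x6400:
// at exponent 2^10 one mantissa step equals 1, so OR-ing a small integer
// n into the mantissa yields fp16(1024 + n) with no int-to-float convert.
// Requires a GPU with native fp16 arithmetic (sm_53+).
__device__ __forceinline__ void dq_pair(uint8_t q, half &a, half &b)
{
    // Low nibble w_b: 0x6400 | w_b == fp16(1024 + w_b)
    half hb = __ushort_as_half((uint16_t)(0x6400 | (q & 0x0f)));
    // High nibble stays shifted left by 4 inside the mantissa:
    // 0x6400 | (w_a << 4) == fp16(1024 + 16*w_a) == fp16(16 * (64 + w_a))
    half ha = __ushort_as_half((uint16_t)(0x6400 | (q & 0xf0)));

    // Undo the bias: subtract 1024 for w_b; for w_a, multiply by 1/16
    // (instead of shifting the bitfield) and subtract 64. A real kernel
    // would also fold the quantization zero-point and scale in here.
    b = __hsub(hb, __float2half(1024.0f));
    a = __hsub(__hmul(ha, __float2half(0.0625f)), __float2half(64.0f));
}

__global__ void demo(uint8_t q, float *out)
{
    half a, b;
    dq_pair(q, a, b);
    out[0] = __half2float(a);  // high nibble value
    out[1] = __half2float(b);  // low nibble value
}

int main()
{
    float *out;
    cudaMallocManaged(&out, 2 * sizeof(float));
    demo<<<1, 1>>>(0x5C, out);  // packs w_a = 5, w_b = 12
    cudaDeviceSynchronize();
    printf("w_a = %g, w_b = %g\n", out[0], out[1]);  // expect 5 and 12
    cudaFree(out);
    return 0;
}
```

All the intermediate values (1036, 1104, 69) are exactly representable in fp16, so the bias removal is lossless. Centering around zero then comes from subtracting the quantization zero-point, which a real kernel would typically fuse into the 1024/64 constants so the whole dequant stays a couple of fp16 ops per weight.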
Original question:
OS: Linux
GPU Library: CUDA 12.x
Python version: 3.12
Pytorch version: 2.1.1
Model: No response
Describe the bug: I would like to ask why, when dequantizing, we need to subtract 1024 or 64 and then add it back.
Reproduction steps: Empty
Expected behavior: Empty
Logs: No response
Additional context: No response