
Add Support for Loading Models in 4-bit Quantized Version (Fixes #1798) #3476

Open
02shanks wants to merge 4 commits into main

Conversation

02shanks

Why are these changes needed?

This pull request adds support for loading models in 4-bit quantized form. Quantizing weights to 4 bits substantially reduces the memory needed to load and store a model, which is particularly useful in resource-constrained environments.
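For context, a minimal sketch of what 4-bit loading typically looks like with the transformers library and bitsandbytes, via `BitsAndBytesConfig`. The model id is a placeholder, and how this PR actually exposes the option through the project's own model-loading API is an assumption, since the diff is not shown in this excerpt:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Standard transformers entry point for bitsandbytes 4-bit quantization;
# the exact wiring inside this PR may differ.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # store weights in 4 bits
    bnb_4bit_quant_type="nf4",             # NormalFloat4 quantization
    bnb_4bit_use_double_quant=True,        # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16, # dtype used for matmuls at runtime
)

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder model id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=quant_config,
    device_map="auto",  # let accelerate place the quantized layers on available devices
)
```

With NF4 plus double quantization, a 7B-parameter model that needs roughly 14 GB in fp16 fits in about 4 GB of weight memory, which is the efficiency gain this feature targets.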

Related issue number (if applicable)

Closes #1798

Checks

  • I've run format.sh to lint the changes in this PR.
  • I've included any doc changes needed.
  • I've made sure the relevant tests are passing (if applicable).

Development

Successfully merging this pull request may close these issues.

Support for 4-bit quantization from the transformers library.