Support for 4-bit quantization from the transformers library #1798
Comments
Honestly, it's updating to transformers 4.30, adding one other dependency package, and about eight changes in the code, if I recall correctly. Plus it works with multiple GPUs. Unfortunately I lost my changes from my running copy when I updated for the API changes, but I think most of the work is already done in my fork.
Contributions are welcome.
@merrymercy is this issue still open for contribution?
@02shanks absolutely!
@surak as this is my first code contribution, could you please guide me through the process? Where should I start?
Well, the usual: fork the repository, make your changes, and open a pull request. Nothing special, really!
@surak @merrymercy I have just created the PR. Can you please review it?
Loading Vicuna-13B with 4-bit quantization is possible through the transformers library's load_in_4bit option. How difficult would it be for FastChat to support it?
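For context, a minimal sketch of what such loading looks like in recent transformers (>= 4.30, with the bitsandbytes package installed); the model id and dtype choice below are assumptions for illustration, not part of any FastChat patch:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Sketch only: requires transformers >= 4.30 and bitsandbytes.
# The model id is an assumption for illustration.
model_id = "lmsys/vicuna-13b-v1.3"

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # store weights in 4-bit
    bnb_4bit_compute_dtype=torch.float16,  # run matmuls in fp16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # shards the model across available GPUs
)
```

Passing the quantization config through from_pretrained like this is also what makes the multi-GPU case work, since device_map="auto" handles the sharding.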