Enable torchao.experimental EmbeddingQuantization #1520
Comments
@Jack-Khuu I'd love to have a crack at this if possible! Would you mind assigning it to me?
Totally, give it a shot
Hi @dillondesilva, how's the task going? Any questions?
Hey @Jack-Khuu - I've just been busy with mid-semester exams over the past week. I should have time to start sometime this week and will send questions soon 👍 Thanks for checking in!
@Jack-Khuu Good news! Here's the PR: #1525. I don't know if I've oversimplified it, so feel free to correct me if I'm wrong, but to enable the above experimental quantizers, I'm assuming all that was needed was:
Sweet! I'll take a look
🚀 The feature, motivation and pitch
Quantization is a technique used to reduce the size and memory requirements of a model (and often to speed up inference); torchao is PyTorch's native quantization library for inference and training.
There are new experimental quantizers in torchao that we would like to enable in torchchat. Specifically, this task is for enabling EmbeddingQuantizer and SharedEmbeddingQuantizer.
Entrypoint: torchchat/torchchat/utils/quantize.py, line 101 (at commit 1384f7d)
Task: Using ExecuTorch as a reference (pytorch/executorch#9548), add support for EmbeddingQuantizer and SharedEmbeddingQuantizer.
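For orientation, here is a minimal sketch of how the two quantizers might be applied to a model once they are wired in. The import path `torchao.experimental.quant_api`, the constructor arguments (`weight_dtype`, `granularity`), and the `quantize(model)` method are assumptions based on other torchao quantizers and the ExecuTorch reference PR; the helpers `quantize_embeddings` and `quantize_shared_embeddings` are hypothetical names used for illustration only.

```python
# Sketch only, not the final torchchat wiring. Import paths, constructor
# arguments, and the .quantize() method are assumptions; verify them against
# the torchao version actually in use.
import torch

from torchao.experimental.quant_api import (  # assumed import location
    EmbeddingQuantizer,
    SharedEmbeddingQuantizer,
)
from torchao.quantization.granularity import PerGroup


def quantize_embeddings(model: torch.nn.Module, bitwidth: int = 4, groupsize: int = 32):
    """Quantize the model's embedding tables in place (hypothetical helper)."""
    quantizer = EmbeddingQuantizer(
        # Sub-byte dtypes such as torch.int4 require a recent PyTorch build.
        weight_dtype=getattr(torch, f"int{bitwidth}"),
        granularity=PerGroup(groupsize),
    )
    return quantizer.quantize(model)


def quantize_shared_embeddings(model: torch.nn.Module, bitwidth: int = 4, groupsize: int = 32):
    """Variant for models whose embedding and output (unembedding) weights are
    tied, so both ends share one quantized table (hypothetical helper)."""
    quantizer = SharedEmbeddingQuantizer(
        weight_dtype=getattr(torch, f"int{bitwidth}"),
        granularity=PerGroup(groupsize),
    )
    return quantizer.quantize(model)
```

In torchchat, these calls would presumably be registered alongside the existing handlers in quantize.py (the quantizer dictionary around the linked line) so they can be selected from the --quantize JSON config; the exact config keys are left to the implementing PR.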
cc: @metascroy, @manuelcandales
Alternatives
No response
Additional context
No response
RFC (Optional)
No response