[docs] Quantization tip #10249
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
I left two broad comments and your plan looks perfect to me!
@a-r-r-o-w WDYT?
I think the following classes are remaining: We could either open this up to the community or tackle it ourselves in this PR.
Thanks for the changes!
Two broad comments. Once they are addressed I think we should be good to ship this 🚀
Thanks @sayakpaul! I included the latest batch of pipelines (Allegro, Latte, LTX, etc.) in this PR as well 🙂
Follows up on discussion about adding a quantization section for big models. Instead of adding it to the individual model doc (for example, `MochiTransformer3DModel`) and the pipeline doc, I think it's more discoverable/cleaner to add it only to the pipeline doc (for example, `MochiPipeline`).

Let me know if this works for you, and then I can add it to the other big models!