Problem
Exllamav2 is currently unable to quantize the MoE version of Granite:
https://huggingface.co/ibm-granite/granite-3.1-3b-a800m-instruct
Regular Granite quantizes fine, however:
https://huggingface.co/ibm-granite/granite-3.1-8b-instruct
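For reference, a minimal sketch of the kind of command that fails on the MoE checkpoint, assuming the standard convert.py invocation (the local paths and the 4.0 bpw target are placeholders):

```sh
# Quantize the downloaded MoE checkpoint to EXL2; works for the dense 8b model,
# fails for granite-3.1-3b-a800m-instruct (GraniteMoeForCausalLM)
python convert.py -i ./granite-3.1-3b-a800m-instruct -o ./work -cf ./granite-3.1-3b-a800m-instruct-exl2 -b 4.0
```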
Solution
Exllamav2 should support quantizing GraniteMoeForCausalLM models.
Alternatives
No response
Explanation
Nice to have, since regular Granite is already supported.
Examples
No response
Additional context
No response