1.78 - cannot load mixtral 8x7b anymore #1219
Comments
I'm also seeing this error.
Yes, unfortunately this is because of the backend refactor in ggerganov#10026; see ggerganov#10244. You can requantize the mixtral model, or use https://huggingface.co/mradermacher/Mixtral-8x7B-Instruct-v0.1-GGUF/ instead. I will see if I can port back support for the old quants, but I cannot guarantee it.
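For anyone taking the requantize route, here is a minimal sketch (not from this thread) of driving llama.cpp's quantize tool from Python. It assumes the `llama-quantize` binary (or the older `quantize` name) is built and on your PATH; the file names and target type are placeholders, and quantizing from a high-precision (f16/bf16) source GGUF generally gives better results than requantizing an already-quantized file.

```python
import shutil
import subprocess
import sys

# Placeholder paths -- substitute your own files.
SOURCE_GGUF = "mixtral-8x7b-instruct-f16.gguf"   # high-precision source (preferred)
OUTPUT_GGUF = "mixtral-8x7b-instruct-q8_0.gguf"  # requantized output in the current format
TARGET_TYPE = "Q8_0"                             # target quantization type

def requantize() -> None:
    """Invoke llama.cpp's quantize tool to produce a GGUF the new backend can load."""
    tool = shutil.which("llama-quantize") or shutil.which("quantize")
    if tool is None:
        sys.exit("llama-quantize not found; build llama.cpp first")

    # --allow-requantize is only needed if the source is itself already quantized.
    cmd = [tool, "--allow-requantize", SOURCE_GGUF, OUTPUT_GGUF, TARGET_TYPE]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    requantize()
```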
Thanks for the info; I was unaware of this. It seems that updated models are indeed available on HF. If these work, they will be the simplest solution. I'll report back once I have downloaded some and confirmed they work.
The new quantizations are working for me with 1.78. Thank you!
I have crafted an ugly hack because I hate losing backwards compatibility. It should work again in the next version.
Can confirm the new quantized models work for me too with 1.78. I kept the old ones for now, to test whether the backwards-compatibility "ugly hack" works in the next version.
Hi,
After upgrading to 1.78 today, I can't load mixtral-based 8x7b models anymore.
Other models, such as 30b/70b llama-type models, still work.
I get the same error whether I use Vulkan or CLBlast, and with different models that have different quantizations (one q8_0, the other q6_m).
The error reads:
Previous versions of KoboldCPP worked with those same models without a problem.
After reverting, I can confirm 1.77 works.
Both are "cu12" versions (I still use CUDA for smaller models).
System has 64 GB RAM, 16 GB VRAM (3080 Ti laptop), Windows 11.
Thanks in advance,