Support for Llama-2-7B-32K-Instruct? #2720
quarterturn started this conversation in General
-
Doesn't look like it needs anything special. You might need to set RoPE scaling.
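A minimal sketch of that suggestion with llama-cpp-python (the model path is hypothetical; `n_ctx` and `rope_freq_scale` are real `Llama` constructor parameters). Llama-2's native context is 4096 tokens, so stretching it to 32768 implies a linear scaling factor of 8, i.e. `rope_freq_scale = 1/8`:

```python
# Linear RoPE scaling: native context / target context.
native_ctx = 4096
target_ctx = 32768
rope_freq_scale = native_ctx / target_ctx  # 0.125

# Requires llama-cpp-python and a local model file (path is hypothetical):
# from llama_cpp import Llama
# llm = Llama(
#     model_path="llama-2-7b-32k-instruct.f16.gguf",
#     n_ctx=target_ctx,
#     rope_freq_scale=rope_freq_scale,
# )
```

The commented-out call is left inactive because it needs the converted model weights on disk; only the scaling arithmetic runs as-is.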
-
Works well with llama-cpp-python and llama.cpp; I was able to run this on a single 3090.
I was unable to get the model to work properly at anything other than f16, though. Even q8_0 resulted in broken replies.
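Some back-of-envelope VRAM arithmetic for the single-3090 claim (assumptions: Llama-2-7B has 32 layers, hidden size 4096, ~6.74B parameters, and the KV cache is stored in f16). The f16 weights alone are ~12.6 GiB, and a *full* 32K-token KV cache would add ~16 GiB, so fitting on a 24 GB card presumably means shorter effective contexts or partial offload:

```python
# Weight footprint at f16 (2 bytes per parameter).
n_params = 6_738_415_616  # Llama-2-7B parameter count
weights_f16 = n_params * 2

# KV cache: K and V vectors per layer per token, f16.
n_layers, hidden = 32, 4096
ctx = 32768
kv_per_token = 2 * n_layers * hidden * 2  # bytes: 2 tensors * 2 bytes
kv_total = kv_per_token * ctx

print(f"weights:      {weights_f16 / 2**30:.1f} GiB")  # ~12.6 GiB
print(f"KV cache @32K: {kv_total / 2**30:.1f} GiB")    # ~16.0 GiB
```

These are idealized numbers; real usage also includes activations and runtime overhead.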
-
https://huggingface.co/togethercomputer/Llama-2-7B-32K-Instruct
"Model Description
Llama-2-7B-32K-Instruct is an open-source, long-context chat model finetuned from Llama-2-7B-32K, over high-quality instruction and chat data."