-
Tabby does accept a local directory as a parameter for `--model`. The naming convention, such as `q8_0.v2.gguf`, is mostly a legacy issue. In fact, you can place a Q5_K_M quantized file into the `ggml` directory under that legacy name and it will load fine.
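For example, something like this should work (the source file name below is just a placeholder for whatever gguf you downloaded; the target paths match the layout described later in this thread):

```sh
# Tabby keys off the legacy file name, not the quantization inside it,
# so a Q5_K_M gguf can simply be copied in under the expected name.
mkdir -p /absolute/path/to/coder/ggml
cp deepseek-coder-6.7b.Q5_K_M.gguf /absolute/path/to/coder/ggml/q8_0.v2.gguf
tabby serve --model /absolute/path/to/coder
```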
-
I found this discussion and would like to also ask whether there is a way to limit the context size, so that one could run a 6.7B q5 model at a 6000-token context. That would let me run a slightly more capable model on my 8GB GPU. Otherwise, support for partially offloading layers to the GPU would of course also be useful.
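To make the numbers concrete, here is my back-of-the-envelope estimate (assuming a deepseek-coder-6.7B-like shape of 32 layers and a 4096 hidden size with no grouped-query attention, plus an fp16 KV cache; the figures are rough):

```
Q5_K_M weights for a 6.7B model  ≈ 4.8 GB
KV cache per token               ≈ 2 (K+V) × 32 layers × 4096 dims × 2 bytes ≈ 0.5 MB
KV cache for 6000 tokens         ≈ 6000 × 0.5 MB ≈ 3.0 GB
Total                            ≈ 7.8 GB
```

That just about fits in 8 GB, but only if the context is capped; at a larger default context the KV cache alone blows the budget.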
-
My server can't connect to the internet, so I download the gguf first and then transfer it to my server.
My folder looks like this: [screenshot of the model folder]
In `tabby.json` I just copied the content for deepseek-coder-6.7B. My command to launch tabby is `tabby serve --model /absolute/path/to/coder ...`. I found it interesting that tabby always tries to find the `q8_0.v2.gguf` file under the `/absolute/path/to/coder/ggml` folder.
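A sketch of my folder, reconstructed from the paths mentioned above (treat the exact file set as illustrative):

```
/absolute/path/to/coder/
├── tabby.json        # content copied from the deepseek-coder-6.7B registry entry
└── ggml/
    └── q8_0.v2.gguf  # actually a Q5_K_M gguf, renamed to the expected name
```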
Questions:

1. Is the only useful thing in `tabby.json` the `prompt_template`/`chat_template` (if I already have the gguf file)?
2. Should I add a `models.json` (if I just want to use the local gguf)?

By the way, I can launch tabby when I rename the gguf file to `q8_0.v2.gguf`. I am just worried about this, since my gguf is `Q5_K_M` instead of `q8`.
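In case it helps others, the only part of my `tabby.json` that seems to matter for a local gguf is the template section. A minimal sketch (the `prompt_template` value is the DeepSeek Coder FIM template from the registry entry I copied; I have left out `chat_template`, and any fields beyond these are assumptions on my part):

```json
{
  "prompt_template": "<｜fim▁begin｜>{prefix}<｜fim▁hole｜>{suffix}<｜fim▁end｜>"
}
```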