llama.cpp unresponsive for 20 seconds #62
Comments
I spent all day with this error X_X The main issue is that the newest version of llama.cpp is INCOMPATIBLE with gpt-llama.cpp, so you need to go into the llama.cpp releases and download one from June or July. I'm not sure if August works; I'm using the one from July 14th. The next issue is that the older version does not support GGUF files, so you will need to use a .bin file instead for your chat model. You can get those here: https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/tree/main. Another issue I now get: the AI sometimes chats with ITSELF, which is very odd. I don't know how to solve that one yet, but if anyone figures it out, please let me know :) In the meantime, I am going to try another downgrade to see if it helps.
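For anyone who would rather fetch one of those .bin files programmatically instead of clicking through the repo page, here is a minimal sketch using the huggingface_hub package. This is not from the original comment: the package choice and the exact quantization filename are assumptions, so check the file listing at the linked repo for the name you actually want.

```python
# Minimal sketch: download a GGML .bin chat model from the linked repo.
# Assumes `pip install huggingface_hub` and that the quantization filename
# below still exists -- verify it against
# https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/tree/main
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="TheBloke/Llama-2-7B-Chat-GGML",
    filename="llama-2-7b-chat.ggmlv3.q4_K_M.bin",  # example quantization; pick any .bin listed
)
print(model_path)  # point gpt-llama.cpp at this .bin instead of a .gguf file
```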
Thank you! I will try downgrading my version of llama.cpp tomorrow and see how that goes for me.
Hello, I have the same issue, but I'm using dolphin-2.5-mixtral-8x7b.Q5_K_M.gguf.
I'm trying to use this to run Auto-GPT. As a test, before hooking it up to Auto-GPT, I tried it with Chatbot-UI. However, gpt-llama.cpp keeps locking up with
LLAMA.CPP UNRESPONSIVE FOR 20 SECS. ATTEMPTING TO RESUME GENERATION
whenever the LLM finishes its response. I'm using gpt4-x-alpaca-13B-GGML, which I converted to GGUF with the tools in llama.cpp. Using llama.cpp alone, the model works fine (albeit not the smartest). What can I do to solve this issue?
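One way to narrow this down (not from the original post) is to hit gpt-llama.cpp's OpenAI-style endpoint directly, bypassing Chatbot-UI, and see whether the stall happens at the end of the response even without a frontend. The sketch below makes a few assumptions: the port, the hypothetical model path, and passing the model path as the bearer token follow gpt-llama.cpp's usual setup, so adjust them to match however you launched the server.

```python
# Minimal sketch: send one chat completion request straight to gpt-llama.cpp
# and time it, to isolate whether the 20-second "unresponsive" stall comes
# from gpt-llama.cpp itself rather than Chatbot-UI.
# Assumptions: local port, model path, and the bearer-token convention below.
import time
import requests

BASE_URL = "http://localhost:8000"  # assumed port; use whatever you started gpt-llama.cpp on
MODEL_PATH = "path/to/gpt4-x-alpaca-13b.gguf"  # hypothetical path to the converted model

start = time.time()
resp = requests.post(
    f"{BASE_URL}/v1/chat/completions",
    headers={"Authorization": f"Bearer {MODEL_PATH}"},
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    },
    timeout=120,  # generous timeout so a hang after generation finishes is easy to spot
)
print(f"status={resp.status_code}, elapsed={time.time() - start:.1f}s")
print(resp.json())
```

If this request also hangs after the text is generated, the problem is between gpt-llama.cpp and llama.cpp (version mismatch, as discussed above) rather than anything Chatbot-UI is doing.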