Support [ollama](https://github.com/ollama/ollama) or self-hosted LLMs with more generic interfaces. #37
Some information:
```python
import ollama
from openai import OpenAI

### Curl queries
# curl http://localhost:11434/api/generate -d '{"model": "llama3.1", "prompt": "Why is the sky blue?", "stream": true}'
# curl http://localhost:11434/api/chat -d '{ "model": "llama3", "messages": [{ "role": "user", "content": "why is the sky blue?" }]}'

### Pull models via API
# ollama.pull('gemma:latest')
# ollama.pull('llama3.1:latest')

### OpenAI API compatibility
client = OpenAI(
    api_key='dummy',  # Ollama ignores the key, but the client requires one
    base_url='http://localhost:11434/v1',
)
completion = client.chat.completions.create(
    model='llama3.1',
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(completion.choices[0].message.content)
print('--------------------------------------------------------------')

### Single-shot API
response = ollama.chat(
    model='gemma2:latest',
    messages=[
        {
            'role': 'user',
            'content': 'Why is the sky blue?',
        },
    ],
)
print(response['message']['content'])
print('--------------------------------------------------------------')

### Stream API example
stream = ollama.chat(
    model='gemma:latest',
    messages=[
        {
            'role': 'user',
            'content': 'Why is the sky blue?',
        },
    ],
    stream=True,
    options={
        "temperature": 0,
        "num_predict": 512,
    },
)
for chunk in stream:
    print(chunk["message"]["content"], end="", flush=True)
print('--------------------------------------------------------------')

### Generate API
response = ollama.generate(model="gemma:latest", prompt="Who are you?")
print(response)
print('--------------------------------------------------------------')
```

Ollama also provides an API compatible with the OpenAI Python API, which means we can use the OpenAI Python client and only switch the base URL between OpenAI and Ollama. One concern is future API incompatibility, since Ollama has to catch up with the OpenAI Python API and there will be some delay. So we could expose the native Ollama Python API behind another option flag, as sketched below.
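A minimal sketch of that option flag, assuming a local Ollama server; the `chat` helper and `use_ollama_api` parameter are hypothetical, not part of this project:

```python
import ollama
from openai import OpenAI

def chat(prompt: str, use_ollama_api: bool = False) -> str:
    """Hypothetical flag: fall back to the native ollama client when the
    OpenAI-compatible endpoint lags behind; otherwise use the OpenAI client."""
    messages = [{'role': 'user', 'content': prompt}]
    if use_ollama_api:
        response = ollama.chat(model='llama3.1', messages=messages)
        return response['message']['content']
    client = OpenAI(api_key='dummy', base_url='http://localhost:11434/v1')
    completion = client.chat.completions.create(model='llama3.1', messages=messages)
    return completion.choices[0].message.content

print(chat('Why is the sky blue?', use_ollama_api=True))
```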
So there has been some back and forth about this issue. The conclusion is to support only the OpenAI Python API (which is compatible with Ollama, see https://ollama.com/blog/openai-compatibility). That is to say, we can keep this tool on the OpenAI Python API and replace the service backend as we like.
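Since the OpenAI Python SDK reads `OPENAI_API_KEY` and `OPENAI_BASE_URL` from the environment, the backend can be swapped without touching the code at all; a sketch, assuming a local Ollama server:

```python
import os
from openai import OpenAI

# Point the unchanged OpenAI client at a local Ollama server.
os.environ['OPENAI_BASE_URL'] = 'http://localhost:11434/v1'
os.environ['OPENAI_API_KEY'] = 'dummy'  # ignored by Ollama, required by the SDK

client = OpenAI()
completion = client.chat.completions.create(
    model='llama3.1',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(completion.choices[0].message.content)
```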
Spun off from #21 as a more concrete example, namely ollama.
#36 was merged as part of this task, but it only bypasses the dedicated configuration and settings so that the tool does not fail with LLMs other than OpenAI. Several things still need to be done to provide more generic configuration and interfaces; one possible shape is sketched below.
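One shape such a generic configuration could take; this is entirely hypothetical, and the field names are illustrative rather than the project's actual settings:

```python
from dataclasses import dataclass

@dataclass
class LLMConfig:
    """Hypothetical generic LLM settings: one set of fields covering
    OpenAI, Ollama, or any other OpenAI-compatible backend."""
    base_url: str = 'https://api.openai.com/v1'
    api_key: str = 'dummy'     # real key for OpenAI, placeholder for Ollama
    model: str = 'llama3.1'    # any model name the backend serves
    temperature: float = 0.0
```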