Support [ollama](https://github.com/ollama/ollama) or self-hosted LLMs with more generic interfaces. #37

Closed · 2 tasks done
fujitatomoya opened this issue Aug 8, 2024 · 2 comments · Fixed by #38 or #40

Comments

@fujitatomoya (Owner) commented Aug 8, 2024

Spun off from #21 as a more concrete example, namely ollama.

#36 was merged as part of this task, but it only bypasses the OpenAI-specific configuration and settings so that other LLMs do not fail. Several things still need to be done to provide more generic configuration and interfaces:

  • Documentation: ros2ai currently describes OpenAI usage only, as do the environment settings. It would be nice to support multiple LLMs, especially self-hosted LLMs such as ollama, and document that.
  • Refactor the implementation: environment variables, classes, and interfaces need to be redesigned to support other LLMs alongside OpenAI (see the sketch after this list).
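
As a hedged illustration of that refactor, the sketch below derives the client configuration from environment variables; the variable names are assumptions for this example, not necessarily what ros2ai will use.

# Illustrative only: environment-variable-driven client construction.
# The variable names here are assumptions, not ros2ai's actual ones.
import os
from openai import OpenAI

def build_client() -> OpenAI:
    # Self-hosted backends such as ollama accept a dummy key; the base URL
    # decides which service actually answers the requests.
    api_key = os.environ.get('OPENAI_API_KEY', 'dummy')
    base_url = os.environ.get('OPENAI_BASE_URL', 'http://localhost:11434/v1')
    return OpenAI(api_key=api_key, base_url=base_url)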
@fujitatomoya (Owner, Author) commented

Some information:

import ollama
from openai import OpenAI

### Curl query
# curl http://localhost:11434/api/generate -d '{"model": "llama3.1", "prompt": "Why is the sky blue?", "stream": true}'
# curl http://localhost:11434/api/chat -d '{ "model": "llama3", "messages": [{ "role": "user", "content": "why is the sky blue?" }]}'

### pull models via API
#ollama.pull('gemma:latest')
#ollama.pull('llama3.1:latest')

### OpenAI API compatibility
client = OpenAI(
    api_key = 'dummy',
    base_url = 'http://localhost:11434/v1'
)

completion = client.chat.completions.create(
    model = 'llama3.1',
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)
print(completion.choices[0].message.content)

print('--------------------------------------------------------------')

### Single Shot API (non-streaming)
response = ollama.chat(
    model='gemma2:latest',
    messages=[
        {
            'role': 'user',
            'content': 'Why is the sky blue?',
        },
    ]
)
print(response['message']['content'])

print('--------------------------------------------------------------')

### Stream API Example
stream = ollama.chat(
    model='gemma:latest',
    messages=[
        {
            'role': 'user',
            'content': 'Why is the sky blue?',
        },
    ],
    stream=True,
    options={
        "temperature": 0,
        "num_predict": 512,
    },
)
for chunk in stream:
    print(chunk["message"]["content"], end="", flush=True)

print('--------------------------------------------------------------')

### Generate API
response = ollama.generate(model="gemma:latest", prompt="Who are you?")
print(response)

print('--------------------------------------------------------------')

It also provides an OpenAI-compatible API, which means we can use the OpenAI Python API and just switch the URL between OpenAI and ollama. One concern is future API incompatibility, since ollama needs to catch up with the OpenAI Python API and there may be some delay. So we could also expose the native ollama Python API behind another option flag `api`, so that users can reach ollama models via that path. (Pointing the OpenAI Python API at ollama is just the use case of a local LLM service instead of OpenAI.) A hedged sketch of such a flag follows.
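
The sketch below switches between the native ollama client and the OpenAI-compatible endpoint; the function name and the `api` parameter handling are assumptions for this example, not an actual ros2ai interface.

# Hedged sketch only: a hypothetical `api` option selecting the client path.
import ollama
from openai import OpenAI

def ask(prompt, model='llama3.1', api='openai'):
    messages = [{'role': 'user', 'content': prompt}]
    if api == 'ollama':
        # Native ollama Python API, talking to the local server directly.
        response = ollama.chat(model=model, messages=messages)
        return response['message']['content']
    # OpenAI Python API; base_url decides whether OpenAI or ollama answers.
    client = OpenAI(api_key='dummy', base_url='http://localhost:11434/v1')
    completion = client.chat.completions.create(model=model, messages=messages)
    return completion.choices[0].message.content

print(ask('Why is the sky blue?', api='ollama'))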

@fujitatomoya (Owner, Author) commented

There has been some back and forth on this issue.

The conclusion is to support only the OpenAI Python API (which ollama is compatible with, see https://ollama.com/blog/openai-compatibility). That is, we can keep this tool on the OpenAI Python API and swap the service backend as we like. A minimal sketch of that usage, including streaming, follows.
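
This is a minimal sketch under the same local-server assumptions as above; only the base URL (and, for OpenAI itself, a real API key) changes per backend.

# Minimal sketch of the agreed direction: keep the tool on the OpenAI
# Python API and point it at whichever backend is configured.
from openai import OpenAI

# For ollama's OpenAI-compatible endpoint the key is ignored; for OpenAI
# itself, use a real key and the default base URL instead.
client = OpenAI(api_key='dummy', base_url='http://localhost:11434/v1')

stream = client.chat.completions.create(
    model='llama3.1',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
    stream=True,
)
for chunk in stream:
    # Each streamed chunk carries a delta with the next piece of text.
    print(chunk.choices[0].delta.content or '', end='', flush=True)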
