Support [ollama](https://github.com/ollama/ollama) or self-hosted LLMs with more generic interfaces. #37
Some information:
```python
import ollama
from openai import OpenAI

### Curl queries
# curl http://localhost:11434/api/generate -d '{"model": "llama3.1", "prompt": "Why is the sky blue?", "stream": true}'
# curl http://localhost:11434/api/chat -d '{ "model": "llama3", "messages": [{ "role": "user", "content": "why is the sky blue?" }]}'

### Pull models via API
# ollama.pull('gemma:latest')
# ollama.pull('llama3.1:latest')

### OpenAI API compatibility
client = OpenAI(
    api_key='dummy',  # Ollama ignores the key, but the client requires one
    base_url='http://localhost:11434/v1',
)
completion = client.chat.completions.create(
    model='llama3.1',
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(completion.choices[0].message.content)
print('--------------------------------------------------------------')

### Single-shot API
response = ollama.chat(
    model='gemma2:latest',
    messages=[
        {
            'role': 'user',
            'content': 'Why is the sky blue?',
        },
    ],
)
print(response['message']['content'])
print('--------------------------------------------------------------')

### Stream API example
stream = ollama.chat(
    model='gemma:latest',
    messages=[
        {
            'role': 'user',
            'content': 'Why is the sky blue?',
        },
    ],
    stream=True,
    options={
        "temperature": 0,
        "num_predict": 512,
    },
)
for chunk in stream:
    print(chunk["message"]["content"], end="", flush=True)
print('--------------------------------------------------------------')

### Generate API
response = ollama.generate(model="gemma:latest", prompt="Who are you?")
print(response)
print('--------------------------------------------------------------')
```

Ollama also provides an API compatible with the OpenAI Python API, which means we can use the OpenAI Python client and only switch the base URL between OpenAI and Ollama. One concern is future API incompatibility, since Ollama has to catch up with the OpenAI Python API and there will be some delay. So we could expose the native Ollama Python API behind another option flag, as sketched below.
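A minimal sketch of that option flag, assuming a local Ollama server; the `chat` helper and `use_ollama_api` parameter are hypothetical, not part of this project:

```python
import ollama
from openai import OpenAI

def chat(prompt: str, use_ollama_api: bool = False) -> str:
    """Hypothetical flag: fall back to the native ollama client when the
    OpenAI-compatible endpoint lags behind; otherwise use the OpenAI client."""
    messages = [{'role': 'user', 'content': prompt}]
    if use_ollama_api:
        response = ollama.chat(model='llama3.1', messages=messages)
        return response['message']['content']
    client = OpenAI(api_key='dummy', base_url='http://localhost:11434/v1')
    completion = client.chat.completions.create(model='llama3.1', messages=messages)
    return completion.choices[0].message.content

print(chat('Why is the sky blue?', use_ollama_api=True))
```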
So there has been some back and forth about this issue. The conclusion is to support only the OpenAI Python API (which is compatible with Ollama, see https://ollama.com/blog/openai-compatibility). That is to say, we can keep this tool on the OpenAI Python API and replace the service backend as we like.
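Since the OpenAI Python SDK reads `OPENAI_API_KEY` and `OPENAI_BASE_URL` from the environment, the backend can be swapped without touching the code at all; a sketch, assuming a local Ollama server:

```python
import os
from openai import OpenAI

# Point the unchanged OpenAI client at a local Ollama server.
os.environ['OPENAI_BASE_URL'] = 'http://localhost:11434/v1'
os.environ['OPENAI_API_KEY'] = 'dummy'  # ignored by Ollama, required by the SDK

client = OpenAI()
completion = client.chat.completions.create(
    model='llama3.1',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(completion.choices[0].message.content)
```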
Spun off from #21 as a more concrete example, namely ollama.
#36 was merged as part of this task, but it only bypasses the dedicated configuration and settings so that the tool does not fail with LLMs other than OpenAI. Several things still need to be done to provide more generic configuration and interfaces; one possible shape is sketched below.
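One shape such a generic configuration could take; this is entirely hypothetical, and the field names are illustrative rather than the project's actual settings:

```python
from dataclasses import dataclass

@dataclass
class LLMConfig:
    """Hypothetical generic LLM settings: one set of fields covering
    OpenAI, Ollama, or any other OpenAI-compatible backend."""
    base_url: str = 'https://api.openai.com/v1'
    api_key: str = 'dummy'     # real key for OpenAI, placeholder for Ollama
    model: str = 'llama3.1'    # any model name the backend serves
    temperature: float = 0.0
```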