Initial async model plugin creation docs
simonw committed Nov 13, 2024
1 parent ceb60d2 commit 5f66149
Showing 2 changed files with 50 additions and 0 deletions.
48 changes: 48 additions & 0 deletions docs/plugins/advanced-model-plugins.md
@@ -5,13 +5,61 @@ The {ref}`model plugin tutorial <tutorial-model-plugin>` covers the basics of de

This document covers more advanced topics.

## Async models

Plugins can optionally provide an asynchronous version of their model, suitable for use with Python [asyncio](https://docs.python.org/3/library/asyncio.html). This is particularly useful for remote models that are accessed over an HTTP API.

The async version of a model subclasses `llm.AsyncModel` instead of `llm.Model`, and implements its `execute()` method as an `async def` asynchronous generator rather than a regular `def execute()`.

This example, a simplified subset of the default OpenAI plugin, illustrates how that method can work:

```python
from typing import AsyncGenerator

import llm


class MyAsyncModel(llm.AsyncModel):
    # This can duplicate the model_id of the sync model:
    model_id = "my-model-id"

    async def execute(
        self, prompt, stream, response, conversation=None
    ) -> AsyncGenerator[str, None]:
        # `client` is assumed to be an async API client (for example an
        # openai.AsyncOpenAI instance) and `messages` the list of chat
        # messages built from the prompt and conversation.
        if stream:
            completion = await client.chat.completions.create(
                model=self.model_id,
                messages=messages,
                stream=True,
            )
            async for chunk in completion:
                yield chunk.choices[0].delta.content
        else:
            completion = await client.chat.completions.create(
                model=self.model_id,
                messages=messages,
                stream=False,
            )
            yield completion.choices[0].message.content
```
This async model instance should then be passed to the `register()` method in the `register_models()` plugin hook:

```python
from llm import hookimpl


@hookimpl
def register_models(register):
    register(
        MyModel(), MyAsyncModel(), aliases=("my-model-aliases",)
    )
```
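
Once registered, the async model can be used from asyncio code. A minimal sketch, assuming the plugin is installed and that `llm.get_async_model()` resolves the async variant of a registered model ID or alias:

```python
import asyncio

import llm


async def main():
    # Resolve the async variant of the registered model by ID or alias
    model = llm.get_async_model("my-model-id")
    response = await model.prompt("A short test prompt")
    print(await response.text())


asyncio.run(main())
```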

(advanced-model-plugins-attachments)=
## Attachments for multi-modal models

Models such as GPT-4o, Claude 3.5 Sonnet and Google's Gemini 1.5 are multi-modal: they accept input in the form of images and maybe even audio, video and other formats.

LLM calls these **attachments**. Models can specify the types of attachments they accept and then implement special code in the `.execute()` method to handle them.

See {ref}`the Python attachments documentation <python-api-attachments>` for details on using attachments in the Python API.
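
Handling attachments inside `.execute()` is model-specific. Below is a rough sketch for an OpenAI-style chat API; the `prompt.attachments` list and the `resolve_type()` / `base64_content()` helpers used here are assumptions about the attachment interface, not a guaranteed API:

```python
def attachment_message_parts(prompt):
    # Convert each attachment into an image content block for the API call.
    # Assumed interface: attachments expose .url plus helpers to resolve the
    # MIME type and base64-encode the underlying bytes.
    parts = []
    for attachment in prompt.attachments:
        if attachment.url:
            parts.append(
                {"type": "image_url", "image_url": {"url": attachment.url}}
            )
        else:
            mime_type = attachment.resolve_type()
            base64_data = attachment.base64_content()
            parts.append(
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:{mime_type};base64,{base64_data}"
                    },
                }
            )
    return parts
```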

### Specifying attachment types

A `Model` subclass can list the types of attachments it accepts by defining an `attachment_types` class attribute:
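
A minimal sketch of what that can look like; the model class and the specific MIME types here are illustrative:

```python
import llm


class MyVisionModel(llm.Model):
    model_id = "my-vision-model"
    # MIME types this model will accept as attachments
    attachment_types = {
        "image/png",
        "image/jpeg",
    }
```
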
2 changes: 2 additions & 0 deletions docs/python-api.md
@@ -112,6 +112,8 @@ The `response.text()` method described earlier does this for you - it runs throu

If a response has been evaluated, `response.text()` will continue to return the same string.

(python-api-async)=

## Async models

Some plugins provide async versions of their supported models, suitable for use with Python [asyncio](https://docs.python.org/3/library/asyncio.html).
