Merge pull request #2 from allegro/open-source-documentation
Open Source documentation - first version
riccardo-alle authored Feb 19, 2024
2 parents 6cc88df + 5090035 commit 8a92c99
Showing 21 changed files with 1,368 additions and 11 deletions.
32 changes: 32 additions & 0 deletions .github/workflows/deploy.yml
@@ -0,0 +1,32 @@
name: Deploy docs

on:
  workflow_dispatch:
  push:
    branches: main
    paths:
      - 'docs/**'
      - 'mkdocs.yml'
      - 'Pipfile'

jobs:
  build:
    runs-on: [self-hosted, linux]
    steps:

      - uses: actions/checkout@v4

      - uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - run: python -m pip install build

      - name: Install poetry
        run: make install-poetry

      - name: Install dependencies
        run: make install-env

      - name: Build and Deploy docs
        run: make deploy-docs
6 changes: 5 additions & 1 deletion Makefile
@@ -12,4 +12,8 @@ linter::
    poetry run pylint llm_wrapper --reports=no --output-format=colorized --fail-under=8.0

tests::
    poetry run python -m pytest -s --verbose


deploy-docs::
    poetry run mkdocs gh-deploy --force
44 changes: 44 additions & 0 deletions docs/api/input_output_dataclasses.md
@@ -0,0 +1,44 @@
---
layout: default
title: Input/Output dataclasses
nav_order: 0
parent: API
---


## `class llm_wrapper.domain.input_data.InputData` dataclass
```python
@dataclass
class InputData:
    input_mappings: Dict[str, str]
    id: str
```
#### Fields
- `input_mappings` (`Dict[str, str]`): Maps the symbolic variables used in the prompt to the actual data that will be
injected in their place. You have to provide a mapping for each symbolic variable used in the prompt.
- `id` (`str`): Unique identifier. Requests are made asynchronously, so responses are not guaranteed to arrive in the
same order as the input data; use this field to match responses back to their inputs.
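
For illustration, a prompt such as `"Summarize the following review: {review}"` would be paired with one `InputData` object per example; the review texts below are made up:
```python
from llm_wrapper.domain.input_data import InputData

# Keys of input_mappings must match the symbolic variables in the prompt
# (here: {review}); ids let you match async responses back to inputs.
input_data = [
    InputData(input_mappings={"review": "Great product, fast delivery."}, id="0"),
    InputData(input_mappings={"review": "Arrived broken, support never replied."}, id="1"),
]
```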

## `class llm_wrapper.domain.response.ResponseData` dataclass
```python
@dataclass
class ResponseData:
    response: Union[str, BaseModel]
    input_data: Optional[InputData] = None

    number_of_prompt_tokens: Optional[int] = None
    number_of_generated_tokens: Optional[int] = None
    error: Optional[str] = None
```
#### Fields
- `response` (`Union[str, BaseModel]`): Contains the model's response. If the `output_data_model_class` param was
provided to the `generate()` method, it contains the response parsed into that class. Otherwise, it contains the raw
string returned by the model.
- `input_data` (`Optional[InputData]`): If `input_data` was provided to the `generate()` method, the corresponding
input is copied into this field.
- `number_of_prompt_tokens` (`Optional[int]`): Number of tokens used in the prompt.
- `number_of_generated_tokens` (`Optional[int]`): Number of tokens generated by the model.
- `error` (`Optional[str]`): If an error occurred that prevented the generation pipeline from completing, it is
reported here.
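
Because responses may arrive out of order and can carry an error instead of a result, a typical consumer checks `error` first and uses `input_data.id` to match responses to inputs. A minimal sketch, assuming `responses` holds the list returned by `generate()`:
```python
# `responses` stands for the List[ResponseData] returned by generate().
for response_data in responses:
    if response_data.error is not None:
        print(f"Example {response_data.input_data.id} failed: {response_data.error}")
    else:
        print(
            f"Example {response_data.input_data.id}: {response_data.response} "
            f"({response_data.number_of_generated_tokens} tokens generated)"
        )
```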

82 changes: 82 additions & 0 deletions docs/api/models/azure_llama2_model.md
@@ -0,0 +1,82 @@
---
layout: default
title: AzureLlama2Model
parent: Models
grand_parent: API
nav_order: 3
---

## `class llm_wrapper.models.AzureLlama2Model` API
### Methods
```python
__init__(
    temperature: float = 0.0,
    top_p: float = 1.0,
    max_output_tokens: int = 512,
    model_total_max_tokens: int = 4096,
    max_concurrency: int = 1000,
    max_retries: int = 8
)
```
#### Parameters
- `temperature` (`float`): The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more
random, while lower values like 0.2 will make it more focused and deterministic. Default: `0.0`.
- `top_p` (`float`): Nucleus sampling parameter, between 0 and 1. The model considers only the tokens comprising the
top `top_p` probability mass. Default: `1.0`.
- `max_output_tokens` (`int`): The maximum number of tokens the model may generate. The total length of input and
generated tokens is limited by the model's context length. Default: `512`.
- `model_total_max_tokens` (`int`): Context length of the model - maximum number of input plus generated tokens.
Default: `4096`.
- `max_concurrency` (`int`): Maximum number of concurrent requests. Default: `1000`.
- `max_retries` (`int`): Maximum number of retries if a request fails. Default: `8`.

---

```python
generate(
    prompt: str,
    input_data: typing.Optional[typing.List[InputData]] = None,
    output_data_model_class: typing.Optional[typing.Type[BaseModel]] = None
) -> typing.List[ResponseData]:
```
#### Parameters
- `prompt` (`str`): Prompt to use to query the model.
- `input_data` (`Optional[List[InputData]]`): If the prompt contains symbolic variables, use this parameter to generate
model responses for a batch of examples. Each symbolic variable in the prompt must have a mapping in the
`input_mappings` of `InputData`.
- `output_data_model_class` (`Optional[Type[BaseModel]]`): If provided, forces the model to generate output in the
format defined by the given class. The generated response is automatically parsed into this class.

#### Returns
`List[ResponseData]`: Each `ResponseData` contains the response for a single example from `input_data`. If `input_data`
is not provided, the list has length 1 and its only element is the response to the raw prompt.
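
A sketch of batch generation with symbolic variables, following the configuration pattern from the example usage below; the credential placeholders and prompt are illustrative:
```python
from llm_wrapper.domain.configuration import AzureSelfDeployedConfiguration
from llm_wrapper.domain.input_data import InputData
from llm_wrapper.models import AzureLlama2Model

configuration = AzureSelfDeployedConfiguration(
    api_key="<AZURE_API_KEY>",
    endpoint_url="<AZURE_ENDPOINT_URL>",
    deployment="<AZURE_DEPLOYMENT_NAME>"
)
llama_model = AzureLlama2Model(config=configuration)

# {text} in the prompt is filled from each example's input_mappings.
responses = llama_model.generate(
    prompt="Translate to French: {text}",
    input_data=[
        InputData(input_mappings={"text": "Good morning"}, id="0"),
        InputData(input_mappings={"text": "See you tomorrow"}, id="1"),
    ],
)
```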

---

```python
AzureLlama2Model.setup_environment(
    azure_api_key: str,
    azure_endpoint_url: str,
    azure_deployment_name: str
)
```
#### Parameters
- `azure_api_key` (`str`): Authentication key for the endpoint.
- `azure_endpoint_url` (`str`): URL of the pre-existing endpoint.
- `azure_deployment_name` (`str`): The name under which the model was deployed.

---

### Example usage
```python
from llm_wrapper.models import AzureLlama2Model
from llm_wrapper.domain.configuration import AzureSelfDeployedConfiguration

configuration = AzureSelfDeployedConfiguration(
    api_key="<AZURE_API_KEY>",
    endpoint_url="<AZURE_ENDPOINT_URL>",
    deployment="<AZURE_DEPLOYMENT_NAME>"
)

llama_model = AzureLlama2Model(config=configuration)
llama_response = llama_model.generate("2+2 is?")
```
82 changes: 82 additions & 0 deletions docs/api/models/azure_mistral_model.md
@@ -0,0 +1,82 @@
---
layout: default
title: AzureMistralModel
parent: Models
grand_parent: API
nav_order: 5
---

## `class llm_wrapper.models.AzureMistralModel` API
### Methods
```python
__init__(
    temperature: float = 0.0,
    top_p: float = 1.0,
    max_output_tokens: int = 1024,
    model_total_max_tokens: int = 8192,
    max_concurrency: int = 1000,
    max_retries: int = 8
)
```
#### Parameters
- `temperature` (`float`): The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more
random, while lower values like 0.2 will make it more focused and deterministic. Default: `0.0`.
- `top_p` (`float`): Nucleus sampling parameter, between 0 and 1. The model considers only the tokens comprising the
top `top_p` probability mass. Default: `1.0`.
- `max_output_tokens` (`int`): The maximum number of tokens the model may generate. The total length of input and
generated tokens is limited by the model's context length. Default: `1024`.
- `model_total_max_tokens` (`int`): Context length of the model - maximum number of input plus generated tokens.
Default: `8192`.
- `max_concurrency` (`int`): Maximum number of concurrent requests. Default: `1000`.
- `max_retries` (`int`): Maximum number of retries if a request fails. Default: `8`.

---

```python
generate(
    prompt: str,
    input_data: typing.Optional[typing.List[InputData]] = None,
    output_data_model_class: typing.Optional[typing.Type[BaseModel]] = None
) -> typing.List[ResponseData]:
```
#### Parameters
- `prompt` (`str`): Prompt to use to query the model.
- `input_data` (`Optional[List[InputData]]`): If the prompt contains symbolic variables, use this parameter to generate
model responses for a batch of examples. Each symbolic variable in the prompt must have a mapping in the
`input_mappings` of `InputData`.
- `output_data_model_class` (`Optional[Type[BaseModel]]`): If provided, forces the model to generate output in the
format defined by the given class. The generated response is automatically parsed into this class.

#### Returns
`List[ResponseData]`: Each `ResponseData` contains the response for a single example from `input_data`. If `input_data`
is not provided, the list has length 1 and its only element is the response to the raw prompt.
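
As an illustration of `output_data_model_class`, a pydantic class can be used to force and parse structured output; the `Answer` schema below is hypothetical:
```python
from pydantic import BaseModel

from llm_wrapper.domain.configuration import AzureSelfDeployedConfiguration
from llm_wrapper.models import AzureMistralModel


class Answer(BaseModel):
    # Hypothetical schema; the model's output is parsed into these fields.
    result: int
    explanation: str


configuration = AzureSelfDeployedConfiguration(
    api_key="<AZURE_API_KEY>",
    endpoint_url="<AZURE_ENDPOINT_URL>",
    deployment="<AZURE_DEPLOYMENT_NAME>"
)
mistral_model = AzureMistralModel(config=configuration)

responses = mistral_model.generate("2+2 is?", output_data_model_class=Answer)
answer = responses[0].response  # an Answer instance rather than a raw string
```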

---

```python
AzureMistralModel.setup_environment(
    azure_api_key: str,
    azure_endpoint_url: str,
    azure_deployment_name: str
)
```
#### Parameters
- `azure_api_key` (`str`): Authentication key for the endpoint.
- `azure_endpoint_url` (`str`): URL of the pre-existing endpoint.
- `azure_deployment_name` (`str`): The name under which the model was deployed.

---

### Example usage
```python
from llm_wrapper.models import AzureMistralModel
from llm_wrapper.domain.configuration import AzureSelfDeployedConfiguration

configuration = AzureSelfDeployedConfiguration(
    api_key="<AZURE_API_KEY>",
    endpoint_url="<AZURE_ENDPOINT_URL>",
    deployment="<AZURE_DEPLOYMENT_NAME>"
)

mistral_model = AzureMistralModel(config=configuration)
mistral_response = mistral_model.generate("2+2 is?")
```
93 changes: 93 additions & 0 deletions docs/api/models/azure_openai_model.md
@@ -0,0 +1,93 @@
---
layout: default
title: AzureOpenAIModel
parent: Models
grand_parent: API
nav_order: 1
---

## `class llm_wrapper.models.AzureOpenAIModel` API
### Methods
```python
__init__(
    temperature: float = 0.0,
    max_output_tokens: int = 512,
    request_timeout_s: int = 60,
    model_total_max_tokens: int = 4096,
    max_concurrency: int = 1000,
    max_retries: int = 8
)
```
#### Parameters
- `temperature` (`float`): The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more
random, while lower values like 0.2 will make it more focused and deterministic. Default: `0.0`.
- `max_output_tokens` (`int`): The maximum number of tokens the model may generate. The total length of input and
generated tokens is limited by the model's context length. Default: `512`.
- `request_timeout_s` (`int`): Timeout, in seconds, for requests to the model. Default: `60`.
- `model_total_max_tokens` (`int`): Context length of the model - maximum number of input plus generated tokens.
Default: `4096`.
- `max_concurrency` (`int`): Maximum number of concurrent requests. Default: `1000`.
- `max_retries` (`int`): Maximum number of retries if a request fails. Default: `8`.

---

```python
generate(
    prompt: str,
    input_data: typing.Optional[typing.List[InputData]] = None,
    output_data_model_class: typing.Optional[typing.Type[BaseModel]] = None
) -> typing.List[ResponseData]:
```
#### Parameters
- `prompt` (`str`): Prompt to use to query the model.
- `input_data` (`Optional[List[InputData]]`): If the prompt contains symbolic variables, use this parameter to generate
model responses for a batch of examples. Each symbolic variable in the prompt must have a mapping in the
`input_mappings` of `InputData`.
- `output_data_model_class` (`Optional[Type[BaseModel]]`): If provided, forces the model to generate output in the
format defined by the given class. The generated response is automatically parsed into this class.

#### Returns
`List[ResponseData]`: Each `ResponseData` contains the response for a single example from `input_data`. If `input_data`
is not provided, the list has length 1 and its only element is the response to the raw prompt.

---

```python
AzureOpenAIModel.setup_environment(
    openai_api_key: str,
    openai_api_base: str,
    openai_api_version: str,
    openai_api_deployment_name: str,
    openai_api_type: str = "azure",
    model_name: str = "gpt-3.5-turbo",
)
```
Sets up the environment for the `AzureOpenAIModel` model.
#### Parameters
- `openai_api_key` (`str`): The API key for your Azure OpenAI resource. You can find this in the Azure portal under
your Azure OpenAI resource.
- `openai_api_base` (`str`): The base URL for your Azure OpenAI resource. You can find this in the Azure portal under
your Azure OpenAI resource.
- `openai_api_version` (`str`): The API version.
- `openai_api_deployment_name` (`str`): The name under which the model was deployed.
- `openai_api_type` (`str`): The API type; for Azure deployments this should be left as `"azure"`. Default: `"azure"`.
- `model_name` (`str`): Model name to use. Default: `"gpt-3.5-turbo"`.
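
A minimal sketch of calling this method with placeholder values (the example usage below shows the alternative of passing a configuration object to the constructor):
```python
from llm_wrapper.models import AzureOpenAIModel

# Placeholder values; take the real ones from your Azure OpenAI resource.
AzureOpenAIModel.setup_environment(
    openai_api_key="<OPENAI_API_KEY>",
    openai_api_base="<OPENAI_API_BASE>",
    openai_api_version="<OPENAI_API_VERSION>",
    openai_api_deployment_name="<OPENAI_API_DEPLOYMENT_NAME>",
)
```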

---

### Example usage
```python
from llm_wrapper.models import AzureOpenAIModel
from llm_wrapper.domain.configuration import AzureOpenAIConfiguration

configuration = AzureOpenAIConfiguration(
    api_key="<OPENAI_API_KEY>",
    base_url="<OPENAI_API_BASE>",
    api_version="<OPENAI_API_VERSION>",
    deployment="<OPENAI_API_DEPLOYMENT_NAME>",
    model_name="<OPENAI_API_MODEL_NAME>"
)

gpt_model = AzureOpenAIModel(config=configuration)
gpt_response = gpt_model.generate("2+2 is?")
```