Merge pull request #2 from allegro/open-source-documentation
Open Source documentation - first version
Showing 21 changed files with 1,368 additions and 11 deletions.
@@ -0,0 +1,32 @@

name: Deploy docs

on:
  workflow_dispatch:
  push:
    branches: [ main ]
    paths:
      - 'docs/**'
      - 'mkdocs.yml'
      - 'Pipfile'

jobs:
  build:
    runs-on: [self-hosted, linux]
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - run: python -m pip install build

      - name: Install poetry
        run: make install-poetry

      - name: Install dependencies
        run: make install-env

      - name: Build and Deploy docs
        run: make deploy-docs
@@ -0,0 +1,44 @@
---
layout: default
title: Input/Output dataclasses
nav_order: 0
parent: API
---

## `class llm_wrapper.domain.input_data.InputData` dataclass
```python
@dataclass
class InputData:
    input_mappings: Dict[str, str]
    id: str
```
#### Fields
- `input_mappings` (`Dict[str, str]`): Maps the symbolic variables used in the prompt to the actual data that will be injected in their place. You have to provide a mapping for every symbolic variable used in the prompt, as in the sketch below.
- `id` (`str`): Unique identifier. Requests are made asynchronously, so responses may be returned in a different order than the input data; this field can be used to match them up.
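
A minimal sketch of preparing a batch of inputs for a prompt with one symbolic variable (the `{text}` placeholder syntax and the values below are illustrative assumptions):

```python
from llm_wrapper.domain.input_data import InputData

# One InputData per example; the "text" key must match the symbolic
# variable referenced in the prompt, e.g. "Summarize: {text}".
input_data = [
    InputData(input_mappings={"text": "First document to process"}, id="0"),
    InputData(input_mappings={"text": "Second document to process"}, id="1"),
]
```
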
## `class llm_wrapper.domain.response.ResponseData` dataclass
```python
@dataclass
class ResponseData:
    response: Union[str, BaseModel]
    input_data: Optional[InputData] = None

    number_of_prompt_tokens: Optional[int] = None
    number_of_generated_tokens: Optional[int] = None
    error: Optional[str] = None
```
#### Fields
- `response` (`Union[str, BaseModel]`): The model's response. If the `output_data_model_class` param was provided to the `generate()` method, it contains the response parsed into that class; if it wasn't, it contains the raw string returned by the model.
- `input_data` (`Optional[InputData]`): If `input_data` was provided to the `generate()` method, that data is copied to this field.
- `number_of_prompt_tokens` (`Optional[int]`): Number of tokens used in the prompt.
- `number_of_generated_tokens` (`Optional[int]`): Number of tokens generated by the model.
- `error` (`Optional[str]`): If an error occurred that prevented the generation pipeline from completing, it is reported here.
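
Because responses can come back in a different order than the inputs, here is a minimal sketch of matching them back by `id` (assuming `responses` is the list returned by a `generate()` call made with `input_data`):

```python
# Build an id -> ResponseData index; entries without input_data
# (e.g. from a raw-prompt call) are skipped.
responses_by_id = {
    response.input_data.id: response
    for response in responses
    if response.input_data is not None
}
```
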
@@ -0,0 +1,82 @@
---
layout: default
title: AzureLlama2Model
parent: Models
grand_parent: API
nav_order: 3
---

## `class llm_wrapper.models.AzureLlama2Model` API
### Methods
```python
__init__(
    temperature: float = 0.0,
    top_p: float = 1.0,
    max_output_tokens: int = 512,
    model_total_max_tokens: int = 4096,
    max_concurrency: int = 1000,
    max_retries: int = 8
)
```
#### Parameters
- `temperature` (`float`): The sampling temperature, between 0 and 1. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. Default: `0.0`.
- `top_p` (`float`): Nucleus sampling parameter: the model considers only the tokens that make up the top `top_p` probability mass. Default: `1.0`.
- `max_output_tokens` (`int`): The maximum number of tokens the model may generate. The total length of input tokens and generated tokens is limited by the model's context length. Default: `512`.
- `model_total_max_tokens` (`int`): Context length of the model, i.e. the maximum number of input plus generated tokens. Default: `4096`.
- `max_concurrency` (`int`): Maximum number of concurrent requests. Default: `1000`.
- `max_retries` (`int`): Maximum number of retries if a request fails. Default: `8`.
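
For illustration, a hypothetical construction following the signature above, tuned for more deterministic, shorter completions (note that the example usage at the bottom of this page additionally passes an Azure configuration object):

```python
llama_model = AzureLlama2Model(
    temperature=0.2,        # closer to deterministic sampling
    max_output_tokens=256,  # cap the generated length
)
```
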

---

```python
generate(
    prompt: str,
    input_data: typing.Optional[typing.List[InputData]] = None,
    output_data_model_class: typing.Optional[typing.Type[BaseModel]] = None
) -> typing.List[ResponseData]:
```
#### Parameters
- `prompt` (`str`): Prompt to use to query the model.
- `input_data` (`Optional[List[InputData]]`): If the prompt contains symbolic variables, use this parameter to generate model responses for a batch of examples. Each symbolic variable in the prompt must have a mapping provided in the `input_mappings` of `InputData`.
- `output_data_model_class` (`Optional[Type[BaseModel]]`): If provided, forces the model to generate output in the format defined by the passed class. The generated response is automatically parsed into this class.
#### Returns
`List[ResponseData]`: Each `ResponseData` contains the response for a single example from `input_data`. If `input_data` is not provided, the list has length 1 and its first element is the response for the raw prompt.
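
A minimal sketch of a batched call with a symbolic variable (assuming `{text}` placeholder syntax and a previously constructed `llama_model`):

```python
from llm_wrapper.domain.input_data import InputData

input_data = [
    InputData(input_mappings={"text": "The first review"}, id="0"),
    InputData(input_mappings={"text": "The second review"}, id="1"),
]

# Returns one ResponseData per InputData example.
responses = llama_model.generate(
    prompt="Classify the sentiment of this review: {text}",
    input_data=input_data,
)
```
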

---

```python
AzureLlama2Model.setup_environment(
    azure_api_key: str,
    azure_endpoint_url: str,
    azure_deployment_name: str
)
```
#### Parameters
- `azure_api_key` (`str`): Authentication key for the endpoint.
- `azure_endpoint_url` (`str`): URL of the pre-existing endpoint.
- `azure_deployment_name` (`str`): The name under which the model was deployed.

---

### Example usage
```python
from llm_wrapper.models import AzureLlama2Model
from llm_wrapper.domain.configuration import AzureSelfDeployedConfiguration

configuration = AzureSelfDeployedConfiguration(
    api_key="<AZURE_API_KEY>",
    endpoint_url="<AZURE_ENDPOINT_URL>",
    deployment="<AZURE_DEPLOYMENT_NAME>"
)

llama_model = AzureLlama2Model(config=configuration)
llama_response = llama_model.generate("2+2 is?")
```
@@ -0,0 +1,82 @@
---
layout: default
title: AzureMistralModel
parent: Models
grand_parent: API
nav_order: 5
---

## `class llm_wrapper.models.AzureMistralModel` API
### Methods
```python
__init__(
    temperature: float = 0.0,
    top_p: float = 1.0,
    max_output_tokens: int = 1024,
    model_total_max_tokens: int = 8192,
    max_concurrency: int = 1000,
    max_retries: int = 8
)
```
#### Parameters
- `temperature` (`float`): The sampling temperature, between 0 and 1. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. Default: `0.0`.
- `top_p` (`float`): Nucleus sampling parameter: the model considers only the tokens that make up the top `top_p` probability mass. Default: `1.0`.
- `max_output_tokens` (`int`): The maximum number of tokens the model may generate. The total length of input tokens and generated tokens is limited by the model's context length. Default: `1024`.
- `model_total_max_tokens` (`int`): Context length of the model, i.e. the maximum number of input plus generated tokens. Default: `8192`.
- `max_concurrency` (`int`): Maximum number of concurrent requests. Default: `1000`.
- `max_retries` (`int`): Maximum number of retries if a request fails. Default: `8`.

---

```python
generate(
    prompt: str,
    input_data: typing.Optional[typing.List[InputData]] = None,
    output_data_model_class: typing.Optional[typing.Type[BaseModel]] = None
) -> typing.List[ResponseData]:
```
#### Parameters
- `prompt` (`str`): Prompt to use to query the model.
- `input_data` (`Optional[List[InputData]]`): If the prompt contains symbolic variables, use this parameter to generate model responses for a batch of examples. Each symbolic variable in the prompt must have a mapping provided in the `input_mappings` of `InputData`.
- `output_data_model_class` (`Optional[Type[BaseModel]]`): If provided, forces the model to generate output in the format defined by the passed class. The generated response is automatically parsed into this class.
#### Returns
`List[ResponseData]`: Each `ResponseData` contains the response for a single example from `input_data`. If `input_data` is not provided, the list has length 1 and its first element is the response for the raw prompt.
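
A minimal sketch of forcing structured output via `output_data_model_class` (assuming a pydantic `BaseModel` subclass, `{text}` placeholder syntax, and a previously constructed `mistral_model` and `input_data`; the field names here are illustrative):

```python
from typing import List

from pydantic import BaseModel


class ReviewSummary(BaseModel):
    sentiment: str
    key_points: List[str]


# Each ResponseData.response is parsed into a ReviewSummary instance.
responses = mistral_model.generate(
    prompt="Summarize this review: {text}",
    input_data=input_data,
    output_data_model_class=ReviewSummary,
)
```
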

---

```python
AzureMistralModel.setup_environment(
    azure_api_key: str,
    azure_endpoint_url: str,
    azure_deployment_name: str
)
```
#### Parameters
- `azure_api_key` (`str`): Authentication key for the endpoint.
- `azure_endpoint_url` (`str`): URL of the pre-existing endpoint.
- `azure_deployment_name` (`str`): The name under which the model was deployed.

---

### Example usage
```python
from llm_wrapper.models import AzureMistralModel
from llm_wrapper.domain.configuration import AzureSelfDeployedConfiguration

configuration = AzureSelfDeployedConfiguration(
    api_key="<AZURE_API_KEY>",
    endpoint_url="<AZURE_ENDPOINT_URL>",
    deployment="<AZURE_DEPLOYMENT_NAME>"
)

mistral_model = AzureMistralModel(config=configuration)
mistral_response = mistral_model.generate("2+2 is?")
```
@@ -0,0 +1,93 @@
---
layout: default
title: AzureOpenAIModel
parent: Models
grand_parent: API
nav_order: 1
---

## `class llm_wrapper.models.AzureOpenAIModel` API
### Methods
```python
__init__(
    temperature: float = 0.0,
    max_output_tokens: int = 512,
    request_timeout_s: int = 60,
    model_total_max_tokens: int = 4096,
    max_concurrency: int = 1000,
    max_retries: int = 8
)
```
#### Parameters
- `temperature` (`float`): The sampling temperature, between 0 and 1. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. Default: `0.0`.
- `max_output_tokens` (`int`): The maximum number of tokens the model may generate. The total length of input tokens and generated tokens is limited by the model's context length. Default: `512`.
- `request_timeout_s` (`int`): Timeout for requests to the model, in seconds. Default: `60`.
- `model_total_max_tokens` (`int`): Context length of the model, i.e. the maximum number of input plus generated tokens. Default: `4096`.
- `max_concurrency` (`int`): Maximum number of concurrent requests. Default: `1000`.
- `max_retries` (`int`): Maximum number of retries if a request fails. Default: `8`.

---

```python
generate(
    prompt: str,
    input_data: typing.Optional[typing.List[InputData]] = None,
    output_data_model_class: typing.Optional[typing.Type[BaseModel]] = None
) -> typing.List[ResponseData]:
```
#### Parameters
- `prompt` (`str`): Prompt to use to query the model.
- `input_data` (`Optional[List[InputData]]`): If the prompt contains symbolic variables, use this parameter to generate model responses for a batch of examples. Each symbolic variable in the prompt must have a mapping provided in the `input_mappings` of `InputData`.
- `output_data_model_class` (`Optional[Type[BaseModel]]`): If provided, forces the model to generate output in the format defined by the passed class. The generated response is automatically parsed into this class.
#### Returns
`List[ResponseData]`: Each `ResponseData` contains the response for a single example from `input_data`. If `input_data` is not provided, the list has length 1 and its first element is the response for the raw prompt.

---

```python
AzureOpenAIModel.setup_environment(
    openai_api_key: str,
    openai_api_base: str,
    openai_api_version: str,
    openai_api_deployment_name: str,
    openai_api_type: str = "azure",
    model_name: str = "gpt-3.5-turbo",
)
```
Sets up the environment for the `AzureOpenAIModel` model.
#### Parameters
- `openai_api_key` (`str`): The API key for your Azure OpenAI resource. You can find it in the Azure portal under your Azure OpenAI resource.
- `openai_api_base` (`str`): The base URL for your Azure OpenAI resource. You can find it in the Azure portal under your Azure OpenAI resource.
- `openai_api_version` (`str`): The API version.
- `openai_api_deployment_name` (`str`): The name under which the model was deployed.
- `openai_api_type` (`str`): The API type. Default: `"azure"`.
- `model_name` (`str`): Model name to use. Default: `"gpt-3.5-turbo"`. A call sketch follows this list.
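
A minimal sketch of a `setup_environment()` call, using placeholder values only (substitute the details of your Azure OpenAI resource):

```python
AzureOpenAIModel.setup_environment(
    openai_api_key="<OPENAI_API_KEY>",
    openai_api_base="<OPENAI_API_BASE>",
    openai_api_version="<OPENAI_API_VERSION>",
    openai_api_deployment_name="<OPENAI_API_DEPLOYMENT_NAME>",
)
```
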

---

### Example usage
```python
from llm_wrapper.models import AzureOpenAIModel
from llm_wrapper.domain.configuration import AzureOpenAIConfiguration

configuration = AzureOpenAIConfiguration(
    api_key="<OPENAI_API_KEY>",
    base_url="<OPENAI_API_BASE>",
    api_version="<OPENAI_API_VERSION>",
    deployment="<OPENAI_API_DEPLOYMENT_NAME>",
    model_name="<OPENAI_API_MODEL_NAME>"
)

gpt_model = AzureOpenAIModel(config=configuration)
gpt_response = gpt_model.generate("2+2 is?")
```