Merge pull request #2 from allegro/open-source-documentation
Open Source documentation - first version
riccardo-alle authored Feb 19, 2024
2 parents 6cc88df + 5090035 commit 8a92c99
Showing 21 changed files with 1,368 additions and 11 deletions.
32 changes: 32 additions & 0 deletions .github/workflows/deploy.yml
@@ -0,0 +1,32 @@
name: Deploy docs

on:
  workflow_dispatch:
  push:
    branches: main
    paths:
      - 'docs/**'
      - 'mkdocs.yml'
      - 'Pipfile'

jobs:
  build:
    runs-on: [self-hosted, linux]
    steps:

      - uses: actions/checkout@v4

      - uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - run: python -m pip install build

      - name: Install poetry
        run: make install-poetry

      - name: Install dependencies
        run: make install-env

      - name: Build and Deploy docs
        run: make deploy-docs
6 changes: 5 additions & 1 deletion Makefile
@@ -12,4 +12,8 @@ linter::
    poetry run pylint llm_wrapper --reports=no --output-format=colorized --fail-under=8.0

tests::
    poetry run python -m pytest -s --verbose


deploy-docs::
    poetry run mkdocs gh-deploy --force
44 changes: 44 additions & 0 deletions docs/api/input_output_dataclasses.md
@@ -0,0 +1,44 @@
---
layout: default
title: Input/Output dataclasses
nav_order: 0
parent: API
---


## `class llm_wrapper.domain.input_data.InputData` dataclass
```python
@dataclass
class InputData:
    input_mappings: Dict[str, str]
    id: str
```
#### Fields
- `input_mappings` (`Dict[str, str]`): Maps the symbolic variables used in the prompt to the actual data that will be
injected in their place. You have to provide a mapping for each symbolic variable used in the prompt.
- `id` (`str`): Unique identifier. Requests are made asynchronously, so responses are not guaranteed to arrive in the
same order as the input data; use this field to match responses back to their inputs.
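
For illustration, a prompt such as `"Summarize the following review: {review}"` would be paired with one `InputData` object per example; the review texts below are made up:
```python
from llm_wrapper.domain.input_data import InputData

# Keys of input_mappings must match the symbolic variables in the prompt
# (here: {review}); ids let you match async responses back to inputs.
input_data = [
    InputData(input_mappings={"review": "Great product, fast delivery."}, id="0"),
    InputData(input_mappings={"review": "Arrived broken, support never replied."}, id="1"),
]
```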

## `class llm_wrapper.domain.response.ResponseData` dataclass
```python
@dataclass
class ResponseData:
    response: Union[str, BaseModel]
    input_data: Optional[InputData] = None

    number_of_prompt_tokens: Optional[int] = None
    number_of_generated_tokens: Optional[int] = None
    error: Optional[str] = None
```
#### Fields
- `response` (`Union[str, BaseModel]`): Contains the model's response. If the `output_data_model_class` param was
provided to the `generate()` method, it contains the response parsed into that class. Otherwise, it contains the raw
string returned by the model.
- `input_data` (`Optional[InputData]`): If `input_data` was provided to the `generate()` method, the corresponding
input is copied into this field.
- `number_of_prompt_tokens` (`Optional[int]`): Number of tokens used in the prompt.
- `number_of_generated_tokens` (`Optional[int]`): Number of tokens generated by the model.
- `error` (`Optional[str]`): If an error occurred that prevented the generation pipeline from completing, it is
reported here.
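
Because responses may arrive out of order and can carry an error instead of a result, a typical consumer checks `error` first and uses `input_data.id` to match responses to inputs. A minimal sketch, assuming `responses` holds the list returned by `generate()`:
```python
# `responses` stands for the List[ResponseData] returned by generate().
for response_data in responses:
    if response_data.error is not None:
        print(f"Example {response_data.input_data.id} failed: {response_data.error}")
    else:
        print(
            f"Example {response_data.input_data.id}: {response_data.response} "
            f"({response_data.number_of_generated_tokens} tokens generated)"
        )
```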

82 changes: 82 additions & 0 deletions docs/api/models/azure_llama2_model.md
@@ -0,0 +1,82 @@
---
layout: default
title: AzureLlama2Model
parent: Models
grand_parent: API
nav_order: 3
---

## `class llm_wrapper.models.AzureLlama2Model` API
### Methods
```python
__init__(
    temperature: float = 0.0,
    top_p: float = 1.0,
    max_output_tokens: int = 512,
    model_total_max_tokens: int = 4096,
    max_concurrency: int = 1000,
    max_retries: int = 8
)
```
#### Parameters
- `temperature` (`float`): The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more
random, while lower values like 0.2 will make it more focused and deterministic. Default: `0.0`.
- `top_p` (`float`): Nucleus sampling parameter, between 0 and 1. The model considers only the tokens comprising the
top `top_p` probability mass. Default: `1.0`.
- `max_output_tokens` (`int`): The maximum number of tokens the model may generate. The total length of input and
generated tokens is limited by the model's context length. Default: `512`.
- `model_total_max_tokens` (`int`): Context length of the model - maximum number of input plus generated tokens.
Default: `4096`.
- `max_concurrency` (`int`): Maximum number of concurrent requests. Default: `1000`.
- `max_retries` (`int`): Maximum number of retries if a request fails. Default: `8`.

---

```python
generate(
    prompt: str,
    input_data: typing.Optional[typing.List[InputData]] = None,
    output_data_model_class: typing.Optional[typing.Type[BaseModel]] = None
) -> typing.List[ResponseData]:
```
#### Parameters
- `prompt` (`str`): Prompt to use to query the model.
- `input_data` (`Optional[List[InputData]]`): If the prompt contains symbolic variables, use this parameter to generate
model responses for a batch of examples. Each symbolic variable in the prompt must have a mapping in the
`input_mappings` of `InputData`.
- `output_data_model_class` (`Optional[Type[BaseModel]]`): If provided, forces the model to generate output in the
format defined by the given class. The generated response is automatically parsed into this class.

#### Returns
`List[ResponseData]`: Each `ResponseData` contains the response for a single example from `input_data`. If `input_data`
is not provided, the list has length 1 and its only element is the response to the raw prompt.
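
A sketch of batch generation with symbolic variables, following the configuration pattern from the example usage below; the credential placeholders and prompt are illustrative:
```python
from llm_wrapper.domain.configuration import AzureSelfDeployedConfiguration
from llm_wrapper.domain.input_data import InputData
from llm_wrapper.models import AzureLlama2Model

configuration = AzureSelfDeployedConfiguration(
    api_key="<AZURE_API_KEY>",
    endpoint_url="<AZURE_ENDPOINT_URL>",
    deployment="<AZURE_DEPLOYMENT_NAME>"
)
llama_model = AzureLlama2Model(config=configuration)

# {text} in the prompt is filled from each example's input_mappings.
responses = llama_model.generate(
    prompt="Translate to French: {text}",
    input_data=[
        InputData(input_mappings={"text": "Good morning"}, id="0"),
        InputData(input_mappings={"text": "See you tomorrow"}, id="1"),
    ],
)
```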

---

```python
AzureLlama2Model.setup_environment(
    azure_api_key: str,
    azure_endpoint_url: str,
    azure_deployment_name: str
)
```
#### Parameters
- `azure_api_key` (`str`): Authentication key for the endpoint.
- `azure_endpoint_url` (`str`): URL of the pre-existing endpoint.
- `azure_deployment_name` (`str`): The name under which the model was deployed.

---

### Example usage
```python
from llm_wrapper.models import AzureLlama2Model
from llm_wrapper.domain.configuration import AzureSelfDeployedConfiguration

configuration = AzureSelfDeployedConfiguration(
    api_key="<AZURE_API_KEY>",
    endpoint_url="<AZURE_ENDPOINT_URL>",
    deployment="<AZURE_DEPLOYMENT_NAME>"
)

llama_model = AzureLlama2Model(config=configuration)
llama_response = llama_model.generate("2+2 is?")
```
82 changes: 82 additions & 0 deletions docs/api/models/azure_mistral_model.md
@@ -0,0 +1,82 @@
---
layout: default
title: AzureMistralModel
parent: Models
grand_parent: API
nav_order: 5
---

## `class llm_wrapper.models.AzureMistralModel` API
### Methods
```python
__init__(
    temperature: float = 0.0,
    top_p: float = 1.0,
    max_output_tokens: int = 1024,
    model_total_max_tokens: int = 8192,
    max_concurrency: int = 1000,
    max_retries: int = 8
)
```
#### Parameters
- `temperature` (`float`): The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more
random, while lower values like 0.2 will make it more focused and deterministic. Default: `0.0`.
- `top_p` (`float`): Nucleus sampling parameter, between 0 and 1. The model considers only the tokens comprising the
top `top_p` probability mass. Default: `1.0`.
- `max_output_tokens` (`int`): The maximum number of tokens the model may generate. The total length of input and
generated tokens is limited by the model's context length. Default: `1024`.
- `model_total_max_tokens` (`int`): Context length of the model - maximum number of input plus generated tokens.
Default: `8192`.
- `max_concurrency` (`int`): Maximum number of concurrent requests. Default: `1000`.
- `max_retries` (`int`): Maximum number of retries if a request fails. Default: `8`.

---

```python
generate(
    prompt: str,
    input_data: typing.Optional[typing.List[InputData]] = None,
    output_data_model_class: typing.Optional[typing.Type[BaseModel]] = None
) -> typing.List[ResponseData]:
```
#### Parameters
- `prompt` (`str`): Prompt to use to query the model.
- `input_data` (`Optional[List[InputData]]`): If the prompt contains symbolic variables, use this parameter to generate
model responses for a batch of examples. Each symbolic variable in the prompt must have a mapping in the
`input_mappings` of `InputData`.
- `output_data_model_class` (`Optional[Type[BaseModel]]`): If provided, forces the model to generate output in the
format defined by the given class. The generated response is automatically parsed into this class.

#### Returns
`List[ResponseData]`: Each `ResponseData` contains the response for a single example from `input_data`. If `input_data`
is not provided, the list has length 1 and its only element is the response to the raw prompt.
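
As an illustration of `output_data_model_class`, a pydantic class can be used to force and parse structured output; the `Answer` schema below is hypothetical:
```python
from pydantic import BaseModel

from llm_wrapper.domain.configuration import AzureSelfDeployedConfiguration
from llm_wrapper.models import AzureMistralModel


class Answer(BaseModel):
    # Hypothetical schema; the model's output is parsed into these fields.
    result: int
    explanation: str


configuration = AzureSelfDeployedConfiguration(
    api_key="<AZURE_API_KEY>",
    endpoint_url="<AZURE_ENDPOINT_URL>",
    deployment="<AZURE_DEPLOYMENT_NAME>"
)
mistral_model = AzureMistralModel(config=configuration)

responses = mistral_model.generate("2+2 is?", output_data_model_class=Answer)
answer = responses[0].response  # an Answer instance rather than a raw string
```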

---

```python
AzureMistralModel.setup_environment(
    azure_api_key: str,
    azure_endpoint_url: str,
    azure_deployment_name: str
)
```
#### Parameters
- `azure_api_key` (`str`): Authentication key for the endpoint.
- `azure_endpoint_url` (`str`): URL of the pre-existing endpoint.
- `azure_deployment_name` (`str`): The name under which the model was deployed.

---

### Example usage
```python
from llm_wrapper.models import AzureMistralModel
from llm_wrapper.domain.configuration import AzureSelfDeployedConfiguration

configuration = AzureSelfDeployedConfiguration(
    api_key="<AZURE_API_KEY>",
    endpoint_url="<AZURE_ENDPOINT_URL>",
    deployment="<AZURE_DEPLOYMENT_NAME>"
)

mistral_model = AzureMistralModel(config=configuration)
mistral_response = mistral_model.generate("2+2 is?")
```
93 changes: 93 additions & 0 deletions docs/api/models/azure_openai_model.md
@@ -0,0 +1,93 @@
---
layout: default
title: AzureOpenAIModel
parent: Models
grand_parent: API
nav_order: 1
---

## `class llm_wrapper.models.AzureOpenAIModel` API
### Methods
```python
__init__(
    temperature: float = 0.0,
    max_output_tokens: int = 512,
    request_timeout_s: int = 60,
    model_total_max_tokens: int = 4096,
    max_concurrency: int = 1000,
    max_retries: int = 8
)
```
#### Parameters
- `temperature` (`float`): The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more
random, while lower values like 0.2 will make it more focused and deterministic. Default: `0.0`.
- `max_output_tokens` (`int`): The maximum number of tokens the model may generate. The total length of input and
generated tokens is limited by the model's context length. Default: `512`.
- `request_timeout_s` (`int`): Timeout, in seconds, for requests to the model. Default: `60`.
- `model_total_max_tokens` (`int`): Context length of the model - maximum number of input plus generated tokens.
Default: `4096`.
- `max_concurrency` (`int`): Maximum number of concurrent requests. Default: `1000`.
- `max_retries` (`int`): Maximum number of retries if a request fails. Default: `8`.

---

```python
generate(
    prompt: str,
    input_data: typing.Optional[typing.List[InputData]] = None,
    output_data_model_class: typing.Optional[typing.Type[BaseModel]] = None
) -> typing.List[ResponseData]:
```
#### Parameters
- `prompt` (`str`): Prompt to use to query the model.
- `input_data` (`Optional[List[InputData]]`): If the prompt contains symbolic variables, use this parameter to generate
model responses for a batch of examples. Each symbolic variable in the prompt must have a mapping in the
`input_mappings` of `InputData`.
- `output_data_model_class` (`Optional[Type[BaseModel]]`): If provided, forces the model to generate output in the
format defined by the given class. The generated response is automatically parsed into this class.

#### Returns
`List[ResponseData]`: Each `ResponseData` contains the response for a single example from `input_data`. If `input_data`
is not provided, the list has length 1 and its only element is the response to the raw prompt.

---

```python
AzureOpenAIModel.setup_environment(
    openai_api_key: str,
    openai_api_base: str,
    openai_api_version: str,
    openai_api_deployment_name: str,
    openai_api_type: str = "azure",
    model_name: str = "gpt-3.5-turbo",
)
```
Sets up the environment for the `AzureOpenAIModel` model.
#### Parameters
- `openai_api_key` (`str`): The API key for your Azure OpenAI resource. You can find this in the Azure portal under
your Azure OpenAI resource.
- `openai_api_base` (`str`): The base URL for your Azure OpenAI resource. You can find this in the Azure portal under
your Azure OpenAI resource.
- `openai_api_version` (`str`): The API version.
- `openai_api_deployment_name` (`str`): The name under which the model was deployed.
- `openai_api_type` (`str`): The API type; for Azure deployments this should be left as `"azure"`. Default: `"azure"`.
- `model_name` (`str`): Model name to use. Default: `"gpt-3.5-turbo"`.
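
A minimal sketch of calling this method with placeholder values (the example usage below shows the alternative of passing a configuration object to the constructor):
```python
from llm_wrapper.models import AzureOpenAIModel

# Placeholder values; take the real ones from your Azure OpenAI resource.
AzureOpenAIModel.setup_environment(
    openai_api_key="<OPENAI_API_KEY>",
    openai_api_base="<OPENAI_API_BASE>",
    openai_api_version="<OPENAI_API_VERSION>",
    openai_api_deployment_name="<OPENAI_API_DEPLOYMENT_NAME>",
)
```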

---

### Example usage
```python
from llm_wrapper.models import AzureOpenAIModel
from llm_wrapper.domain.configuration import AzureOpenAIConfiguration

configuration = AzureOpenAIConfiguration(
    api_key="<OPENAI_API_KEY>",
    base_url="<OPENAI_API_BASE>",
    api_version="<OPENAI_API_VERSION>",
    deployment="<OPENAI_API_DEPLOYMENT_NAME>",
    model_name="<OPENAI_API_MODEL_NAME>"
)

gpt_model = AzureOpenAIModel(config=configuration)
gpt_response = gpt_model.generate("2+2 is?")
```