docs: update docs (#165)
quitrk authored Feb 24, 2025
1 parent 0796add commit 399ae65
Showing 6 changed files with 121 additions and 29 deletions.
21 changes: 13 additions & 8 deletions README.md
@@ -6,31 +6,36 @@ It is comprised of specialized modules which can be enabled or disabled as needed

- **Summary and Action Items** with vllm (or Ollama)
- **Live Transcriptions** with Faster Whisper via websockets
- **RAG Assistant**
- 🚧 _More to follow_

## Requirements

- Poetry
- Redis

## Summaries / Assistant Quickstart

```bash
# disable authorization (for testing)
export BYPASS_AUTHORIZATION=1

# start Redis
docker run -d --rm -p 6379:6379 redis

# If using vLLM (running on NVIDIA GPU)
export LLAMA_PATH="$HOME/models/Llama-3.1-8B-Instruct"
poetry install --with vllm

# If using Ollama, make sure Ollama is started; LLAMA_PATH is then the model name
export LLAMA_PATH="llama3.1"
poetry install

./run.sh
```

Visit http://127.0.0.1:8000

## Live Transcriptions Quickstart

> **Note**: Make sure to have ffmpeg < 7 installed and to update the `DYLD_LIBRARY_PATH` with the path to the ffmpeg
13 changes: 7 additions & 6 deletions docs/README.md
@@ -1,8 +1,9 @@
# Skynet Documentation

1. [Modules - RAG Assistant](assistant.md)
2. [Modules - Summaries](summaries_module.md)
3. [Modules - Streaming Whisper Live Transcription](streaming_whisper_module.md)
4. [Config - Environment Variables](env_vars.md)
5. [Authentication](auth.md)
6. [Monitoring](monitoring.md)
7. [Demos](../demos/)
53 changes: 53 additions & 0 deletions docs/assistant.md
@@ -0,0 +1,53 @@
# Skynet RAG Assistant Module

Enable the module by setting the `ENABLED_MODULES` env var to `assistant`.

Allows you to index a crawled website into a vector store, save the store locally and in an S3 bucket, and have it augment the prompt with relevant information for various AI assistant tasks.
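That retrieval flow can be sketched in miniature. The toy example below uses a bag-of-letters embedding purely for illustration (Skynet's actual embeddings come from the model configured via `EMBEDDINGS_MODEL_PATH`), but the shape is the same: embed the chunks, pick the nearest one by cosine similarity, and prepend it to the prompt.

```python
import math

def embed(text: str) -> list[float]:
    # hypothetical bag-of-letters embedding, for illustration only
    vec = [0.0] * 26
    for ch in text.lower():
        if 'a' <= ch <= 'z':
            vec[ord(ch) - ord('a')] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str]) -> str:
    # return the chunk closest to the query in embedding space
    return max(chunks, key=lambda c: cosine(embed(query), embed(c)))

chunks = ["Redis stores job state", "Poetry manages dependencies"]
context = retrieve("dependencies", chunks)
prompt = f"Context: {context}\n\nQuestion: how are dependencies installed?"
```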

> All requests to this service will require a standard HTTP Authorization header with a Bearer JWT. Check the [**Authorization page**](auth.md) for detailed information on how to generate JWTs or disable authorization.

## Requirements

- Redis
- Poetry

## Configuration

All of the configuration is done via env vars. Check the [Skynet Environment Variables](env_vars.md) page for a list of values.

## Authorization

Each vector store corresponds to a unique identifier, which the current implementation expects to be provided as a customer id: either a `cid` claim in the JWT, or a `customer_id` query parameter.

Thus, when deploying this module, the deployer is also responsible for establishing the access-control list based on this spec.

## First run

```bash
# start Redis
docker run -d --rm -p 6379:6379 redis

# If using vLLM (running on NVIDIA GPU)
export LLAMA_PATH="$HOME/models/Llama-3.1-8B-Instruct"
poetry install --with vllm

# If using Ollama
export LLAMA_PATH="llama3.1"
poetry install

./run.sh
```

Visit http://127.0.0.1:8000

## Build Image

```bash
docker buildx build --push --progress plain --platform linux/amd64 -t your-registry/skynet:your-tag .
```

When running the resulting image, make sure to mount a model under `/models` on the container fs.
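For example, a container started from that image might be run like this (the registry, tag, port mapping, and model path are placeholders — adjust them to your environment):

```bash
docker run --rm -p 8000:8000 \
  -v "$HOME/models:/models" \
  -e LLAMA_PATH="/models/Llama-3.1-8B-Instruct" \
  your-registry/skynet:your-tag
```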

### Code samples

JavaScript: https://github.com/jitsi/skynet/blob/master/docs/sample.js
41 changes: 37 additions & 4 deletions docs/env_vars.md
@@ -6,23 +6,40 @@ Skynet is configurable via environment variables. Some are shared by all modules

| **Name** | **Description** | **Default** | **Available values** |
|--------------------------------|-------------------------------------------------------------|-------------------------------------------|---------------------------------------------------------------------------------|
| `ENABLED_MODULES` | Which modules should be enabled, separated by commas | `summaries:dispatcher,summaries:executor,assistant` | `summaries:dispatcher`, `summaries:executor`, `assistant`, `streaming_whisper` |
| `BYPASS_AUTHORIZATION` | If signed JWT authorization should be enabled | `false` | `true`, `false` |
| `ENABLE_METRICS` | If the Prometheus metrics endpoint should be enabled or not | `true` | `true`, `false` |
| `ASAP_PUB_KEYS_REPO_URL` | Public key repository URL | `NULL` | N/A |
| `ASAP_PUB_KEYS_FOLDER` | Public key repository root path | `NULL` | N/A |
| `ASAP_PUB_KEYS_AUDS` | Allowed JWT audiences, separated by commas | `NULL` | N/A |
| `ASAP_PUB_KEYS_MAX_CACHE_SIZE` | Public key maximum cache size in bytes | `512` | N/A |
| `LOG_LEVEL` | Log level | `DEBUG` | `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL` |


## Assistant Module Environment Variables

| Name | **Description** | **Default** | **Available values** |
|----------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------|----------------------|
| `EMBEDDINGS_MODEL_PATH` | The path where the embeddings model is located. | `nomic-ai/nomic-embed-text-v1.5` | N/A |
| `VECTOR_STORE_PATH` | The default path where the vector store is saved locally | `_vector_store_` | N/A |


## Summaries Module Environment Variables

| Name | **Description** | **Default** | **Available values** |
|----------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------|----------------------|
| `ENABLE_BATCHING` | Enable submitting jobs for inference while others are running. The actual batching needs to be supported by the underlying inference processor | `true` | `true`,`false` |
| `LLAMA_PATH` | The path where the llama model is located. | `llama3.1` | N/A |
| `LLAMA_N_CTX` | The context size of the llama model | `128000` | N/A |
| `JOB_TIMEOUT` | Timeout in seconds after which an inference job will be considered stuck and the app killed. | `300` | N/A |
| `SUMMARY_MINIMUM_PAYLOAD_LENGTH` | The minimum payload length allowed for summarization. | `100` | N/A |
| `SKYNET_LISTEN_IP` | Default ip address on which the webserver is started. | `0.0.0.0` | N/A |
| `SKYNET_PORT` | Default port on which the webserver is started. | `8000` | N/A |

## Redis vars

| Name | **Description** | **Default** | **Available values** |
|----------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------|----------------------|
| `REDIS_EXP_SECONDS` | After how many seconds will a completed job expire/be deleted from Redis | `1800` | N/A |
| `REDIS_HOST` | Redis host | `localhost` | N/A |
| `REDIS_PORT` | Redis port | `6379` | N/A |
@@ -34,8 +51,24 @@ Skynet is configurable via environment variables. Some are shared by all modules
| `REDIS_USE_SECRETS_MANAGER` | Use AWS Secrets Manager to retrieve credentials | `false` | N/A |
| `REDIS_NAMESPACE` | Prefix for each Redis key | `skynet` | N/A |
| `REDIS_AWS_REGION` | The AWS region. Needed when using AWS Secrets Manager to retrieve credentials. | `us-west-2` | N/A |
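The `REDIS_NAMESPACE` value is prepended to every Redis key, so a key ends up stored roughly as shown below (a sketch of the convention, not the exact key schema):

```python
def namespaced_key(namespace: str, key: str) -> str:
    # e.g. namespace 'skynet' + key 'summary:123' -> 'skynet:summary:123'
    return f'{namespace}:{key}'

namespaced_key('skynet', 'summary:123')  # → 'skynet:summary:123'
```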

## OCI vars
| Name | **Description** | **Default** | **Available values** |
|----------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------|----------------------|
| `OCI_MODEL_ID` | OCI Model id | NULL | N/A |
| `OCI_SERVICE_ENDPOINT` | OCI Service endpoint | `https://inference.generativeai.us-chicago-1.oci.oraclecloud.com` | N/A |
| `OCI_COMPARTMENT_ID` | OCI Compartment ID | NULL | N/A |
| `OCI_AUTH_TYPE` | OCI Authorization type | `API KEY` | N/A |
| `OCI_CONFIG_PROFILE` | OCI Config profile | `DEFAULT` | N/A |

## S3 vars (used for RAG vector store replication)
| Name | **Description** | **Default** | **Available values** |
|----------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------|----------------------|
| `SKYNET_S3_ACCESS_KEY` | S3 access key | NULL | N/A |
| `SKYNET_S3_BUCKET` | S3 bucket | NULL | N/A |
| `SKYNET_S3_ENDPOINT` | S3 endpoint | NULL | N/A |
| `SKYNET_S3_REGION` | S3 region | NULL | N/A |
| `SKYNET_S3_SECRET_KEY` | S3 secret key | NULL | N/A |

## Streaming Whisper Module Environment Variables

19 changes: 10 additions & 9 deletions docs/summaries_module.md
@@ -19,22 +19,23 @@ Extracts summaries and action items from a given text. The service can be deploy

All of the configuration is done via env vars. Check the [Skynet Environment Variables](env_vars.md) page for a list of values.

## First run

```bash
# disable authorization (for testing)
export BYPASS_AUTHORIZATION=1

# start Redis
docker run -d --rm -p 6379:6379 redis

# If using vLLM (running on NVIDIA GPU)
export LLAMA_PATH="$HOME/models/Llama-3.1-8B-Instruct"
poetry install --with vllm

# If using Ollama, make sure it is installed and running,
# then download the preferred llama model
ollama pull llama3.1
export LLAMA_PATH="llama3.1"
poetry install

./run.sh
```

3 changes: 1 addition & 2 deletions skynet/env.py
@@ -41,11 +41,10 @@ def tobool(val: str | None):
llama_n_ctx = int(os.environ.get('LLAMA_N_CTX', 128000))

embeddings_model_path = os.environ.get('EMBEDDINGS_MODEL_PATH', 'nomic-ai/nomic-embed-text-v1.5')
embeddings_model_n_ctx = int(os.environ.get('EMBEDDINGS_MODEL_N_CTX', 8192))

# azure openai api
# latest ga version https://learn.microsoft.com/en-us/azure/ai-services/openai/api-version-deprecation#latest-ga-api-release
azure_openai_api_version = os.environ.get('AZURE_OPENAI_API_VERSION', '2024-10-21')

# openai api
openai_api_port = 8003
