Merge branch 'dev' of github.com:janhq/cortex.cpp into dev

janhq · Nov 4, 2024 · 5fde673 · 5fde673
2 parents f6978cd + 76d653f
commit 5fde673
Show file tree

Hide file tree

Showing 43 changed files with 1,811 additions and 562 deletions.
diff --git a/docs/docs/architecture/cortex-db.md b/docs/docs/architecture/cortex-db.md
@@ -0,0 +1,3 @@
+---
+title: cortex.db
+---
diff --git a/docs/docs/basic-usage/cortexrc.mdx → docs/docs/architecture/cortexrc.mdx b/docs/docs/basic-usage/cortexrc.mdx → docs/docs/architecture/cortexrc.mdx
diff --git a/docs/docs/data-folder.mdx → docs/docs/architecture/data-folder.mdx b/docs/docs/data-folder.mdx → docs/docs/architecture/data-folder.mdx
@@ -132,7 +132,7 @@ The main directory that stores all Cortex-related files, located in the user's h
 #### `models/`
 Contains the AI models used by Cortex for processing and generating responses.
 :::info
-For more information regarding the `model.list` and `model.yaml`, please see [here](/docs/model-yaml).
+For more information regarding the `model.list` and `model.yaml`, please see [here](/docs/capabilities/models/model-yaml).
 :::
 #### `logs/`
 Stores log files that are essential for troubleshooting and monitoring the performance of the Cortex.cpp API server and CLI.

diff --git a/docs/docs/assistants/index.md b/docs/docs/assistants/index.md
@@ -0,0 +1,3 @@
+---
+title: Assistants
+---
diff --git a/docs/docs/assistants/tools/index.md b/docs/docs/assistants/tools/index.md
@@ -0,0 +1,3 @@
+---
+title: Tools 
+---
diff --git a/docs/docs/basic-usage/server.mdx → docs/docs/basic-usage/api-server.mdx b/docs/docs/basic-usage/server.mdx → docs/docs/basic-usage/api-server.mdx
@@ -1,16 +1,11 @@
 ---
-title: API
+title: API Server
 description: Cortex Server Overview.
-slug: "server"
 ---
 
 import Tabs from "@theme/Tabs";
 import TabItem from "@theme/TabItem";
 
-:::warning
-🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
-:::
-
 Cortex has an [API server](https://cortex.so/api-reference) that runs at `localhost:39281`.
 
 

diff --git a/docs/docs/basic-usage/command-line.md b/docs/docs/basic-usage/command-line.md
diff --git a/...ocs/basic-usage/integration/js-library.md → docs/docs/basic-usage/cortex-js.md b/...ocs/basic-usage/integration/js-library.md → docs/docs/basic-usage/cortex-js.md
@@ -1,9 +1,18 @@
 ---
 title: cortex.js
-description: How to integrate cortex.js with a Typescript application.
-slug: "ts-library"
+description: How to use the Cortex.js Library
 ---
 
+[Cortex.js](https://github.com/janhq/cortex.js) is a Typescript client library that can be used to interact with the Cortex API. 
+
+This is still a work in progress, and we will let the community know once a stable version is available. 
+
+:::warning
+🚧 Cortex.js is currently under development, and this page is a stub for future development. 
+:::
+
+
+<!-- 
 :::warning
 🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
 :::
@@ -61,4 +70,4 @@ async function inference() {
 }
 
 inference();
-```
+``` -->
diff --git a/...ocs/basic-usage/integration/py-library.md → docs/docs/basic-usage/cortex-py.md b/...ocs/basic-usage/integration/py-library.md → docs/docs/basic-usage/cortex-py.md
@@ -1,9 +1,15 @@
 ---
 title: cortex.py
 description: How to integrate cortex.py with a Python application.
-slug: "py-library"
 ---
 
+
+:::warning
+🚧 Cortex.py is currently under development, and this page is a stub for future development. 
+:::
+
+
+<!-- 
 :::warning
 🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
 :::
@@ -51,4 +57,4 @@ completion = client.chat.completions.create(
     ],
 )
 print(completion.choices[0].message.content)
-```
+``` -->
diff --git a/docs/docs/basic-usage/overview.mdx → docs/docs/basic-usage/index.mdx b/docs/docs/basic-usage/overview.mdx → docs/docs/basic-usage/index.mdx
@@ -1,6 +1,6 @@
 ---
 title: Overview
-description: Overview.
+description: Cortex Overview
 slug: "basic-usage"
 ---
 

diff --git a/docs/docs/built-in-models.mdx b/docs/docs/built-in-models.mdx
diff --git a/docs/docs/capabilities/audio-generation.md b/docs/docs/capabilities/audio-generation.md
@@ -0,0 +1,3 @@
+---
+unlisted: true
+---
diff --git a/docs/docs/capabilities/embeddings.md b/docs/docs/capabilities/embeddings.md
@@ -0,0 +1,7 @@
+---
+title: Embeddings
+---
+
+:::info
+🚧 Cortex is currently under development, and this page is a stub for future development. 
+:::
diff --git a/docs/docs/capabilities/hardware/index.md b/docs/docs/capabilities/hardware/index.md
@@ -0,0 +1,39 @@
+---
+title: Hardware Awareness
+draft: True
+---
+
+# Hardware Awareness
+
+Cortex is designed to be hardware aware, meaning it can detect your hardware configuration and automatically set parameters to optimize compatibility and performance, and avoid hardware-related errors.
+
+## Hardware Optimization
+
+Cortex's Hardware awareness allows it to do the following: 
+
+- Context Length Optimization: Cortex maximizes the context length allowed by your hardware, ensuring that you can work with larger datasets and more complex models without performance degradation.
+- Engine Optimization: we detect your CPU and GPU, and maintain a list of optimized engines for each hardware configuration, e.g. taking advantage of AVX-2 and AVX-512 instructions on CPUs. 
+
+## Hardware Awareness
+
+- Preventing hardware-related error
+- Error Handling for Insufficient VRAM: When loading a second model, Cortex provides useful error messages if there is insufficient VRAM memory. This proactive approach helps prevent out-of-memory errors and guides users on how to resolve the issue.
+
+### Model Compatibility
+
+- Model Compatibility Detection: Cortex automatically detects your hardware configuration to determine the compatibility of different models. This ensures that the models you use are optimized for your specific hardware setup.
+- This is for the Hub, and for existing Models 
+
+## Hardware Management
+
+### Activating Specific GPUs
+
+Cortex gives you the ability to activating specific GPUs for inference, giving you fine-grained control over hardware resources. This is especially useful for multi-GPU systems. 
+- Activate GPUs: Cortex can activate and utilize GPUs to accelerate processing, ensuring that computationally intensive tasks are handled efficiently.
+You also have the option to deactivate all GPUs, to run inference on only CPU and RAM. 
+
+### Hardware Monitoring
+
+- Monitoring System Usage
+- Monitor VRAM Usage: Cortex keeps track of VRAM usage to prevent out-of-memory (OOM) errors. It ensures that VRAM is used efficiently and provides warnings when resources are running low.
+- Monitor System Resource Usage: Cortex continuously monitors the usage of system resources, including CPU, RAM, and GPUs. This helps in maintaining optimal performance and identifying potential bottlenecks.
diff --git a/docs/docs/capabilities/image-generation.md b/docs/docs/capabilities/image-generation.md
@@ -0,0 +1,3 @@
+---
+unlisted: true
+---
diff --git a/docs/docs/model-overview.mdx → docs/docs/capabilities/models/index.mdx b/docs/docs/model-overview.mdx → docs/docs/capabilities/models/index.mdx
@@ -20,7 +20,7 @@ Cortex.cpp supports three model formats:
 - TensorRT-LLM
 
 :::info
-For details on each format, see the [Model Formats](/docs/model-yaml#model-formats) page.
+For details on each format, see the [Model Formats](/docs/capabilities/models/model-yaml#model-formats) page.
 :::
 
 ## Built-in Models 
@@ -38,5 +38,5 @@ You can see our full list of Built-in Models [here](/models).
 :::
 
 ## Next steps
-- Cortex requires a `model.yaml` file to run a model. Find out more [here](/docs/model-yaml).
+- Cortex requires a `model.yaml` file to run a model. Find out more [here](/docs/capabilities/models/model-yaml).
 - Cortex supports multiple model hubs hosting built-in models. See details [here](/docs/model-sources).
diff --git a/docs/docs/model-yaml.mdx → docs/docs/capabilities/models/model-yaml.mdx b/docs/docs/model-yaml.mdx → docs/docs/capabilities/models/model-yaml.mdx
@@ -6,24 +6,14 @@ description: The model.yaml
 import Tabs from "@theme/Tabs";
 import TabItem from "@theme/TabItem";
 
-
 :::warning
 🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
 :::
 
 Cortex.cpp uses a `model.yaml` file to specify the configuration for running a model. Models can be downloaded from the Cortex Model Hub or Hugging Face repositories. Once downloaded, the model data is parsed and stored in the `models` folder.
 
-## `model.list`
-The `model.list` file acts as a registry for all model files used by Cortex.cpp. It keeps track of every downloaded and imported model by listing their details in a structured format. Each time a model is downloaded or imported, Cortex.cpp will automatically append an entry to `model.list` with the following format:
-```
-# Downloaded model
-<model-id> <author_repo-id> <branch-name> <path-to-model.yaml> <model-alias>
-
-# Imported model
-<model-id> local imported <path-to-model-id.yaml> <model-alias>
+## Structure of `model.yaml`
 
-```
-## `model.yaml` High Level Structure
 Here is an example of `model.yaml` format:
 ```yaml
 # BEGIN GENERAL METADATA
@@ -71,7 +61,7 @@ ngl: 33             # Undefined = loaded from model
 
 The `model.yaml` is composed of three high-level sections:
 
-### Cortex Meta
+### Model Metadata
 ```yaml
 model: gemma-2-9b-it-Q8_0 
 name: Llama 3.1      

diff --git a/docs/docs/model-presets.mdx → docs/docs/capabilities/models/presets.mdx b/docs/docs/model-presets.mdx → docs/docs/capabilities/models/presets.mdx
@@ -7,11 +7,12 @@ description: Model Presets
 🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
 :::
 
+<!-- 
 ## Model Presets
 
 Model presets are saved `model.yaml` files that serve as templates for pre-configured model settings. These presets are designed to ensure optimal performance with the specified engine.
 These presets are not restricted to specific models. You can apply the presets to any model or any engine runtime.
 
 :::info
 Model presets override the values of the `model.yaml`. If presets are available, Cortex uses them. Otherwise, it defaults to `model.yaml` values.
-:::
+::: -->
diff --git a/docs/docs/capabilities/moderation.md b/docs/docs/capabilities/moderation.md
@@ -0,0 +1,3 @@
+---
+unlisted: true
+---
diff --git a/docs/docs/capabilities/reasoning.md b/docs/docs/capabilities/reasoning.md
@@ -0,0 +1,3 @@
+---
+unlisted: true
+---
diff --git a/docs/docs/capabilities/speech-to-text.md b/docs/docs/capabilities/speech-to-text.md
@@ -0,0 +1,3 @@
+---
+unlisted: true
+---
diff --git a/docs/docs/capabilities/text-generation.md b/docs/docs/capabilities/text-generation.md
@@ -0,0 +1,7 @@
+---
+title: Text Generation
+---
+
+:::info
+🚧 Cortex is currently under development, and this page is a stub for future development. 
+:::
diff --git a/docs/docs/capabilities/text-to-speech.md b/docs/docs/capabilities/text-to-speech.md
@@ -0,0 +1,3 @@
+---
+unlisted: true
+---
diff --git a/docs/docs/capabilities/vision.md b/docs/docs/capabilities/vision.md
@@ -0,0 +1,3 @@
+---
+unlisted: true
+---
diff --git a/docs/docs/chat-completions.mdx b/docs/docs/chat-completions.mdx
@@ -1,7 +1,6 @@
 ---
 title: Chat Completions
-description: Chat Completions Feature.
-slug: "text-generation"
+description: Chat Completions Feature
 ---
 
 import Tabs from "@theme/Tabs";

diff --git a/docs/docs/integrate-remote-engine.mdx → docs/docs/engines/engine-extension.mdx b/docs/docs/integrate-remote-engine.mdx → docs/docs/engines/engine-extension.mdx
@@ -1,8 +1,13 @@
 ---
-title: Integrate Remote Engine
-description: How to integrate remote engine into Cortex.
+title: Building Engine Extensions
+description: Cortex supports Engine Extensions to integrate both :ocal inference engines, and Remote APIs.
 ---
 
+:::info
+🚧 Cortex is currently under development, and this page is a stub for future development. 
+:::
+
+<!-- 
 import Tabs from "@theme/Tabs";
 import TabItem from "@theme/TabItem";
 
@@ -81,4 +86,4 @@ The `transformResponse` method is used to transform the data received from the e
 **Example: Anthropic Engine**
 
 In the Anthropic Engine, the `transformResponse` method handles both stream and non-stream responses. It processes the response data and converts it into a standardized format.
-
+ -->