From c48c7ee4129db610f9dcdb481a83e2bb16c375dd Mon Sep 17 00:00:00 2001
From: Gabrielle Ong
Date: Wed, 6 Nov 2024 21:34:49 +0800
Subject: [PATCH 1/3] v1.0.1 QA template (22 Oct 2024)

---
 .github/ISSUE_TEMPLATE/QA_checklist.md | 153 +++++++++++++++++++++++++
 1 file changed, 153 insertions(+)
 create mode 100644 .github/ISSUE_TEMPLATE/QA_checklist.md

diff --git a/.github/ISSUE_TEMPLATE/QA_checklist.md b/.github/ISSUE_TEMPLATE/QA_checklist.md
new file mode 100644
index 000000000..d990398fe
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/QA_checklist.md
@@ -0,0 +1,153 @@
+---
+name: QA Checklist
+about: QA Checklist
+title: 'QA: [VERSION]'
+labels: 'type: QA checklist'
+assignees: ''
+---
+**QA details:**

Version: `v1.0.x-xxx`

OS (select one)
- [ ] Windows 11 (online & offline)
- [ ] Ubuntu 24, 22 (online & offline)
- [ ] Mac Silicon OS 14/15 (online & offline)
- [ ] Mac Intel (online & offline)

--------

# 1. Manual QA
## Installation
- [ ] it should install with network installer
- [ ] it should install with local installer
- [ ] it should install 2 binaries (cortex and cortex-server)
- [ ] it should install with correct folder permissions
- [ ] it should install with folders: /engines /logs (no /models folder until model pull)


## Data/Folder structures
- [ ] cortex.so models are stored in `cortex.so/model_name/variants/`, with .gguf and model.yml file
- [ ] huggingface models are stored in `huggingface.co/author/model_name` with .gguf and model.yml file
- [ ] downloaded models are saved in cortex.db (view via SQL)
- [ ] [to add] tests for copying models data folder & relative paths


## Cortex Update
- [ ] cortex -v should output the current version and check for updates
- [ ] cortex update replaces the app, installer, uninstaller and binary file (without installing cortex.llamacpp)
- [ ] cortex update should update from ~3-5 versions ago to latest (+3 to 5 bump)
- [ ] cortex update should update from the previous version to latest (+1 
bump)
- [ ] cortex update should update from previous stable version to latest (stable checking)
- [ ] it should gracefully update when server is actively running

## Overall / App Shell
- [ ] cortex returns helpful text in a timely* way
- [ ] `cortex` or `cortex -h` displays help commands
- [ ] CLI commands should start the API server, if not running [WIP `cortex pull`, `cortex engines install`]
- [ ] it should correctly log to cortex-cli.log and cortex.log
- [ ] There should be no stdout from an inactive shell session

## Engines
- [ ] llama.cpp should be installed by default
- [ ] it should run gguf models on llamacpp
- [ ] it should install engines
- [ ] it should list engines (Compatible, Ready, Not yet installed)
- [ ] it should get engines
- [ ] it should uninstall engines
- [ ] it should gracefully continue engine installation if interrupted halfway (partial download)
- [ ] it should gracefully handle when users try to CRUD incompatible engines (No variant found for xxx)
- [ ] it should run trtllm models on trt-llm [WIP, not tested]
- [ ] it should handle engine variants [WIP, not tested]
- [ ] it should update engines versions [WIP, not tested]

## Server
- [ ] `cortex start` should start server and output API documentation page
- [ ] users can see API documentation page
- [ ] `cortex stop` should stop server
- [ ] it should correctly log to cortex logs
- [ ] `cortex ps` should return server status and running models (or no model loaded)

## Model Pulling
- [ ] Pulling a model should pull .gguf and model.yml file
- [ ] Model download progress should appear (with accurate %, total time, download size, speed)
### cortex.so
- [ ] it should pull by built-in model_ID
- [ ] pull by model_ID should recommend default variant at the top (set in HF model.yml)
- [ ] it should pull by built-in model_id:variant
### huggingface.co
- [ ] it should pull by HF repo/model ID
- [ ] it should pull by full HF url (ending in .gguf)
### Interrupted Download
- [ ] it should allow the user to interrupt / stop a download
- [ ] pulling again after interruption should accurately calculate the remainder of the model file size needed to be downloaded (`Found unfinished download! Additional XGB needs to be downloaded`)
- [ ] it should allow the user to continue downloading the remainder after interruption

## Model Management
- [ ] it should list downloaded models
- [ ] it should get info of a local model
- [ ] it should update models
- [ ] it should delete a model
- [ ] it should import models with model_id and model_path
- [ ] [To deprecate] it should alias models (deprecate once `cortex run` with regex is implemented)

## Model Running
- [ ] `cortex run ` - if no local models detected, shows `pull` model menu
- [ ] `cortex run` - if local model detected, runs the local model
- [ ] `cortex run` - if multiple local models detected, shows list of local models for users to select
- [ ] `cortex run ` should gracefully return `Model not found!`
- [ ] run should autostart server
- [ ] `cortex run ` starts interactive chat (by default)
- [ ] `cortex run -d` runs in detached mode
- [ ] `cortex models start `
- [ ] terminating stdin or `exit()` should exit interactive chat

## Hardware Detection / Acceleration [WIP]
- [ ] it should auto offload max ngl
- [ ] it should correctly detect available GPUs
- [ ] it should gracefully detect missing dependencies/drivers
CPU Extension (e.g. AVX-2, noAVX, AVX-512)
GPU Acceleration (e.g. 
CUDA11, CUDA12, Vulkan, SYCL, etc)

## Uninstallation / Reinstallation
- [ ] it should uninstall 2 binaries (cortex and cortex-server)
- [ ] it should uninstall with 2 options to delete or not delete data folder
- [ ] it should gracefully uninstall when server is still running
- [ ] uninstalling should not leave any dangling files
- [ ] uninstalling should not leave any dangling processes
- [ ] it should reinstall without conflicts with existing cortex data folders

--
# 2. API QA

## Overall API
- [ ] API page is updated at localhost:port endpoint (upon `cortex start`)
- [ ] OpenAI compatibility for the endpoints below
- [ ] https://cortex.so/api-reference is updated

## Endpoints
### Chat Completions
- [ ] POST `v1/chat/completions`

### Engines
- [ ] GET `/v1/engines`
- [ ] DELETE `/v1/engines/install/{name}`
- [ ] POST `/v1/engines/install/{name}`
- [ ] GET `/v1/engines/{name}`

### Models
- [ ] GET `/v1/models` lists models
- [ ] POST `/v1/models/pull` starts download (websockets)
- [ ] `websockets /events` emitted when model pull starts
- [ ] DELETE `/v1/models/pull` stops download (websockets)
- [ ] `websockets /events` stopped when model pull stops
- [ ] POST `/v1/models/start` starts model
- [ ] POST `/v1/models/stop` stops model
- [ ] DELETE `/v1/models/{id}` deletes model
- [ ] GET `/v1/models/{id}` gets model
- [ ] PATCH `/v1/models/{model}` updates model.yaml params

----
#### Test list for reference:
- #1357 e2e tests for APIs in CI
- #1147, #1225 for starting QA list
\ No newline at end of file

From 312e206b395b1ec25d23cfc1c639f5f793d797df Mon Sep 17 00:00:00 2001
From: Gabrielle Ong
Date: Wed, 6 Nov 2024 21:35:55 +0800
Subject: [PATCH 2/3] v1.0.2 QA checklist (6 Nov)

---
 .github/ISSUE_TEMPLATE/QA_checklist.md | 116 +++++++++++++++++++++----------
 1 file changed, 68 insertions(+), 48 deletions(-)

diff --git a/.github/ISSUE_TEMPLATE/QA_checklist.md b/.github/ISSUE_TEMPLATE/QA_checklist.md
index d990398fe..a0c68eb38 
100644
--- a/.github/ISSUE_TEMPLATE/QA_checklist.md
+++ b/.github/ISSUE_TEMPLATE/QA_checklist.md
@@ -17,43 +17,49 @@ OS (select one)
--------
-# 1. Manual QA
+# 1. Manual QA (CLI)
## Installation
+- [ ] it should install with local installer (default; no internet required during installation, all dependencies bundled)
- [ ] it should install with network installer
-- [ ] it should install with local installer
-- [ ] it should install 2 binaries (cortex and cortex-server)
+- [ ] it should install 2 binaries (cortex and cortex-server) [mac: binaries in `/usr/local/bin`]
- [ ] it should install with correct folder permissions
- [ ] it should install with folders: /engines /logs (no /models folder until model pull)
-
+- [ ] it should install with Docker image https://cortex.so/docs/installation/docker/
## Data/Folder structures
- [ ] cortex.so models are stored in `cortex.so/model_name/variants/`, with .gguf and model.yml file
- [ ] huggingface models are stored in `huggingface.co/author/model_name` with .gguf and model.yml file
-- [ ] downloaded models are saved in cortex.db (view via SQL)
-- [ ] [to add] tests for copying models data folder & relative paths
-
+- [ ] downloaded models are saved in cortex.db with the right fields: `model`, `author_repo_id`, `branch_name`, `path_to_model_yaml` (view via SQL)
## Cortex Update
- [ ] cortex -v should output the current version and check for updates
- [ ] cortex update replaces the app, installer, uninstaller and binary file (without installing cortex.llamacpp)
-- [ ] cortex update should update from ~3-5 versions ago to latest (+3 to 5 bump)
-- [ ] cortex update should update from the previous version to latest (+1 bump)
-- [ ] cortex update should update from previous stable version to latest (stable checking)
-- [ ] it should gracefully update when server is actively running
+- [ ] `cortex update` should update from ~3-5 versions ago to latest (+3 to 5 bump)
+- [ ] `cortex update` should update from the previous version 
to latest (+1 bump)
+- [ ] `cortex update -v 1.x.x-xxx` should update from the previous version to specified version
+- [ ] `cortex update` should update from previous stable version to latest
+- [ ] it should gracefully update when server is actively running
## Overall / App Shell
-- [ ] cortex returns helpful text in a timely* way
+- [ ] cortex returns helpful text in a timely* way (< 5s)
- [ ] `cortex` or `cortex -h` displays help commands
-- [ ] CLI commands should start the API server, if not running [WIP `cortex pull`, `cortex engines install`]
+- [ ] CLI commands should start the API server, if not running [except `cortex pull`, `cortex engines install`]
- [ ] it should correctly log to cortex-cli.log and cortex.log
- [ ] There should be no stdout from an inactive shell session
## Engines
- [ ] llama.cpp should be installed by default
-- [ ] it should run gguf models on llamacpp
-- [ ] it should install engines
-- [ ] it should list engines (Compatible, Ready, Not yet installed)
+- [ ] it should run gguf models on llamacpp
+- [ ] it should list engines
- [ ] it should get engines
+- [ ] it should install engines (latest version if not specified)
+- [ ] it should install engines (with specified variant and version)
+- [ ] it should get default engine
+- [ ] it should set default engine (with specified variant/version)
+- [ ] it should load engine
+- [ ] it should unload engine
+- [ ] it should update engine (to latest version)
+- [ ] it should update engine (to specified version)
- [ ] it should uninstall engines
- [ ] it should gracefully continue engine installation if interrupted halfway (partial download)
- [ ] it should gracefully handle when users try to CRUD incompatible engines (No variant found for xxx)
@@ -62,15 +68,17 @@ OS (select one)
- [ ] it should update engines versions [WIP, not tested]
## Server
-- [ ] `cortex start` should start server and output API documentation page
-- [ ] users can see API documentation page
-- [ ] `cortex stop` should stop server
-- [ ] it should correctly log 
to cortex logs
+- [ ] `cortex start` should start server and output localhost URL & port number
+- [ ] users can access API Swagger documentation page at localhost URL & port number
+- [ ] `cortex start` can be configured with parameters (port, [logLevel [WIP]](https://github.com/janhq/cortex.cpp/pull/1636)) https://cortex.so/docs/cli/start/
+- [ ] it should correctly log to cortex logs (logs/cortex.log, logs/cortex-cli.log)
- [ ] `cortex ps` should return server status and running models (or no model loaded)
+- [ ] `cortex stop` should stop server
## Model Pulling
- [ ] Pulling a model should pull .gguf and model.yml file
-- [ ] Model download progress should appear (with accurate %, total time, download size, speed)
+- [ ] Model download progress should appear as download bars for each file
+- [ ] Model download progress should be accurate (%, total time, download size, speed)
### cortex.so
- [ ] it should pull by built-in model_ID
- [ ] pull by model_ID should recommend default variant at the top (set in HF model.yml)
@@ -85,16 +93,15 @@ OS (select one)
## Model Management
- [ ] it should list downloaded models
-- [ ] it should get info of a local model
-- [ ] it should update models
+- [ ] it should get a local model
+- [ ] it should update model parameters in model.yaml
- [ ] it should delete a model
- [ ] it should import models with model_id and model_path
-- [ ] [To deprecate] it should alias models (deprecate once `cortex run` with regex is implemented)
## Model Running
- [ ] `cortex run ` - if no local models detected, shows `pull` model menu
- [ ] `cortex run` - if local model detected, runs the local model
-- [ ] `cortex run` - if multiple local models detected, shows list of local models for users to select
+- [ ] `cortex run` - if multiple local models detected, shows list of local models (from multiple model sources e.g. cortexso, HF authors) for users to select (via regex search)
- [ ] `cortex run ` should gracefully return `Model not found!`
- [ ] 
run should autostart server
- [ ] `cortex run ` starts interactive chat (by default)
@@ -102,7 +109,7 @@ OS (select one)
- [ ] `cortex models start `
- [ ] terminating stdin or `exit()` should exit interactive chat
-## Hardware Detection / Acceleration [WIP]
+## Hardware Detection / Acceleration [WIP, no need to QA]
- [ ] it should auto offload max ngl
- [ ] it should correctly detect available GPUs
- [ ] it should gracefully detect missing dependencies/drivers
@@ -120,34 +127,47 @@ GPU Acceleration (e.g. CUDA11, CUDA12, Vulkan, SYCL, etc)
--
# 2. API QA
-## Overall API
-- [ ] API page is updated at localhost:port endpoint (upon `cortex start`)
-- [ ] OpenAI compatibility for below
+## Checklist for each endpoint
+- [ ] Upon `cortex start`, API page is displayed at localhost:port endpoint
+- [ ] Endpoints should support the parameters stated in API reference (towards OpenAI Compatibility)
- [ ] https://cortex.so/api-reference is updated
## Endpoints
### Chat Completions
- [ ] POST `v1/chat/completions`
+- [ ] Cortex supports Function Calling #295
### Engines
-- [ ] GET `/v1/engines`
-- [ ] DELETE `/v1/engines/install/{name}`
-- [ ] POST `/v1/engines/install/{name}`
-- [ ] GET `/v1/engines/{name}`
-
-### Models
-- [ ] GET `/v1/models` lists models
-- [ ] POST `/v1/models/pull` starts download (websockets)
-- [ ] `websockets /events` emitted when model pull starts
-- [ ] DELETE `/v1/models/pull` stops download (websockets)
-- [ ] `websockets /events` stopped when model pull stops
-- [ ] POST `/v1/models/start` starts model
-- [ ] POST `/v1/models/stop` stops model
-- [ ] DELETE `/v1/models/{id}` deletes model
-- [ ] GET `/v1/models/{id}` gets model
-- [ ] PATCH `/v1/models/{model}` updates model.yaml params
-
-----
-#### Test list for reference:
+- [ ] List engines: GET `/v1/engines`
+- [ ] Get engine: GET `/v1/engines/{name}`
+- [ ] Install engine: POST `/v1/engines/install/{name}`
+- [ ] Get default engine variant/version: GET `v1/engines/{name}/default`
+- [ ] Set 
default engine variant/version: POST `v1/engines/{name}/default`
+- [ ] Load engine: POST `v1/engines/{name}/load`
+- [ ] Unload engine: DELETE `v1/engines/{name}/load`
+- [ ] Update engine: POST `v1/engines/{name}/update`
+- [ ] Uninstall engine: DELETE `/v1/engines/install/{name}`
+
+### Pulling Models
+- [ ] Pull model: POST `/v1/models/pull` starts download (websockets)
+- [ ] Pull model: `websockets /events` emitted
+- [ ] Stop model download: DELETE `/v1/models/pull` (websockets)
+- [ ] Stop model download: `websockets /events` stopped
+- [ ] Import model: POST `v1/models/import`
+
+### Running Models
+- [ ] List models: GET `v1/models`
+- [ ] Start model: POST `/v1/models/start`
+- [ ] Stop model: POST `/v1/models/stop`
+- [ ] Get model: GET `/v1/models/{id}`
+- [ ] Delete model: DELETE `/v1/models/{id}`
+- [ ] Update model: PATCH `/v1/models/{model}` updates model.yaml params
+
+## Server
+- [ ] CORS [WIP]
+- [ ] Health check: GET `/healthz`
+- [ ] Terminate server: DELETE `/processManager/destroy`
+--------
Test list for reference:
- #1357 e2e tests for APIs in CI
- #1147, #1225 for starting QA list
\ No newline at end of file

From 9e0834d6a9a16d82c9e72bc66820c0c190d1dfe0 Mon Sep 17 00:00:00 2001
From: Gabrielle Ong
Date: Wed, 6 Nov 2024 21:36:39 +0800
Subject: [PATCH 3/3] remove validation for bug report additional info

---
 .github/ISSUE_TEMPLATE/bug_report.yml | 2 --
 1 file changed, 2 deletions(-)

diff --git a/.github/ISSUE_TEMPLATE/bug_report.yml b/.github/ISSUE_TEMPLATE/bug_report.yml
index 6d4cae367..6684aa985 100644
--- a/.github/ISSUE_TEMPLATE/bug_report.yml
+++ b/.github/ISSUE_TEMPLATE/bug_report.yml
@@ -51,8 +51,6 @@ body:
         - label: cortex.onnx (NPUs, DirectML)
   - type: input
-    validations:
-      required: true
     attributes:
       label: "Hardware Specs eg OS version, GPU"
       description: