epic: Jan integrates Cortex.cpp #3825

Closed
1 of 2 tasks
Tracked by #3824
0xSage opened this issue Oct 17, 2024 · 1 comment
Assignees
Labels
category: providers (Local & remote inference providers), P1: important (Important feature / fix), type: epic (A major feature or initiative)
Milestone

Comments

0xSage (Contributor) commented Oct 17, 2024

Goal

Tasklist

Previous Issues

@0xSage 0xSage added type: epic A major feature or initiative category: local providers labels Oct 17, 2024
@0xSage 0xSage added category: providers Local & remote inference providers and removed category: local providers labels Oct 17, 2024
@dan-homebrew dan-homebrew changed the title epic: Jan Integrates Cortex.cpp as Provider epic: Provider Extension - Cortex.cpp Oct 21, 2024
louis-jan (Contributor) commented Oct 24, 2024

Implementation Specs

Migration Path (see the sketch after this list):

  1. App 0.5.8 opens.
  2. Return the model list from cache (for users coming from 0.5.7) -> functions normally.
  3. Scan JSON models (legacy logic - fresh install or older versions) -> functions normally.
  4. In the background, the app attempts to import models into cortex.cpp and merges them with the legacy downloaded models (covering models that failed to import).
  5. The app combines the models returned by cortex.cpp with the legacy JSON models. On duplicate IDs, the cortex.cpp models are prioritized (these are the models that were imported successfully).
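
A minimal sketch of the merge in step 5, assuming a hypothetical ModelEntry record keyed by id (the shape and field names are illustrative, not Jan's actual types); the cortex.cpp entry wins on an ID collision:

```typescript
// Hypothetical shape of a model entry; field names are illustrative only.
interface ModelEntry {
  id: string
  source: 'cortex' | 'legacy-json'
  path: string
}

// Combine models reported by cortex.cpp with legacy JSON models.
// On duplicate IDs the cortex.cpp entry is kept (it was imported successfully).
function mergeModelLists(cortexModels: ModelEntry[], legacyModels: ModelEntry[]): ModelEntry[] {
  const byId = new Map<string, ModelEntry>()
  // Insert legacy models first ...
  for (const model of legacyModels) byId.set(model.id, model)
  // ... then let cortex.cpp models overwrite entries with the same ID.
  for (const model of cortexModels) byId.set(model.id, model)
  return [...byId.values()]
}
```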

Changes

Naming convention

  • inference-nitro-extension is renamed to inference-cortex-extension.
  • cortex.cpp binaries keep the same names as the engine releases.
  • Pre-package everything, including CUDA dependencies (dll, so), so users don't have to install them separately.
  • Support noavx-cuda binaries as a fallback (see the sketch below).
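
A rough sketch of how that fallback could be chosen at runtime. The binary names and the CPU/GPU feature checks are assumptions for illustration, not the shipped selection logic:

```typescript
// Hypothetical engine-binary picker: prefer an AVX2 + CUDA build, fall back to
// noavx-cuda when the CPU lacks AVX2, and to a CPU-only build when there is no GPU.
// Binary names are illustrative and only assumed to mirror engine release names.
function pickEngineBinary(hasAvx2: boolean, hasCuda: boolean): string {
  if (hasCuda) {
    return hasAvx2 ? 'cortex.llamacpp-avx2-cuda' : 'cortex.llamacpp-noavx-cuda'
  }
  return hasAvx2 ? 'cortex.llamacpp-avx2' : 'cortex.llamacpp-noavx'
}
```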

Simplified

  • Deprecate ModelFile. It's no longer relevant: providers now define models, so each provider should manage how to run them itself.
  • Remove the install-CUDA-toolkit UX; everything should be ready right after installation.

Downloader

The app proxies downloads to cortex.cpp or to its own downloader, depending on cortex.cpp's model support capability. A sketch of that routing follows.
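
A minimal sketch of the routing decision, assuming a hypothetical supportsModel capability check and pull/download helpers (these names are placeholders, not actual cortex.cpp or Jan APIs):

```typescript
// Hypothetical downloader facade: route through cortex.cpp when it can handle the
// model, otherwise fall back to the app's legacy downloader (e.g. tensorrt-llm, clip).
async function downloadModel(
  modelId: string,
  cortex: { supportsModel(id: string): Promise<boolean>; pull(id: string): Promise<void> },
  appDownloader: { download(id: string): Promise<void> },
): Promise<void> {
  if (await cortex.supportsModel(modelId)) {
    await cortex.pull(modelId) // cortex.cpp-supported models
  } else {
    await appDownloader.download(modelId) // legacy path
  }
}
```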

Model Hub

  • The app allows extensions to register models available for download in memory. After downloading, the models have their YAML or JSON metadata persisted along with the model files.
  • The app prioritizes model hub decoration (the previous JSON metadata) over cortex.cpp metadata (such as name, size, tags); see the sketch below.
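
A sketch of the decoration merge, assuming plain metadata objects; hub-provided fields win whenever both sides define them (the fields shown are just the ones mentioned above):

```typescript
// Hypothetical metadata shape, limited to the fields called out in the spec.
interface ModelMetadata {
  name?: string
  size?: number
  tags?: string[]
}

// Prefer hub decoration (previous JSON metadata) over what cortex.cpp reports;
// cortex.cpp values only fill in fields the hub does not define.
function decorateModel(hub: ModelMetadata, cortex: ModelMetadata): ModelMetadata {
  return { ...cortex, ...hub }
}
```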

Observability

  • cortex-extension should watch the cortex.cpp server upon launch, ensuring that the cortex process runs alongside the application.
  • All requests are queued and run once the server comes online, so the UX remains the same and no race between asynchronous requests and server startup is introduced. E.g. a model import or start should not fail because the server is not online in time. A sketch of such a queue follows this list.
  • There is no longer an attempt to kill the cortex process on every model start; it is just a model stop and start, so it does not block other API requests.
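
A minimal sketch of the queue-until-healthy idea. The /healthz path, the polling interval, and the timeout are assumptions; the actual cortex.cpp health route and the extension's queueing code may differ:

```typescript
// Queue cortex.cpp API calls until the server reports healthy, so requests issued
// right after launch (e.g. model import or model start) do not fail on a cold server.
class CortexRequestQueue {
  private pending: Promise<unknown> = Promise.resolve()

  constructor(private baseUrl: string) {}

  // Poll a health endpoint (path assumed) until it answers, or give up after a timeout.
  private async waitUntilOnline(timeoutMs = 30_000): Promise<void> {
    const deadline = Date.now() + timeoutMs
    while (Date.now() < deadline) {
      try {
        const res = await fetch(`${this.baseUrl}/healthz`)
        if (res.ok) return
      } catch {
        // Server not up yet; fall through and retry.
      }
      await new Promise((resolve) => setTimeout(resolve, 500))
    }
    throw new Error('cortex.cpp server did not come online in time')
  }

  // Serialize requests behind the health check so callers keep the same UX.
  enqueue<T>(request: () => Promise<T>): Promise<T> {
    const run = this.pending.then(() => this.waitUntilOnline()).then(request)
    this.pending = run.catch(() => undefined) // keep the chain alive after failures
    return run
  }
}
```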

Goals

  • When updating from an older version to this version, models are imported and run normally. Models that are not imported can still run, since we attempt a preflight check before running.
  • Users can download models with the app proxying to cortex.cpp or using the app's own downloader, depending on cortex.cpp's model support capability.

Subtasks

  • Pull latest cortex.cpp and engines to package
  • Keep binaries with the same name as their release name
  • Support noavx-cuda binaries as a fallback
  • Rename nitro-extension to cortex-extension
  • Deprecate ModelFile
  • Pre-package Cuda dependencies
  • Registered models should be persisted in memory.
  • App manages model downloads using the cortex.cpp downloader and its legacy downloader (tensorrt-llm and clip models)
  • App gets persisted models from cortex.cpp and scans JSON models itself (legacy)
  • App prioritizes model decoration metadata from the Hub over cortex.cpp
  • cortex-extension watches cortex.cpp server on launch
  • Model-extension queues cortex.cpp API requests with health checks so requests do not fail due to server uptime
  • cortex.cpp supports legacy model load parameters (NGL, context-length, cache enable, clip mmproj)
  • App shares CUDA dependencies with extensions (asar.unpackaged) to improve load time and reduce app size when multiple extensions share the same artifacts (tensorrt-llm and llama-cpp)

@janhq/jan @janhq/cortex

@louis-jan louis-jan added this to the v0.5.8 milestone Oct 25, 2024
@dan-homebrew dan-homebrew changed the title epic: Provider Extension - Cortex.cpp epic: Jan integrates Cortex.cpp Oct 29, 2024
@imtuyethan imtuyethan added the P1: important Important feature / fix label Oct 31, 2024