Support local model with inference-engine mlx #475

OKHand-Zy · 2024-11-20T06:10:04Z

Enhancement: support local and custom models #165

This PR is a modified version of my existing code, so the code quality may not be ideal. It supports running models from a local path with mlx, for both the CLI and ChatAPI. (For now, you still need to manually place the model in the ~/.cache/exo directory before use. I'm working on automating this, but I'm unfamiliar with grpc and need to research it further.)

I would greatly appreciate any suggestions on how to improve or optimize the code for better results.

Changes:

Added a "How to use local models" section to the README.
Implemented init_exo_env to configure local model cards and the local model store.
Added bypass logic for local models using if...else statements.

blindcrone · 2024-11-21T15:37:39Z

I like the idea here, but I think that rather than relying on a folder structure, this should use config files or command-line arguments to specify paths to model implementations and populate things like the model card list at runtime. I'm considering refactoring the inference engine to take model implementations by default and use the shard downloader as one of a few possible routes to get weights. I think automatically instantiating and parsing a default directory structure for this purpose creates a lot of potential for issues down the line.
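To make the suggestion above concrete, here is a minimal sketch of populating a model card list from command-line arguments at runtime. Everything named here is an assumption for illustration: the `--local-model` flag, the `NAME=PATH` syntax, and the dictionary shape of a card are hypothetical and are not taken from exo's code.

```python
import argparse
from pathlib import Path


def parse_local_models(argv):
    """Parse hypothetical --local-model NAME=PATH flags into model cards."""
    parser = argparse.ArgumentParser()
    # Hypothetical flag (not an existing exo option); may be repeated.
    parser.add_argument("--local-model", action="append", default=[],
                        metavar="NAME=PATH")
    args = parser.parse_args(argv)

    cards = {}
    for spec in args.local_model:
        name, sep, path = spec.partition("=")
        if not sep or not name or not path:
            parser.error(f"expected NAME=PATH, got {spec!r}")
        # The card layout is an assumption made for this sketch.
        cards[name] = {"path": Path(path).expanduser(), "source": "local"}
    return cards


def build_model_cards(builtin_cards, local_cards):
    """Merge built-in and local cards at runtime; local entries win on clashes."""
    merged = dict(builtin_cards)
    merged.update(local_cards)
    return merged
```

With this shape, no default directory has to be scanned: a node only knows about the local models it was explicitly told about, for example via `--local-model my-llama=/path/to/weights`.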

OKHand-Zy · 2024-11-22T01:39:57Z

@blindcrone
Recently, while implementing the HTTP functionality for the local model, I realized what you meant. I've switched to using aiohttp to establish an HTTP service on each node. When needed, I'll check which node has the necessary data and use an internal network to download it in chunks (similar to how exo does it). Afterward, I'll rely on the inference_engine in the command to use the model, instead of configuring it through a config file. I'm wondering if this aligns with your thoughts? If there's a better approach, I'm open to suggestions.
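The chunked download described above could look roughly like the following sketch. It is not the code in this PR: the transport is left abstract as `fetch_range`, a hypothetical async callable that returns the bytes for an inclusive byte range. With aiohttp, that callable would typically issue a GET with a `Range: bytes=start-end` header against the node that holds the file.

```python
import asyncio


def chunk_ranges(total_size, chunk_size):
    """Split [0, total_size) into inclusive (start, end) byte ranges."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, total_size) - 1)
        for start in range(0, total_size, chunk_size)
    ]


async def download_in_chunks(fetch_range, total_size, chunk_size, max_parallel=4):
    """Fetch a file in chunks with bounded concurrency and reassemble it.

    fetch_range(start, end) is a hypothetical coroutine supplied by the
    caller; it must return the bytes for the inclusive range [start, end].
    """
    semaphore = asyncio.Semaphore(max_parallel)

    async def fetch(start, end):
        async with semaphore:
            return await fetch_range(start, end)

    # asyncio.gather returns results in the order the awaitables were passed,
    # so the parts can be joined directly.
    parts = await asyncio.gather(
        *(fetch(start, end) for start, end in chunk_ranges(total_size, chunk_size))
    )
    return b"".join(parts)
```

Keeping the range planning separate from the transport makes it easy to test without a network and to swap the HTTP client later.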

OKHand-Zy · 2024-12-24T03:22:18Z

New Updates:

Automatic broadcasting of local model downloads is now supported.
Added --stored-model-ip and --stored-model-port arguments.
CLI and ChatAPI now support running local models.
Added a 'storedhost' directory within the download directory for local model code.
Added instructions on how to run local models to the README.

Note: This feature only supports the --inference-engine mlx option.
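As a rough illustration of how the two new arguments might be consumed, here is a sketch using argparse. Only the flag names `--stored-model-ip`, `--stored-model-port`, and `--inference-engine` come from this PR. Their types, defaults, and the idea that they identify the node serving the stored model are assumptions made for this example, and `stored_model_url` is a hypothetical helper.

```python
import argparse


def build_parser():
    """Build a parser with the flags named in this update (semantics assumed)."""
    parser = argparse.ArgumentParser(description="Sketch of the stored-model options")
    parser.add_argument("--inference-engine", default=None)
    # Assumed meaning: address of the node that serves the locally stored model.
    parser.add_argument("--stored-model-ip", default=None)
    parser.add_argument("--stored-model-port", type=int, default=None)
    return parser


def stored_model_url(args):
    """Return a base URL for the stored-model host, or None if not applicable."""
    # The update notes that only the mlx inference engine is supported.
    if args.inference_engine != "mlx":
        return None
    if args.stored_model_ip is None or args.stored_model_port is None:
        return None
    return f"http://{args.stored_model_ip}:{args.stored_model_port}"
```

For example, parsing `--inference-engine mlx --stored-model-ip 192.168.1.10 --stored-model-port 8080` and passing the result to `stored_model_url` gives a base URL for that host, while any other engine gives `None`.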

OKHand-Zy and others added 12 commits November 8, 2024 09:29

add inference mlx run local model in single node

89665df

Merge exo f1eec9f commit version

b2bcc12

filter merge erro

0b87eb9

filter f1eec9f model change

e2f0723

futuer:(i-e:mlx)suppoert local model and HF model terminal complet

2a2e3b2

futur:support cli and chatapi local model complet

d8bbb2b

filter run_model_cli

9ab3513

add init_exo_function (helpers.py)

a9c345a

Merge branch 'main' 1fa42f3 into support-local-model

1ae2648

filter read me

cdba915

filter cli local model error

0b06fe1

filter some mark

438eae4

OKHand-Zy force-pushed the support-local-model branch from f6fc665 to 438eae4 Compare November 22, 2024 01:12

Merge exo 93d38e2 commits

32574b9

OKHand-Zy added 2 commits November 22, 2024 10:10

filter name miss

535cb44

Merge exo commit 7013041 fix 'CLI reply error'

1065242

OKHand-Zy marked this pull request as draft November 27, 2024 15:35

OKHand-Zy added 4 commits November 28, 2024 11:12

remove:add lcoal model arg

1e5429d

filter: other dir no config error

880682a

merge: new image

c4e18ae

Merge exo 17411df branch

daeb1eb

OKHand-Zy marked this pull request as ready for review December 6, 2024 07:49

OKHand-Zy added 6 commits December 12, 2024 10:33

Merge exo d4cc2cf commit

dd49f08

filter: chat-api modelpool for Local model

cb270ed

add: Local Model Api

316e58b

filter: file&code to other palce

961c0dc

future: enhance lh_helpers but download_file function not complete

85d41f0

filter: add args setting sotre model ip and port

aca7eba

OKHand-Zy added 10 commits December 18, 2024 16:49

future:add other node update local model to model_crads

ef60ff3

Merge exo cfedcec commit

e96f5ae

filter: _resolve_tokenizer find error

22b4484

future: local mode auto download prototype

8c3a600

filter：download file path

f0e1877

filter: stored_model dir path and http class name

5eef335

feat:chatAPI can run Local model on multi node

c557fe8

Merge branch 'auto dowload-local-model' into support-local-model

8e05ad7

docs: Update the README.md to add instructions for using local models and automatic download steps.

322b36c

Merge exo main branch fdc3b5a commit

677260e
