Feature request: engine.preload() #529
I think you can simply achieve this by creating a second instance of the engine.
Interesting idea, thanks. The thing is, I don't always need to also start the model. For example, a user might want to go on a long airplane trip and pre-download some models from a list (kind of like pre-loading the map of Spain into OSMAND, or your map app of choice, before going on holiday). But maybe I can just forego switching to the new engine instance? Then the files will still be downloaded anyway, right? For comparison, this is how Wllama does it: it's just a helper function that loads the chunks into the cache, and then stops there.
@CharlieFRuan Following up on this: if I do something like this to create and load an additional engine instance, but never actually run a completion on it, would that download the additional models without causing GPU memory issues?
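(The original snippet isn't preserved in this thread; the sketch below shows what such a download-only second engine might look like, assuming the @mlc-ai/web-llm exports CreateMLCEngine, initProgressCallback, and unload(). The model ID is a placeholder.)

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Sketch only: the model ID below is a placeholder, not a recommendation.
// Creating the engine downloads the model artifacts into the browser cache;
// we never run chat completions on this instance.
const downloadEngine = await CreateMLCEngine("Llama-3-8B-Instruct-q4f32_1-MLC", {
  initProgressCallback: (report) => {
    console.log(`progress ${report.progress}: ${report.text}`);
  },
});

// Release GPU resources right away; the downloaded files remain in the cache,
// so a later reload of this model on the main engine should be a cache hit.
await downloadEngine.unload();
```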
Thanks for the thoughts and discussions @Neet-Nestor @flatsiedatsie! The code above will work fine. Therefore, one way to "only download a model, without touching WebGPU" is to:
On the webllm side:
I ended up writing a custom function that manually loads the files into the cache. I didn't expect splitting the downloading from the inference to have such a big effect, but it has helped simplify my code. It's now also possible for users to load and use models they have already downloaded while they wait for a new one to finish downloading.
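For reference, a stripped-down version of such a helper might look like the sketch below. It only relies on the standard Cache Storage API; the cache name and the list of shard URLs are assumptions, since WebLLM's internal cache layout isn't documented in this thread.

```ts
// Hypothetical download-only helper: fetch each model shard into Cache Storage
// so a later engine.reload() finds the files already cached.
// "webllm/model" and shardUrls are placeholders, not documented WebLLM values.
async function prefetchModelShards(
  shardUrls: string[],
  cacheName = "webllm/model",
): Promise<void> {
  const cache = await caches.open(cacheName);
  for (const url of shardUrls) {
    // Skip shards that are already present so the helper is resumable.
    if (!(await cache.match(url))) {
      await cache.add(url); // fetches the URL and stores the response
    }
  }
}
```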
Perhaps related to this PR, but the opposite:
I'd like to be able to easily ask WebLLM to download a second (or third, etc.) model to the cache while continuing to use the existing, already loaded model, and then get a callback when the second model has finished downloading, so that I can inform the user they can now switch to the other model if they prefer.
Or is there already an optimal way to do this?
Currently my idea is to write a separate function that manually loads the new shards into the cache, outside of WebLLM. But I'd prefer to use WebLLM for this if such a feature already exists (I searched the repo but couldn't find one).
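To make the request concrete, a hypothetical shape for such an API could look like the sketch below. engine.preload() does not exist in WebLLM; the option names, callback, and model ID are illustrative only, while engine.reload() is an existing method.

```ts
// Keep serving the currently loaded model while a second one downloads.
// preload() and onProgress are the *requested* (hypothetical) API, not real.
await engine.preload("SmolLM2-1.7B-Instruct-q4f16_1-MLC", {
  onProgress: (report) => showDownloadProgress(report.progress), // hypothetical UI hook
});

// Later, only if the user opts in, switch to the newly cached model.
await engine.reload("SmolLM2-1.7B-Instruct-q4f16_1-MLC");
```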