Open
Perhaps related to this PR, but opposite:
I'd like to be able to ask WebLLM to download a second (or third, etc.) model into the cache while I continue to use the existing, already loaded model. I'd then like to get a callback once the second model has finished loading, so that I can tell the user they can now switch to the other model if they prefer.
Or is there an optimal way to do this already?
My current idea is to write a separate function that loads the new model's shards into the cache manually, outside of WebLLM. But I'd prefer to use WebLLM for this if it already has such a feature (I searched the repo but couldn't find one).
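
To make the manual approach concrete, here is a rough sketch of what I have in mind. It is not WebLLM's API: `prefetchShards`, `CacheLike`, `ResponseLike` and `FetchLike` are hypothetical names I made up for illustration, and the list of shard URLs would have to come from the model's own config. I'm also assuming that WebLLM would only pick up the files if they end up in the same browser cache, under the same keys, that it uses itself, which I haven't verified.

```typescript
// Minimal subset of a fetch Response that this sketch relies on.
interface ResponseLike {
  ok: boolean;
  status: number;
}

// Minimal subset of the browser Cache interface that this sketch relies on.
interface CacheLike {
  match(url: string): Promise<unknown | undefined>;
  put(url: string, response: ResponseLike): Promise<void>;
}

type FetchLike = (url: string) => Promise<ResponseLike>;

// Hypothetical helper (NOT part of WebLLM): downloads each URL that is not
// yet in the cache and reports progress after every file. The returned
// promise resolves once all files are cached, which is the point where the
// app could tell the user that the second model is ready.
async function prefetchShards(
  urls: string[],
  cache: CacheLike,
  fetchFn: FetchLike,
  onProgress?: (done: number, total: number) => void,
): Promise<void> {
  let done = 0;
  for (const url of urls) {
    // Skip files that a previous run has already stored.
    if ((await cache.match(url)) === undefined) {
      const response = await fetchFn(url);
      if (!response.ok) {
        throw new Error(`Failed to fetch ${url}: HTTP ${response.status}`);
      }
      await cache.put(url, response);
    }
    done += 1;
    onProgress?.(done, urls.length);
  }
}
```

In the browser I'd expect to pass in a cache obtained from `caches.open(...)` and the global `fetch`, then show the "you can switch models now" notice once the promise resolves. The downloads run one after another on purpose, so they compete less for bandwidth with whatever the loaded model is doing; that is a design guess rather than something I've measured.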