[Version] Bump version to 0.2.58, support embedding #539
Change
Embedding is supported via engine.embeddings.create(): snowflake-arctic-embed-s and snowflake-arctic-embed-m are supported. We add the following models to the prebuilt model list:
- snowflake-arctic-embed-m-q0f32-MLC-b32
- snowflake-arctic-embed-m-q0f32-MLC-b4
- snowflake-arctic-embed-s-q0f32-MLC-b32
- snowflake-arctic-embed-s-q0f32-MLC-b4
b32 means the model is compiled to support a maximum batch size of 32. If an input with more than 32 entries is provided, we call forward() multiple times (e.g. if the input has 67 entries, we forward 3 times). The larger the maximum batch size, the more memory it takes to load the model. See ModelRecord.vram_required_MB in config.ts for specifics.
TVMjs
Still compiled at apache/tvm@1fcb620, no change
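
To illustrate the batching behavior described under Change, here is a minimal sketch of how an input could be split into chunks no larger than the compiled maximum batch size. The helpers `splitIntoBatches` and `maxBatchSizeFromModelId` are hypothetical names used only for illustration; they are not WebLLM APIs, and the engine's actual implementation may differ.

```typescript
// Hypothetical helper (not part of WebLLM): split inputs into chunks of at
// most maxBatchSize entries, one chunk per forward() call.
function splitIntoBatches<T>(inputs: T[], maxBatchSize: number): T[][] {
  if (!Number.isInteger(maxBatchSize) || maxBatchSize <= 0) {
    throw new Error("maxBatchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < inputs.length; start += maxBatchSize) {
    batches.push(inputs.slice(start, start + maxBatchSize));
  }
  return batches;
}

// Hypothetical helper: read the trailing "-b<N>" suffix of a model ID,
// assuming the naming pattern of the prebuilt models listed in this PR.
function maxBatchSizeFromModelId(modelId: string): number | undefined {
  const match = /-b(\d+)$/.exec(modelId);
  return match ? Number(match[1]) : undefined;
}

// 67 entries with a maximum batch size of 32 need 3 forward passes.
const maxBatch = maxBatchSizeFromModelId("snowflake-arctic-embed-m-q0f32-MLC-b32") ?? 1;
const inputs = Array.from({ length: 67 }, (_, i) => `entry ${i}`);
const batches = splitIntoBatches(inputs, maxBatch);
console.log(batches.map((b) => b.length)); // [ 32, 32, 3 ]
```

With a b4 model, the same 67 entries would be split into 17 chunks, which trades more forward passes for a smaller memory footprint at load time.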