
[Version] Bump version to 0.2.58, support embedding #539

Merged 1 commit into mlc-ai:main on Aug 12, 2024

Conversation

CharlieFRuan (Contributor)

Change

  • Supports embedding via the OpenAI-style API engine.embeddings.create() (see the usage sketch after this list).
  • Currently, only snowflake-arctic-embed-s and snowflake-arctic-embed-m are supported. We add the following models to the prebuilt model list:
    • snowflake-arctic-embed-m-q0f32-MLC-b32
    • snowflake-arctic-embed-m-q0f32-MLC-b4
    • snowflake-arctic-embed-s-q0f32-MLC-b32
    • snowflake-arctic-embed-s-q0f32-MLC-b4
    • b32 means the model is compiled to support a maximum batch size of 32. If an input with more than 32 entries is provided, we call forward() multiple times (e.g. if the input has 67 entries, we forward 3 times). The larger the maximum batch size, the more memory it takes to load the model. See ModelRecord.vram_required_MB in config.ts for specifics.
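
A minimal sketch of how the new embedding API might be used. engine.embeddings.create() and the model IDs come from this PR; the CreateMLCEngine loading call and the OpenAI-style response shape (data[i].embedding, data[i].index) are assumptions based on the OpenAI-compatible interface the engine mirrors.

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

async function main() {
  // Load one of the newly added embedding models; "-b4" means the compiled
  // model forwards at most 4 inputs per batch (assumed model ID from the list above).
  const engine = await CreateMLCEngine("snowflake-arctic-embed-s-q0f32-MLC-b4");

  // OpenAI-style embeddings call added by this PR.
  const reply = await engine.embeddings.create({
    input: [
      "WebLLM runs models in the browser",
      "Embeddings map text to vectors",
    ],
  });

  // Each entry is assumed to carry one embedding vector per input string,
  // mirroring the OpenAI response format.
  for (const item of reply.data) {
    console.log(item.index, item.embedding.length);
  }
}

main();
```

With the b4 variant above, a request of 10 inputs would be split into ceil(10/4) = 3 forward passes; the b32 variants trade more VRAM for fewer passes.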

TVMjs

Still compiled at apache/tvm@1fcb620, no change

@CharlieFRuan merged commit 552ec95 into mlc-ai:main on Aug 12, 2024. 1 check passed.