FEAT: support sparse vector for bge-m3 #2540

Open · pengjunfeng11 wants to merge 8 commits into main
Conversation

@pengjunfeng11 commented Nov 11, 2024

Adds support for sparse vector generation with the bge-m3 model.

Usage: model.create_embedding(text, return_sparse=True)
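
For illustration, a minimal sketch of the call; the shape of the sparse payload is an assumption modeled on FlagEmbedding's lexical_weights output for bge-m3, not confirmed by this PR:

from xinference.client import Client

client = Client("http://ip:port")
model = client.get_model("bge-m3")

# One call returns the dense embedding together with sparse lexical weights.
result = model.create_embedding("What is BGE M3?", return_sparse=True)
# Assumed sparse shape: a token_id -> weight mapping such as
# {"6": 0.27, "1239": 0.18, ...}; the actual field names may differ.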

Adds a convert_ids_to_tokens method.

This method converts token ids into human-readable text. Usage:

from xinference.client import Client

client = Client("http://ip:port")
model = client.get_model(model_name)
seq = model.convert_ids_to_tokens(key_list)

The method's return type is List[str]; when a list is passed in, the values are returned in the same order.
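
Continuing from the snippet above, a hedged sketch of the intended round trip (the "sparse" result key below is hypothetical):

result = model.create_embedding("What is BGE M3?", return_sparse=True)

# Hypothetical key name: pull the token ids out of the sparse mapping...
key_list = [int(token_id) for token_id in result["sparse"]]
# ...and map them back to human-readable tokens, returned in the same order.
tokens = model.convert_ids_to_tokens(key_list)
print(dict(zip(tokens, result["sparse"].values())))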

Fixes #2527.

@XprobeBot XprobeBot added this to the v0.16 milestone Nov 11, 2024
@qinxuye qinxuye changed the title sparse vector support FEAT: support sparse vector for bge-m3 Nov 11, 2024
@qinxuye (Contributor) commented Nov 15, 2024

For convert_ids_to_tokens, can you add a test verifying it in

https://github.com/xorbitsai/inference/blob/main/xinference/model/embedding/tests/test_embedding_models.py

bge-m3 is too large for CI, so we can test it manually.
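
A rough sketch of such a test; the setup fixture and the model chosen for CI are assumptions modeled on the existing tests in that file:

def test_convert_ids_to_tokens(setup):
    endpoint, _ = setup
    from xinference.client import Client

    client = Client(endpoint)
    # Assumption: a small embedding model keeps the test CI-friendly.
    model_uid = client.launch_model(
        model_name="bge-small-en-v1.5", model_type="embedding"
    )
    model = client.get_model(model_uid)

    tokens = model.convert_ids_to_tokens([100, 200, 300])
    assert isinstance(tokens, list)
    assert all(isinstance(t, str) for t in tokens)
    assert len(tokens) == 3  # length and order are preserved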

from sentence_transformers import SentenceTransformer

kwargs.setdefault("normalize_embeddings", True)

if kwargs.get("return_sparse") and "m3" in self._model_spec.model_name.lower():
    self._kwargs["hybrid_mode"] = True
@qinxuye (Contributor):

This looks a bit disruptive to the design; I don't know if there is a more elegant way.

@pengjunfeng11 (Author):

Are you referring only to the if check, or to the subsequent reload part?

@qinxuye (Contributor):

I mean the reload part.

@qinxuye (Contributor):

How about loading bge-m3 when hybrid_mode=True is specified? This can be done in load().
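
One possible shape for that suggestion, as a sketch only; self._model_path and self._kwargs are assumed attributes, and the sparse path uses FlagEmbedding's BGEM3FlagModel, which returns dense vectors together with sparse lexical_weights:

from sentence_transformers import SentenceTransformer


def load(self):
    # Choose the backend once at load time, instead of reloading after
    # create_embedding sees return_sparse=True.
    if self._kwargs.get("hybrid_mode"):
        from FlagEmbedding import BGEM3FlagModel

        self._model = BGEM3FlagModel(self._model_path, use_fp16=True)
    else:
        self._model = SentenceTransformer(self._model_path)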

@pengjunfeng11 (Author):

> For convert_ids_to_tokens, can you add a test verifying it in
> main/xinference/model/embedding/tests/test_embedding_models.py
> bge-m3 is too large for CI, so we can test it manually.

OK


Successfully merging this pull request may close the following issue:

xf不支持生成稀疏向量 (xf does not support generating sparse vectors) #2527