Adding support for DashScopeEmbeddings to handle Aliyun embedding models #16

Open: wants to merge 2 commits into base: pypi/0.0.0-alpha

Contributor

ZZH-qwq commented May 30, 2025

This fix ensures proper initialization of Aliyun's text embedding models via DashScopeEmbeddings, as required by Aliyun's official documentation.

Addresses the embedding initialization issue reported in: #8 (comment)

Further testing may be needed by maintainers to verify full compatibility.

Collaborator

code4DB commented May 30, 2025

@ZZH-qwq Thanks for your effort. Please attach a snapshot of successful runs to facilitate validation :)

Contributor Author

ZZH-qwq commented May 30, 2025

Certainly. However, the successful implementation currently requires truncating inputs before embedding:

text = [t[:24000] for t in text]
embedding = await embedding_model.aembed_documents(text)

This truncation workaround isn't included in the current PR as it's unrelated to the core fix.
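
For reference, the truncation workaround could be wrapped in a small helper along these lines. This is only a sketch: the name `truncate_texts` and the `MAX_CHARS` constant are hypothetical and not part of this PR, and the 24000-character cutoff is simply the value that worked in my runs, not a documented limit.

```python
from typing import List

# Assumption: 24000 characters is the cutoff used in the workaround above;
# the service's actual limit may be defined differently (e.g. in tokens).
MAX_CHARS = 24000


def truncate_texts(texts: List[str], max_chars: int = MAX_CHARS) -> List[str]:
    """Return copies of the inputs clipped to at most max_chars characters."""
    if max_chars < 0:
        raise ValueError("max_chars must be non-negative")
    return [t[:max_chars] for t in texts]
```

The result would then be passed to `embedding_model.aembed_documents(...)` in place of the raw input list.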

Additionally, note that initializing DashScopeEmbeddings introduces a new dependency on dashscope. Could you please verify whether standard initialization still functions properly without dashscope installed, as long as no Aliyun cloud models are used?
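
One way to keep `dashscope` optional would be to import it only when an Aliyun model is actually requested. The following is a rough sketch, not the code in this PR: the factory name `make_embedding_model` and the `provider` argument are hypothetical, and it assumes `DashScopeEmbeddings` comes from `langchain_community.embeddings`.

```python
import importlib.util


def dashscope_available() -> bool:
    """Check whether the optional dashscope package can be imported."""
    return importlib.util.find_spec("dashscope") is not None


def make_embedding_model(provider: str, model: str):
    """Hypothetical factory: only touch dashscope when it is requested."""
    if provider == "dashscope":
        if not dashscope_available():
            raise ImportError(
                "Aliyun embedding models require the 'dashscope' package; "
                "install it with `pip install dashscope`."
            )
        # Deferred import (assumed import path), so other providers
        # keep working when dashscope is not installed.
        from langchain_community.embeddings import DashScopeEmbeddings

        return DashScopeEmbeddings(model=model)
    raise ValueError(f"Unsupported embedding provider: {provider!r}")
```

With this layout, users who never select an Aliyun model would not hit an import error at startup, which is the behaviour I would like the maintainers to confirm.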

The snapshot:

[Image: snapshot of successful running]

2 participants