
HuggingFaceTEIDocumentEmbedder do not truncate rather they throw an exception. #7413

Closed
PAHXO opened this issue Mar 23, 2024 · 4 comments · Fixed by #7460
PAHXO commented Mar 23, 2024

HuggingFaceTEIDocumentEmbedder does not auto-truncate; instead it throws an exception.
I'd like the option to pass an argument that truncates my text so processing can continue.

@PAHXO changed the title from "HuggingFaceTEITextEmbedder do not truncate rather they throw an exception." to "HuggingFaceTEIDocumentEmbedder do not truncate rather they throw an exception." on Mar 23, 2024
anakin87 (Member) commented
@awinml I remember that you developed this Embedder.
Do you have any ideas/suggestions?

awinml (Contributor) commented Mar 25, 2024

@anakin87 This is only an issue when using a self-deployed Text Embeddings Inference (TEI) endpoint; the HF Inference endpoints automatically truncate but don't normalize. You can view a simple example showcasing this with both endpoints in this Colab notebook.

The default behaviour of TEI endpoints is to automatically normalize and to raise an error if the input exceeds 512 tokens. This can be changed via the truncate and normalize parameters when computing the embeddings. Please see the embed API reference for more information.
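For illustration, here is a minimal sketch of sending those parameters directly to a TEI server's /embed route (the local URL, port, and model are assumptions, not from this thread):

import requests

# Assumes a TEI container serving locally, e.g. started with:
#   docker run -p 8080:80 ghcr.io/huggingface/text-embeddings-inference:latest \
#       --model-id BAAI/bge-small-en-v1.5
response = requests.post(
    "http://localhost:8080/embed",
    json={"inputs": ["Very long text"], "truncate": True, "normalize": True},
)
response.raise_for_status()
embeddings = response.json()  # one embedding (list of floats) per input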

We use the InferenceClient.feature_extraction method to generate the embeddings, and this method does not support passing the truncate and normalize parameters. I opened a PR (huggingface/huggingface_hub#1940) to fix this, but it was deferred. Instead, their suggestion was to use the InferenceClient.post method and pass the parameters in the JSON payload. Something like this:

import json

import numpy as np
from huggingface_hub import InferenceClient

client = InferenceClient(...)
text = "Very long text"

# NOTE: the `truncate` and `normalize` parameters only work for TEI-powered APIs
response = client.post(
    json={"inputs": [text], "truncate": True, "normalize": True},
    task="feature-extraction",
)

# `post` returns the raw response bytes; decode the JSON body,
# then convert to float32 and back to a plain Python list
response_dict = json.loads(response.decode())
embedding = np.array(response_dict, dtype="float32").tolist()

I can open a PR to refactor the embedders if this approach is okay. The other option would be to wait until they standardize their API and document this limitation in the docs.

anakin87 (Member) commented
Let me try to recap... Please correct me if I am wrong.

Currently, HFTEIEmbedders in Haystack support these different backends:

  • TEI deployed locally with Docker
  • TEI on paid HF Inference Endpoints
  • HF Free Inference API

When using the InferenceClient, the truncate and normalize parameters are only taken into account in the first two cases.
When using the HF Free Inference API, they are ignored and the defaults are used (truncate=True; normalize=False).

If my analysis is correct, I would do the following:

  • introduce these two parameters in our Embedders with these defaults: truncate=True; normalize=False
  • explain clearly in the docstring that these two parameters are only considered when using a TEI service (locally or deployed on HF Inference Endpoints) and are ignored when using the HF Free Inference API (see the sketch after this list)
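
A rough sketch of what this could look like on the embedder (only the two new parameters and their defaults come from this thread; the class body and the other parameters are assumptions for illustration):

from typing import Optional


class HuggingFaceTEIDocumentEmbedder:
    # Sketch only: just the truncate/normalize additions reflect this proposal
    def __init__(
        self,
        model: str = "BAAI/bge-small-en-v1.5",  # assumed default model
        url: Optional[str] = None,
        truncate: bool = True,    # proposed default
        normalize: bool = False,  # proposed default
    ):
        # Honored only by TEI (local Docker or paid HF Inference Endpoints);
        # silently ignored by the HF Free Inference API
        self.model = model
        self.url = url
        self.truncate = truncate
        self.normalize = normalize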

@awinml WDYT?

awinml (Contributor) commented Mar 27, 2024

@anakin87 Sounds good! I'll add the truncate and normalize parameters with the defaults you mentioned. The docstrings will clearly explain that these only take effect when using a local or paid TEI service, not the free Hugging Face API. Thanks for the detailed recap!
