
Creating embeddings before training model #2930

Open
jinaduuthman opened this issue Sep 10, 2024 · 3 comments

@jinaduuthman

Hi @tomaarsen,
I am using the SBERT Trainer, specifically with triplet pairs: [query, positive, negative].
Now I need to append some text to the query ([query + some_long_text, positive, negative]), which would make it longer than the max_seq_len, and I don't want it truncated.

I read somewhere that I can create an embedding for the some_long_text and pass that into the model training, but that looks odd to me since I would be concatenating an embedding with raw text. I have also read in a thread here that creating embeddings before feeding them into the model means the model does not adjust its pretrained weights. Is there a better way to do this?

Note that I am using MultipleNegativesRankingLoss.
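
For context, roughly what my setup looks like (a minimal sketch; the model name and the data here are placeholders, not my real training set):

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Placeholder triplets; in my real data the anchor is query + some_long_text,
# which can exceed the model's max_seq_length and would get truncated.
train_dataset = Dataset.from_dict({
    "anchor": ["what is the capital of France? ... some_long_text ..."],
    "positive": ["Paris is the capital of France."],
    "negative": ["Berlin is the capital of Germany."],
})

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model
print(model.max_seq_length)  # anything past this many tokens is cut off

loss = MultipleNegativesRankingLoss(model)
trainer = SentenceTransformerTrainer(
    model=model,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```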

@jinaduuthman
Author

@tomaarsen

@tomaarsen
Collaborator

Hello!

Apologies for the delayed response.

> I read somewhere that I can create an embedding for the some_long_text and pass that into the model training.

I haven't heard about this yet.

> I have also read in a thread here that creating embeddings before feeding them into the model means the model does not adjust its pretrained weights. Is there a better way to do this?

I think this was likely referring to the case where you create all the embeddings before training and then train on those: you're no longer iteratively updating the model weights, which is what actually training a better model requires. There are a few reasons why that doesn't work, but in short, the precomputed embeddings are fixed constants, so gradient descent has nothing to propagate back into the model.
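
As a rough sketch of the difference (not the actual trainer internals; the model name is just an example):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Precomputed embedding: encode() runs without gradient tracking, so the result
# is a plain tensor with no connection back to the model weights. A loss
# computed on it cannot change the model.
frozen = model.encode("some_long_text", convert_to_tensor=True)
print(frozen.requires_grad)  # False

# During training, the text is passed through the model's forward pass instead,
# so the loss stays connected to every weight and gradient descent can update them.
features = model.tokenize(["some_long_text"])
embedding = model(features)["sentence_embedding"]
print(embedding.requires_grad)  # True
```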

I don't think there's a convenient way to avoid the truncation, I'm afraid.

  • Tom Aarsen

@jinaduuthman
Author

Thank you for your response.
