Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
feat: 4018 feature add out of the box embedding support via clientfeedbackintegrations (#4454)

<!-- Thanks for your contribution! As part of our Community Growers initiative 🌱, we're donating Justdiggit bunds in your name to reforest sub-Saharan Africa. To claim your Community Growers certificate, please contact David Berenstein in our Slack community or fill in this form https://tally.so/r/n9XrxK once your PR has been merged. -->

# Description

I added support for `sentence-transformers` by introducing the `SentenceTransformersExtractor`. During the implementation, I fixed some minor issues.

- replaced `vector_settings_by_name`, which had been defined twice for the `remote` dataset and once within the `local` dataset, with the method we defined in the shared `base`.
- updated the `__repr__` of the dataset to include the `vector_settings`.
- rewrote some code of the `TextDescriptivesExtractor` to align the structure and usage of both extractors.
- resolved a bug in the `TextDescriptivesExtractor`, where we might update records at the wrong indices with embeddings during simultaneous IO.
- removed unit tests for the `TextDescriptivesExtractor` and created integration tests instead.
- removed some unused imports

Closes #4018

**Type of change**

(Please delete options that are not relevant. Remember to title the PR according to the type of change)

- [X] New feature (non-breaking change which adds functionality)
- [X] Refactor (change restructuring the codebase without changing functionality)
- [X] Improvement (change adding some improvement to an existing functionality)

**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And ideally, reference `tests`)

- [X] `integration/**/**/integration/test_textdescriptives.py`
- [X] `integration/**/**/integration/test_sentencetransformerx.py`

**Checklist**

- [x] I added relevant documentation
- [x] I followed the style guidelines of this project
- [x] I did a self-review of my code
- [x] I made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [ ] I filled out [the contributor form](https://tally.so/r/n9XrxK) (see text above)
- [x] I have added relevant notes to the `CHANGELOG.md` file (See https://keepachangelog.com/)

---------

Co-authored-by: Gabriel Martín Blázquez <[email protected]>
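For readers unfamiliar with the wrong-index bug mentioned in the description, the following is a minimal sketch of one way to avoid it, not the code from this commit. `Record`, `toy_encode`, and `add_vectors` are hypothetical names invented for illustration and are not part of Argilla; the placeholder encoder stands in for a real embedding model. The idea is to pair each record object with its own embedding instead of writing results back by list position.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Record:
    # Hypothetical stand-in for a feedback record; not Argilla's actual class.
    fields: Dict[str, str]
    vectors: Dict[str, List[float]] = field(default_factory=dict)


def toy_encode(texts: List[str]) -> List[List[float]]:
    # Placeholder encoder (assumption): a real extractor would call a model here.
    return [[float(len(t)), float(sum(map(ord, t)) % 97)] for t in texts]


def add_vectors(
    records: List[Record],
    field_name: str,
    encode: Callable[[List[str]], List[List[float]]] = toy_encode,
) -> List[Record]:
    # Snapshot the records that actually contain the field, so each text stays
    # paired with the record it came from even if `records` changes later.
    selected = [r for r in records if field_name in r.fields]
    embeddings = encode([r.fields[field_name] for r in selected])
    # Write back through the record objects, never through positional indices.
    for record, embedding in zip(selected, embeddings):
        record.vectors[field_name] = embedding
    return selected
```

Because the write-back goes through object references, records that lack the field, or records added to the list while encoding runs, cannot shift the pairing.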
1 parent c95326a
commit 27e3f09
Showing 27 changed files with 1,269 additions and 274 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Binary file added (BIN, +191 KB): ...rials/add-sentence-transformers-embeddings-as-vectors/sentence-transformers.png