Is there a way or an option that I can compute similarity score after calling inference to calculate embeddings? #305
Comments
You can use embedding-reader to compute such similarities between pairs. I guess we could also add the option to do it here, though. |
I see, that would be great. By the way, in the paper "DataComp: In search of the next generation of multimodal datasets" I see you are using cosine similarity. What's the difference between cosine similarity and the dot product (maybe after normalization)? |
After normalization, the dot product is the cosine similarity.
|
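A minimal sketch of that equivalence (not from the thread; the vectors are made up for illustration): the cosine of two vectors equals the plain dot product of their L2-normalized versions.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=768)  # e.g. a ViT-L/14 image embedding
b = rng.normal(size=768)  # e.g. a ViT-L/14 text embedding

# Cosine similarity computed directly.
cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# L2-normalize each vector, then take the dot product.
a_n = a / np.linalg.norm(a)
b_n = b / np.linalg.norm(b)
dot = a_n @ b_n

assert np.isclose(cosine, dot)
```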
Hey, @rom1504, is there an option to store the CLIP scores when doing the clip inference transformation? Thanks! |
I guess you mean computing dot products between pairs of text and image embeddings? There is no such option in clip-retrieval inference. But it's very cheap to compute those on CPU after the inference is done, so maybe just do that?
|
Yeah, but it would be useful for avoiding keeping the image embeddings, to save storage when only the text embeddings and the CLIP score are needed. |
Makes sense, feel free to open a PR |
Well, I use
clip-retrieval inference --input_dataset image/mytest.tar --output_folder embeddings_folder --clip_model ViT-L/14 --input_format="webdataset"
to successfully calculate embeddings for both images and texts, and I do see the .npy files generated. However, I wish there were a way for the similarity score to be calculated at the same time and stored in the generated parquet file.