Is there a way or an option that I can compute similarity score after calling inference to calculate embeddings? #305
Comments
You can use embedding-reader to compute such similarities between pairs. I guess we could also add the option to do it here, though. |
I see, that would be great. By the way, in the paper "DataComp: In search of the next generation of multimodal datasets" I see you are using cosine similarity. What's the difference between cosine similarity and the dot product (maybe after normalization)? |
After normalization, the dot product is the cosine similarity.
|
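A minimal sketch of that equivalence (not from the thread; the vectors are made up for illustration): the cosine of two vectors equals the plain dot product of their L2-normalized versions.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=768)  # e.g. a ViT-L/14 image embedding
b = rng.normal(size=768)  # e.g. a ViT-L/14 text embedding

# Cosine similarity computed directly.
cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# L2-normalize each vector, then take the dot product.
a_n = a / np.linalg.norm(a)
b_n = b / np.linalg.norm(b)
dot = a_n @ b_n

assert np.isclose(cosine, dot)
```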
Hey, @rom1504, is there an option to store the CLIP scores when doing the clip inference transformation? Thanks! |
I guess you mean computing dot products between pairs of text and image embeddings? There is no such option in clip-retrieval inference. But it's very cheap to compute those on CPU after the inference is done, so maybe just do that?
|
Yeah, but it would be useful for avoiding keeping the image embeddings, to save storage when only the text embeddings and the CLIP score are needed. |
Makes sense, feel free to open a PR |
Well, I use
clip-retrieval inference --input_dataset image/mytest.tar --output_folder embeddings_folder --clip_model ViT-L/14 --input_format="webdataset"
to successfully calculate embeddings for both images and texts, and I do see the .npy files generated. However, I wish there were a way for the similarity score to be calculated at the same time and stored in the generated parquet file.