Issue reproducing retrieval results using API and hugging face implementation #140

Open
Aafiya-H opened this issue Jan 22, 2024 · 2 comments


Aafiya-H commented Jan 22, 2024

Hi, I tried replicating the audio-to-text retrieval results using the PyPI library and the Hugging Face implementation; however, the numbers I obtained do not match those reported in the paper.
For the Hugging Face implementation, I use ClapTextModelWithProjection and ClapAudioModelWithProjection. I compute the cosine similarity between the audio and text embeddings and sort the retrieved texts by similarity score.
Similarly, for the PyPI library implementation, I use get_audio_embedding_from_data and get_text_embedding and follow the same procedure as above.
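To make the procedure concrete, the ranking step described above can be sketched roughly as follows. This is only an illustration in plain Python: the helper names `cosine_similarity` and `rank_texts` are hypothetical, and the toy 2-D vectors stand in for real CLAP embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_texts(audio_embedding, text_embeddings):
    """Return text indices sorted from most to least similar to the audio."""
    scores = [cosine_similarity(audio_embedding, t) for t in text_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Toy 2-D vectors standing in for real embeddings (illustrative only).
audio = [1.0, 0.0]
texts = [[0.0, 1.0], [2.0, 0.1], [1.0, 1.0]]
ranking = rank_texts(audio, texts)  # indices of texts, best match first
```

Because cosine similarity normalizes both vectors, the ranking does not depend on embedding magnitude, only on direction.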
The model is initialized as follows:

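import laion_clap

model = laion_clap.CLAP_Module(enable_fusion=enable_fusion)  # enable_fusion is a bool defined elsewhere
model.load_ckpt()  # with no arguments, downloads and loads the default pretrained checkpoint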

I am using the Clotho version 2.1 evaluation split from here and the AudioCaps val split from the Google Drive link in the repository.
Could you please help me understand what could be the issue?

Contributor

lukewys commented Mar 31, 2024

Hi,

I would recommend using this GitHub implementation to evaluate the model. Also, in the Clotho dataset each audio clip has 5 text labels, so the metric calculation is a bit different. Please refer to our evaluation implementation here: https://github.com/LAION-AI/CLAP/blob/main/src/laion_clap/training/train.py#L577
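As a rough illustration of why several captions per audio change the metric, here is a minimal sketch of audio-to-text recall@k that counts an audio as a hit if any of its own captions lands in the top k. This is one common convention and an assumption on my part, not necessarily what the linked train.py does; the function name `recall_at_k`, the caption ordering, and the toy scores are all hypothetical.

```python
def recall_at_k(similarity, captions_per_audio, k):
    """Audio-to-text recall@k when each audio has several reference captions.

    similarity[i][j] is the score between audio i and caption j. Captions are
    assumed to be ordered so that audio i owns the contiguous block
    [i * captions_per_audio, (i + 1) * captions_per_audio).
    """
    hits = 0
    for i, row in enumerate(similarity):
        # Indices of the k highest-scoring captions for this audio.
        top_k = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        first = i * captions_per_audio
        own = range(first, first + captions_per_audio)
        # Count a hit if any of this audio's own captions is in the top k.
        if any(j in own for j in top_k):
            hits += 1
    return hits / len(similarity)

# Toy scores: 2 audios, 2 captions each (hypothetical numbers).
sim = [
    [0.9, 0.2, 0.8, 0.1],  # audio 0: best caption is its own (index 0)
    [0.7, 0.6, 0.3, 0.5],  # audio 1: its captions (2, 3) rank 4th and 3rd
]
r1 = recall_at_k(sim, captions_per_audio=2, k=1)
r3 = recall_at_k(sim, captions_per_audio=2, k=3)
```

Scoring each caption as an independent query, or requiring one specific caption to be retrieved, would give different numbers from the same similarity matrix, which is one plausible source of a mismatch with reported results.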


carankt commented Sep 23, 2024

@Aafiya-H were you able to reproduce the results with the GitHub repo?
