Is there any possible script/instruction to look at for fine-tuning? #129
Comments
Hi, since our model is trained on a large-scale data collection, we use the webdataset library to handle IO instead of downloading all the data to our server. In your case, you can fine-tune the model without webdataset, which means you need to write a dataloader and a dataset class yourself. These classes load the audio and the text for model training. There are many ways to do that; I can refer you to my HTS-AT audio loader (which does not use webdataset), and you will need to load the text data yourself. Previously, the audio loader loaded an h5 file, but now you can directly use the torchaudio.load API.
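As a minimal sketch of the kind of dataset class described above (the pair format, field names, and class name are my own assumptions, not this repo's API), any object implementing `__len__` and `__getitem__` works with `torch.utils.data.DataLoader`:

```python
# Hypothetical map-style dataset pairing audio files with captions.
# The audio loader is injected so you can plug in torchaudio.load directly.
class AudioTextDataset:
    def __init__(self, pairs, load_audio):
        # pairs: list of (audio_path, caption) tuples (assumed format)
        # load_audio: callable returning (waveform, sample_rate), e.g. torchaudio.load
        self.pairs = pairs
        self.load_audio = load_audio

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        path, caption = self.pairs[idx]
        waveform, sample_rate = self.load_audio(path)
        return {"waveform": waveform, "sample_rate": sample_rate, "text": caption}
```

With torchaudio installed you would construct it as `AudioTextDataset(pairs, torchaudio.load)` and wrap it in a standard `DataLoader`.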
And for how to fine-tune our model, I think the eval_linear_probe code should definitely be the sample code for your reference.
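To illustrate what linear probing does (not the repo's eval_linear_probe script itself), here is a hedged sketch: the encoder is kept frozen, embeddings are precomputed, and only a linear classifier is trained on top. The function name and hyperparameters are my own assumptions.

```python
import numpy as np

def train_linear_probe(embeddings, labels, num_classes, lr=0.1, steps=200):
    """Softmax regression on frozen embeddings (a toy stand-in for a linear probe).

    embeddings: (N, D) array of precomputed, frozen audio embeddings
    labels:     (N,) integer class labels
    """
    n, d = embeddings.shape
    W = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[labels]
    for _ in range(steps):
        logits = embeddings @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n                  # cross-entropy gradient
        W -= lr * embeddings.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b
```

In practice you would replace the toy gradient loop with a `torch.nn.Linear` head trained on embeddings from the frozen CLAP audio encoder; the structure (freeze encoder, fit linear head) is the same.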
Hi Xuechen, I saw you have done some work on CLAP fine-tuning, and I wonder if you have performed fine-tuning with MSFT_CLAP/example for an ASR task? How did it work? Since I also want to fine-tune CLAP for some downstream tasks, I would appreciate it if you could share your experience.
@Kinetic-shaun Thanks for your message and for reading my recent paper! Yes, I do have scripts for fine-tuning the Microsoft CLAP model, but they are not ready to contribute since they are very specific to the datasets described in the paper rather than the ones used in the original repo; they are very different tasks and scenarios, as you may have read. I've been busy recently; once the paper's status has been updated, I should have a chance to clean up the code and put it online in the repo. BTW, @RetroCirce, sorry that for the LAION AI case I did not have time to make fine-tuning work on my dataset.
Hello @underdogliu, I am also very interested in fine-tuning CLAP for some specific use cases and would love some insight into how you prepared your data!
Hi! First of all thanks a lot for such amazing project.
I wonder if there is a valid way to fine-tune the model for specific tasks using customized datasets? I am trying to adapt the model to improve performance, and my source data is also structured for classification. It has the following content:
I found that this script looks promising, but I do not know how to configure my data to replicate the process on my datasets, since the script has many settings that are very specific to ESC50. Do we have any gist on this problem? Thanks in advance!