
Is there any possible script/instruction to look at for fine-tuning? #129

Open
underdogliu opened this issue Oct 8, 2023 · 5 comments

@underdogliu

Hi! First of all, thanks a lot for such an amazing project.

I wonder if there is a supported way to fine-tune the model on specific tasks using custom datasets. I am trying to adapt the model to improve performance, and the source data I have is also structured for classification. Each example consists of:

  • Source audio
  • Text prompt
  • The event ID

I found that this script looks promising, but I do not know how to configure my data to replicate the process on my own datasets, since many of the script's options are very specific to ESC50. Is there a gist covering this? Thanks in advance!
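To make the structure concrete, one record could look like the following (field names are hypothetical and purely illustrative):

```python
# One hypothetical record with the three fields above; names are illustrative.
record = {
    "audio_path": "clips/0001.wav",           # source audio
    "text": "a dog barking in the distance",  # text prompt
    "event_id": 12,                           # classification target
}
```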

@RetroCirce
Contributor

Hi,

Since our model is trained on a large-scale data collection, we use the webdataset library to handle I/O instead of downloading all the data to our server.

In your case, you can fine-tune the model without webdataset, which means you need to write a dataset class and a dataloader yourself. Those classes should load the audio and the text for model training. There are many ways to do that; I can refer you to my HTS-AT audio loader (which does not use webdataset), and you would need to add the text loading yourself. The old audio loader reads from an h5 file, but nowadays you can use the torchaudio.load API directly.
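A minimal sketch of such a dataset, assuming a simple metadata list like the record shown earlier and CLAP's 48 kHz, 10-second audio input; all names here are hypothetical and not part of this repo:

```python
import torch
import torchaudio
from torch.utils.data import Dataset, DataLoader

# Hypothetical metadata: a list of records as sketched above.
metadata = [
    {"audio_path": "clips/0001.wav", "text": "a dog barking", "event_id": 12},
    # ... more records
]

class AudioTextDataset(Dataset):
    def __init__(self, metadata, target_sr=48000):
        self.metadata = metadata
        self.target_sr = target_sr  # LAION-CLAP expects 48 kHz audio

    def __len__(self):
        return len(self.metadata)

    def __getitem__(self, idx):
        item = self.metadata[idx]
        waveform, sr = torchaudio.load(item["audio_path"])  # (channels, samples)
        if sr != self.target_sr:
            waveform = torchaudio.functional.resample(waveform, sr, self.target_sr)
        waveform = waveform.mean(dim=0)  # downmix to mono
        return {"waveform": waveform, "text": item["text"], "label": item["event_id"]}

# Variable-length clips need a custom collate_fn: pad or crop each waveform
# to a fixed 10-second window (480000 samples at 48 kHz).
def collate(batch, max_len=480000):
    waves = []
    for b in batch:
        w = b["waveform"]
        w = torch.nn.functional.pad(w, (0, max(0, max_len - w.shape[-1])))[:max_len]
        waves.append(w)
    return {
        "waveform": torch.stack(waves),
        "text": [b["text"] for b in batch],
        "label": torch.tensor([b["label"] for b in batch]),
    }

loader = DataLoader(AudioTextDataset(metadata), batch_size=8, collate_fn=collate)
```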

@RetroCirce
Contributor

And as for how to fine-tune our model, I think the eval_linear_probe code is definitely the sample code to use as a reference.
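For orientation, here is a rough linear-probe sketch that trains a classification head on top of frozen CLAP audio embeddings, assuming the laion_clap package; the exact embedding API and the 512-dim embedding size are assumptions that may differ across versions:

```python
# A rough linear-probe sketch; treat the laion_clap calls as assumptions.
import torch
import torch.nn as nn
import laion_clap

clap = laion_clap.CLAP_Module(enable_fusion=False)
clap.load_ckpt()  # loads a default pretrained checkpoint

num_classes = 50  # e.g. ESC-50; replace with your number of event IDs
head = nn.Linear(512, num_classes)  # assumes 512-dim CLAP embeddings
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

for batch in loader:  # `loader` from the dataset sketch above
    with torch.no_grad():  # freeze the CLAP backbone; only the head trains
        emb = clap.get_audio_embedding_from_data(
            x=batch["waveform"], use_tensor=True
        )
    logits = head(emb)
    loss = criterion(logits, batch["label"])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

To fully fine-tune rather than linear-probe, one would drop the `torch.no_grad()` block and pass the backbone's parameters to the optimizer as well, typically with a much smaller learning rate.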

@Kinetic-shaun

Hi Xuechen,

I saw that you have done some work on CLAP fine-tuning, and I wonder whether you have run the fine-tuning job in MSFT_CLAP/example for an ASR task. How did it go? Since I also want to fine-tune CLAP and run some downstream tasks, I would appreciate it if you could share your experience.

@underdogliu
Author

@Kinetic-shaun Thanks for your message and for reading my recent paper! Yes, I do have scripts for fine-tuning the Microsoft CLAP model, but they are not ready to contribute, since they are very specific to the datasets described in the paper rather than the one used in the original repo; the tasks and scenarios are very different, as you may have read.

I've been busy recently, but once the paper's status is updated I should have a chance to clean up the code and publish it to the repo.

BTW @RetroCirce, sorry that in the LAION-AI case I did not have time to get fine-tuning working on my dataset.

@cvillela

Hello @underdogliu, I am also very interested in fine-tuning CLAP for some specific use cases and would love some insight into how you prepared your data!
