Finetune CLAP on {audio, text} pairs #141

jerpint · 2024-02-08T20:09:30Z

Hello!

Suppose I have a dataset of {audio, text} pairs, and I would like to finetune CLAP on it. Do you have any tips for getting started with such a task? Would continuing training from a checkpoint with a smaller learning rate be a good start? Do you have scripts that support something like this?

Thanks

lukewys · 2024-03-31T14:46:31Z

Please see https://github.com/LAION-AI/CLAP?tab=readme-ov-file#dataset-format for details on the dataset format that we trained on. I think you can adapt the training script for fine-tuning, but remember to modify the learning rate and the weight initialization (that is, start from a pretrained checkpoint rather than from scratch).
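
To make the "smaller learning rate" advice concrete, here is a minimal sketch of a warmup-plus-cosine learning-rate schedule that could be used when continuing from a checkpoint. This is not taken from the CLAP repository: the function name `finetune_lr`, the peak value `1e-5`, and the warmup length are all illustrative assumptions, and the right values depend on your dataset size and on the learning rate the checkpoint was pretrained with.

```python
import math


def finetune_lr(step, total_steps, peak_lr=1e-5, warmup_steps=100):
    """Hypothetical fine-tuning schedule: linear warmup, then cosine decay.

    peak_lr and warmup_steps are illustrative assumptions, not CLAP defaults.
    """
    if step < warmup_steps:
        # Ramp up linearly so the pretrained weights are not hit with the
        # full learning rate on the very first updates.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    # Cosine decay from peak_lr down to zero.
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 1000
    for s in (0, 50, 100, 500, 1000):
        print(s, finetune_lr(s, total))
```

In a training loop, the returned value would be written into the optimizer's learning rate at each step, after the model weights have been loaded from the pretrained checkpoint.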
