An official implementation of "SONAR: A Synthetic AI-Audio Detection Framework and Benchmark"
```
conda env create -f environment.yml
conda activate sonar
```
Please download the following datasets and extract them into `./data/`, or change the dataset paths in the code accordingly. The directory structure should look like this:
```
data
├── LJSpeech-1.1
│   ├── wavs
│   ├── metadata.csv
│   └── README
├── wavefake
│   ├── ljspeech_full_band_melgan
│   ├── ljspeech_hifiGAN
│   ├── ...
│   └── ljspeech_waveglow
├── LibriSeVoc
│   ├── diffwave
│   ├── gt
│   ├── ...
│   └── wavernn
└── in_the_wild
    ├── 0.wav
    ├── ...
    ├── 31778.wav
    └── meta.csv
```
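Once the datasets are in place, the layout above can be indexed with a few lines of Python. This is only a sketch of the expected structure, not the repository's actual data loader; the `list_wavs` helper and the real/fake assignments are illustrative assumptions.

```python
# Sketch: index the ./data/ layout shown above. The list_wavs helper is
# hypothetical and is not part of this repository's loading code.
from pathlib import Path

DATA_ROOT = Path("./data")

def list_wavs(subdir: str) -> list[Path]:
    """Recursively collect .wav files under one dataset subdirectory."""
    return sorted((DATA_ROOT / subdir).rglob("*.wav"))

real_files = list_wavs("LJSpeech-1.1/wavs")          # bona fide speech
fake_files = list_wavs("wavefake/ljspeech_hifiGAN")  # one WaveFake vocoder
print(f"{len(real_files)} real files, {len(fake_files)} fake files")
```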
To train traditional models, please run `main_tm.py`.

Arguments:
- `--config`: config file for the corresponding model.

Train AASIST on WaveFake:
```
python main_tm.py --config ./config/AASIST.conf
```

Evaluation (modify the `model_path` in the corresponding config file, as in the sketch below):
```
python main_tm.py --config ./config/AASIST.conf --eval
```
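Since evaluation requires editing `model_path`, the snippet below shows one way to update it programmatically. Treating the `.conf` file as JSON and the checkpoint filename are assumptions; check `./config/AASIST.conf` for the actual schema.

```python
# Sketch: point the config at a trained checkpoint before running --eval.
# Assumes the .conf file is JSON; the checkpoint path below is hypothetical.
import json

CONFIG = "./config/AASIST.conf"

with open(CONFIG) as f:
    config = json.load(f)

config["model_path"] = "./models/AASIST_trained.pth"  # hypothetical checkpoint

with open(CONFIG, "w") as f:
    json.dump(config, f, indent=4)
```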
To fine-tune foundation models, please run `main_fm.py`.

Fine-tune Wave2Vec2BERT:
```
python main_fm.py --model wave2vec2bert
```
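For orientation, here is a minimal sketch of what fine-tuning a Wave2Vec2BERT-style foundation model for binary real/fake classification can look like with HuggingFace Transformers. The checkpoint name, label convention, and single-step loop are assumptions, not necessarily what `main_fm.py` does.

```python
# Sketch: one fine-tuning step for spoofed-audio detection with HuggingFace
# Transformers. Checkpoint and label convention are assumptions.
import torch
from transformers import AutoFeatureExtractor, Wav2Vec2BertForSequenceClassification

CHECKPOINT = "facebook/w2v-bert-2.0"  # assumed base model

extractor = AutoFeatureExtractor.from_pretrained(CHECKPOINT)
model = Wav2Vec2BertForSequenceClassification.from_pretrained(
    CHECKPOINT, num_labels=2  # 0 = real, 1 = fake (assumed convention)
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

waveform = torch.randn(16000 * 4).numpy()  # 4 s of dummy 16 kHz audio
inputs = extractor(waveform, sampling_rate=16000, return_tensors="pt")
labels = torch.tensor([1])

outputs = model(**inputs, labels=labels)  # cross-entropy over the 2 classes
outputs.loss.backward()
optimizer.step()
```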
This repository is built on top of the following open-source projects.