Is the evaluation metric for training not loss or accuracy, but rather EER? #396

NathanJHLee · 2025-01-03T05:09:09Z

Hi, wespeaker team.
I followed your configuration and got the loss and accuracy shown below, but I think the accuracy is too low, so I searched the issues and found a similar report:
#165

According to that issue, the most important thing to check is EER, and loss and accuracy are less important during training. Is that right?
What loss and accuracy do you see at epoch 150 for ResNet34 in your own training runs?
Could you also check my training log and tell me whether training looks normal?
Thank you!!

My training setup is as follows.
Dataset: VoxCeleb
Model: ResNet34

[ INFO : 2024-12-24 21:51:06,518 ] - | 149| 1000| 0.01728| 0.2| 1.4961| 71.647|
[ INFO : 2024-12-24 21:51:06,528 ] - | 149| 1000| 0.01728| 0.2| 1.4996| 71.495|
[ INFO : 2024-12-24 21:51:06,536 ] - | 149| 1000| 0.01728| 0.2| 1.4984| 71.522|
[ INFO : 2024-12-24 21:51:06,541 ] - | 149| 1000| 0.01728| 0.2| 1.5064| 71.708|
[ INFO : 2024-12-24 21:51:06,549 ] - | 149| 1000| 0.01728| 0.2| 1.4848| 71.882|
[ INFO : 2024-12-24 21:51:47,574 ] - | 149| 1066| 0.017247| 0.2| 1.5084| 71.395|
[ INFO : 2024-12-24 21:51:47,630 ] - | 149| 1066| 0.017247| 0.2| 1.4973| 71.537|
[ INFO : 2024-12-24 21:51:47,815 ] - | 149| 1066| 0.017247| 0.2| 1.4936| 71.602|
[ INFO : 2024-12-24 21:51:47,855 ] - | 149| 1066| 0.017247| 0.2| 1.4988| 71.583|
[ INFO : 2024-12-24 21:51:47,857 ] - | 149| 1066| 0.017247| 0.2| 1.4966| 71.579|
[ INFO : 2024-12-24 21:51:47,860 ] - | 149| 1066| 0.017247| 0.2| 1.4868| 71.852|
[ INFO : 2024-12-24 21:51:47,862 ] - | 149| 1066| 0.017247| 0.2| 1.5065| 71.697|
[ INFO : 2024-12-24 21:51:47,884 ] - | 149| 1066| 0.017247| 0.2| 1.4921| 71.735|
[ INFO : 2024-12-24 21:53:27,802 ] - | 150| 100| 0.017198| 0.2| 1.5211| 71.57|
[ INFO : 2024-12-24 21:53:27,803 ] - | 150| 100| 0.017198| 0.2| 1.4884| 71.703|
[ INFO : 2024-12-24 21:53:27,813 ] - | 150| 100| 0.017198| 0.2| 1.4869| 71.93|
[ INFO : 2024-12-24 21:53:27,826 ] - | 150| 100| 0.017198| 0.2| 1.4878| 71.688|
[ INFO : 2024-12-24 21:53:27,834 ] - | 150| 100| 0.017198| 0.2| 1.4782| 72.234|
[ INFO : 2024-12-24 21:53:27,846 ] - | 150| 100| 0.017198| 0.2| 1.4912| 71.219|
[ INFO : 2024-12-24 21:53:27,851 ] - | 150| 100| 0.017198| 0.2| 1.4765| 71.727|
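
The log rows carry no header. Assuming the six trailing fields are epoch, batch, learning rate, margin, loss, and accuracy in percent (the column names are not confirmed by the log itself), and assuming the repeated rows for one step come from different GPU ranks, a minimal sketch for averaging them per step could look like this. `parse_row` and `mean_by_step` are hypothetical helpers, not part of wespeaker.

```python
from collections import defaultdict


def parse_row(line):
    """Return (epoch, batch, lr, margin, loss, acc) from the last six '|' fields."""
    fields = [f.strip() for f in line.split("|") if f.strip()]
    epoch, batch, lr, margin, loss, acc = fields[-6:]
    return int(epoch), int(batch), float(lr), float(margin), float(loss), float(acc)


def mean_by_step(lines):
    """Average loss and accuracy over rows that share the same (epoch, batch)."""
    groups = defaultdict(list)
    for line in lines:
        epoch, batch, _lr, _margin, loss, acc = parse_row(line)
        groups[(epoch, batch)].append((loss, acc))
    return {
        step: (
            sum(loss for loss, _ in rows) / len(rows),
            sum(acc for _, acc in rows) / len(rows),
        )
        for step, rows in groups.items()
    }


# The seven rows logged for epoch 150, batch 100 (with or without the log prefix).
rows = [
    "[ INFO : 2024-12-24 21:53:27,802 ] - | 150| 100| 0.017198| 0.2| 1.5211| 71.57|",
    "150| 100| 0.017198| 0.2| 1.4884| 71.703|",
    "150| 100| 0.017198| 0.2| 1.4869| 71.93|",
    "150| 100| 0.017198| 0.2| 1.4878| 71.688|",
    "150| 100| 0.017198| 0.2| 1.4782| 72.234|",
    "150| 100| 0.017198| 0.2| 1.4912| 71.219|",
    "150| 100| 0.017198| 0.2| 1.4765| 71.727|",
]

for step, (loss, acc) in mean_by_step(rows).items():
    print(step, round(loss, 4), round(acc, 3))
```

Averaging across the rows of one step smooths out per-rank noise, which makes it easier to compare two epochs than reading individual lines.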

And this is my 'config.yaml':
data_type: raw
dataloader_args:
  batch_size: 128
  drop_last: true
  num_workers: 16
  pin_memory: false
  prefetch_factor: 8
dataset_args:
  aug_prob: 0.6
  fbank_args:
    dither: 1.0
    frame_length: 25
    frame_shift: 10
    num_mel_bins: 80
  filter: true
  filter_args:
    max_num_frames: 800
    min_num_frames: 100
  num_frms: 200
  resample_rate: 16000
  sample_num_per_epoch: 0
  shuffle: true
  shuffle_args:
    shuffle_size: 2500
  spec_aug: false
  spec_aug_args:
    max_f: 8
    max_t: 10
    num_f_mask: 1
    num_t_mask: 1
    prob: 0.6
  speed_perturb: true
enable_amp: false
exp_dir: RESNET-TSTP-emb256-fbank80-num_frms200-aug0.6-spTrue-saFalse-ArcMargin-SGD-epoch150_20241223
gpus:
- 0
- 1
- 2
- 3
- 4
- 5
- 6
- 7
log_batch_interval: 100
loss: CrossEntropyLoss
loss_args: {}
margin_scheduler: MarginScheduler
margin_update:
  epoch_iter: 1066
  final_margin: 0.2
  fix_start_epoch: 40
  increase_start_epoch: 20
  increase_type: exp
  initial_margin: 0.0
  update_margin: true
model: ResNet34
model_args:
  embed_dim: 256
  feat_dim: 80
  pooling_func: TSTP
  two_emb_layer: false
model_init: null
noise_data: data/musan/lmdb
num_avg: 10
num_epochs: 250
optimizer: SGD
optimizer_args:
  lr: 0.1
  momentum: 0.9
  nesterov: true
  weight_decay: 0.0001
projection_args:
  do_lm: false
  easy_margin: false
  embed_dim: 256
  num_class: 17982
  project_type: arc_margin
  scale: 32.0
reverb_data: data/rirs/lmdb
save_epoch_interval: 5
scheduler: ExponentialDecrease
scheduler_args:
  epoch_iter: 1066
  final_lr: 5.0e-05
  initial_lr: 0.1
  num_epochs: 250
  scale_ratio: 16.0
  warm_from_zero: true
  warm_up_epoch: 6
seed: 42
train_data: data/vox2_dev/raw.list
train_label: data/vox2_dev/utt2spk
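
One way to sanity-check the run is to compare the logged learning rate against the `scheduler_args` in this config. The sketch below assumes that, after warm-up, `ExponentialDecrease` decays geometrically from `initial_lr` to `final_lr` over `num_epochs * epoch_iter` iterations and multiplies the result by `scale_ratio`, with epochs counted from 1. That formula is inferred from the config and the logged values, not taken from the wespeaker source, and `exp_decrease_lr` is a hypothetical helper.

```python
import math


def exp_decrease_lr(epoch, batch, *, initial_lr=0.1, final_lr=5.0e-05,
                    num_epochs=250, epoch_iter=1066, scale_ratio=16.0):
    """Post-warm-up learning rate under the assumed exponential decay."""
    current_iter = (epoch - 1) * epoch_iter + batch  # assumes 1-indexed epochs
    max_iter = num_epochs * epoch_iter
    decay = math.exp(current_iter / max_iter * math.log(final_lr / initial_lr))
    return scale_ratio * initial_lr * decay


# (epoch, batch) pairs that appear in the training log.
for epoch, batch in [(149, 1000), (149, 1066), (150, 100)]:
    print(epoch, batch, f"{exp_decrease_lr(epoch, batch):.6f}")
```

Under these assumptions the computed values agree with the logged 0.01728, 0.017247, and 0.017198, which suggests the schedule is spread over `num_epochs: 250`, so epoch 150 is still well before the end of the decay.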

wsstriving · 2025-01-04T09:16:08Z

Since this is ArcMargin loss with margins, the current loss behavior is expected. If you switch to the standard softmax criterion, you can achieve significantly higher accuracy more easily.
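
To illustrate why a margin lowers training accuracy, here is a minimal sketch of an additive angular margin in the ArcFace style, using `scale: 32.0` and `final_margin: 0.2` from the config. It is a simplification (no `easy_margin` handling, not the wespeaker implementation), the cosine values are hypothetical, and it assumes the logged accuracy is computed on the margin-penalised logits.

```python
import math


def arc_margin_logits(cosines, target, scale=32.0, margin=0.2):
    """Scaled logits with the margin added to the target class angle only."""
    logits = []
    for idx, cos_theta in enumerate(cosines):
        if idx == target:
            theta = math.acos(max(-1.0, min(1.0, cos_theta)))
            cos_theta = math.cos(theta + margin)  # penalise the target class
        logits.append(scale * cos_theta)
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one sample."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[target]


# Hypothetical cosine similarities between one embedding and three class weights.
cosines = [0.60, 0.55, 0.10]
target = 0

plain = [32.0 * c for c in cosines]            # standard softmax logits
margined = arc_margin_logits(cosines, target)  # logits seen during training

print("plain prediction:   ", plain.index(max(plain)))
print("margined prediction:", margined.index(max(margined)))
print("plain loss:   ", round(cross_entropy(plain, target), 4))
print("margined loss:", round(cross_entropy(margined, target), 4))
```

In this toy case the sample is classified correctly without the margin, but the margin pushes the target logit below a competing class, so the same embedding counts as an error and the loss is higher. That is the sense in which training accuracy under a margin understates how separable the embeddings are.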

NathanJHLee · 2025-01-05T23:34:53Z

Thank you for your explanation. I have one more question. If I use ArcMargin for projection during training, is there a way to determine the optimal number of epochs during Stage 3? Or is the only option to check the EER by evaluating the saved epoch models?
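
For reference, comparing saved checkpoints by EER amounts to scoring the same trial list with each one and keeping the lowest value. A minimal sketch, assuming the scores and 0/1 target labels have already been produced for each checkpoint; `compute_eer` and `pick_best_epoch` are hypothetical helpers, and the toy scores are made up for illustration, not results from this run.

```python
def compute_eer(scores, labels):
    """Return (eer, threshold) where false-accept and false-reject rates are closest.

    labels: 1 for target (same-speaker) trials, 0 for non-target trials.
    A trial is accepted when its score is >= the threshold.
    """
    n_target = sum(1 for lab in labels if lab == 1)
    n_nontarget = len(labels) - n_target
    if n_target == 0 or n_nontarget == 0:
        raise ValueError("need at least one target and one non-target trial")

    best_gap, eer, best_thr = float("inf"), 1.0, None
    for thr in sorted(set(scores)):
        far = sum(1 for s, lab in zip(scores, labels) if lab == 0 and s >= thr) / n_nontarget
        frr = sum(1 for s, lab in zip(scores, labels) if lab == 1 and s < thr) / n_target
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer, best_thr = gap, (far + frr) / 2.0, thr
    return eer, best_thr


def pick_best_epoch(scores_by_epoch, labels):
    """Return (epoch, eer) for the checkpoint with the lowest EER."""
    eers = {ep: compute_eer(sc, labels)[0] for ep, sc in scores_by_epoch.items()}
    best = min(eers, key=eers.get)
    return best, eers[best]


# Toy trial list: three target and three non-target trials (made-up scores).
labels = [1, 1, 1, 0, 0, 0]
scores_by_epoch = {
    145: [0.9, 0.8, 0.6, 0.7, 0.4, 0.3],
    150: [0.9, 0.8, 0.7, 0.5, 0.4, 0.3],
}
print(pick_best_epoch(scores_by_epoch, labels))
```

The threshold sweep here is quadratic in the number of trials, which is fine for a sketch but slow on a full trial list; a sorted cumulative count gives the same result faster.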
