Training instructions from README.md not working for me #104

Open · davidmartinrius opened this issue Apr 21, 2023 · 4 comments

davidmartinrius commented Apr 21, 2023

Hello,

I am working on Ubuntu 22 with an NVIDIA RTX 3080 and 64 GB of RAM.

I followed the steps of the DEMO in the README.md to train a model on LibriTTS.

[screenshot attached]

The result after inference is wrong: it sounds like weird noise. I attached the wav inside a zip because GitHub does not allow uploading a wav.

0.zip

I ran the inference as in the instructions:

python3 bin/infer.py --output-dir infer/demos \
    --model-name valle --norm-first true --add-prenet false \
    --share-embedding true \
    --text-prompts "KNOT one point one five miles per hour." \
    --audio-prompts ./prompts/8463_294825_000043_000000.wav \
    --text "To get up and running quickly just follow the steps below." \
    --checkpoint=${exp_dir}/best-valid-loss.pt

Please, can you help me understand what I am doing wrong?

Ask me for any information you need for the analysis and I will provide it.

When training, I had to lower the --max-duration parameter to prevent an out-of-memory error:

In the AR model I changed --max-duration to 20
In the NAR model I changed --max-duration to 15

In both cases I had to remove "--valid-interval 20000" because that parameter is not recognized by bin/trainer.py.

Thank you,

David Martin Rius

lifeiteng (Owner) commented Apr 22, 2023

@davidmartinrius The problem is setting --max-duration to 20, which means the batch size ends up in the range [1, 6].

Try to train the model on a 3090, 4090, or A100.
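
As a rough illustration of where [1, 6] comes from (a sketch only, not the repo's actual sampler code; I am assuming --max-duration caps the total seconds of audio packed into one batch, as lhotse-style dynamic samplers do):

    # Illustrative only: if --max-duration caps the total seconds of audio
    # per batch, the batch size is simply how many utterances fit under it.
    def approx_batch_size(max_duration_s: float, utt_duration_s: float) -> int:
        return max(1, int(max_duration_s // utt_duration_s))

    # LibriTTS utterances run very roughly 3-20 s, so with the cap at 20 s:
    for utt_len in (3.0, 5.0, 10.0, 20.0):
        print(utt_len, approx_batch_size(20.0, utt_len))
    # -> 6, 4, 2, 1: hence a batch size somewhere in [1, 6]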

davidmartinrius (Author) commented Apr 22, 2023

@lifeiteng thanks for your response, but it is not clear to me. Maybe the batch size is lower because of the VRAM, and that means training needs more iterations, but it should not affect the quality of the result. I'm sorry, but your response is not useful to me. It should be possible to train this on almost any NVIDIA RTX 3000-series or newer GPU...

Please, if you really think that the max duration is the problem, can you explain how to adapt it to a 10 GB GPU?

lifeiteng (Owner) commented

@davidmartinrius
A small batch_size will not converge to a good local optimum. It's common knowledge in deep learning.

davidmartinrius (Author) commented Apr 23, 2023

I agree with you on this point. I understand that when the batch size is too small, the gradients computed from a batch may not be representative of the overall structure of the dataset, leading to unstable and slow convergence during training.

That said, do you think it is possible to make it work by adjusting gradient accumulation (see the sketch below), the learning rate, or batch normalization, or even by adding more layers? Actually, I don't know the whole project; maybe you could evaluate it.
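
To make concrete what I mean by gradient accumulation, here is a minimal self-contained sketch (the model, data, and loop here are dummies I made up for illustration, not anything from bin/trainer.py):

    import torch
    from torch import nn

    model = nn.Linear(16, 1)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
    # Tiny dummy micro-batches standing in for what fits in 10 GB of VRAM.
    data = [(torch.randn(4, 16), torch.randn(4, 1)) for _ in range(32)]

    ACCUM_STEPS = 8  # 8 micro-batches of 4 ~= one effective batch of 32

    optimizer.zero_grad()
    for step, (x, y) in enumerate(data):
        loss = nn.functional.mse_loss(model(x), y)
        (loss / ACCUM_STEPS).backward()  # scale so accumulated grads average
        if (step + 1) % ACCUM_STEPS == 0:
            optimizer.step()             # one update per ACCUM_STEPS batches
            optimizer.zero_grad()

The accumulated gradients then approximate those of a larger batch, at the cost of proportionally more steps per epoch.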

If there is a way to optimize it, I would like to try it. I know it means more training hours and more development.

Thank you!

David Martin Rius
