2023-10-19 10:10:09,510 INFO [infer.py:224] synthesize text: Selamat pagi
2023-10-19 10:10:09,513 WARNING [words_mismatch.py:88] words count mismatch on 500.0% of the lines (5/1)
2023-10-19 10:10:09,516 WARNING [words_mismatch.py:88] words count mismatch on 400.0% of the lines (4/1)
Traceback (most recent call last):
  File "bin/infer.py", line 282, in <module>
    main()
  File "/media/de3fd1ee-a8c4-4153-9cf5-d642327ff6d0/TTS/valle/valle_env/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "bin/infer.py", line 251, in main
    encoded_frames = model.inference(
  File "/media/de3fd1ee-a8c4-4153-9cf5-d642327ff6d0/TTS/valle/vall-e/valle/models/valle.py", line 1050, in inference
    raise SyntaxError(
SyntaxError: well trained model shouldn't reach here.
This means the AR model could not predict the EOS token, which implies that it was not trained well. Do you know if this happens with other examples? By the way, does the loss curve of the AR training look OK?
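To make the failure mode concrete, here is a minimal sketch of an autoregressive decoding loop with an EOS check and a step budget. It is not the code from `valle.py`; `EOS_ID`, `ar_decode`, `good_model`, and `stuck_model` are hypothetical names used only for illustration, and the sketch raises `RuntimeError` where the traceback above shows a `SyntaxError`. The point is only that a model which never emits EOS exhausts the budget and falls through to the error.

```python
from typing import Callable, List

EOS_ID = 0  # hypothetical end-of-sequence token id (assumption, not from the repo)


def ar_decode(
    next_token: Callable[[List[int]], int],
    prompt: List[int],
    max_new_tokens: int,
) -> List[int]:
    """Generate tokens one at a time until EOS, or fail once the budget is spent."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tok = next_token(tokens)
        if tok == EOS_ID:
            # Normal exit: return only the newly generated tokens.
            return tokens[len(prompt):]
        tokens.append(tok)
    # Reached only if EOS was never predicted within the budget.
    raise RuntimeError("EOS was never predicted within the step budget")


def good_model(tokens: List[int]) -> int:
    """Toy predictor that emits EOS once the sequence holds five tokens."""
    return EOS_ID if len(tokens) >= 5 else len(tokens) + 1


def stuck_model(tokens: List[int]) -> int:
    """Toy predictor that never emits EOS, mimicking an undertrained AR model."""
    return 7


if __name__ == "__main__":
    print(ar_decode(good_model, [1, 2], max_new_tokens=10))
    try:
        ar_decode(stuck_model, [1, 2], max_new_tokens=10)
    except RuntimeError as err:
        print("decode failed:", err)
```

If the real model behaves like `stuck_model` on some inputs, the AR loss curve and the predictions near the end of training utterances are probably the first things to check.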
I have the same problem. When I tested with a short audio prompt (3 or 4 seconds), the output was still good. In other cases, however, the model either did not work or gave a bad result. Could you help me fix it?
I get an error like this:
How can I solve it? I have done the AR and NAR training following the information here: https://github.com/lifeiteng/vall-e#:~:text=LibriTTS%20demo%20Trained%20on%20one%20GPU%20with%2024G%20memory