-
If I were going to attempt to improve audio quality, I would focus on improving the vocoder and the diffusion model, in that order. I would be curious to know how well UniVNet would perform when upsampling to 44 kHz from the current mel spectrogram. I bet it would work pretty well.
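To make the upsampling idea a bit more concrete, here is a rough back-of-the-envelope sketch of what a vocoder would have to do per mel frame. The 24 kHz source rate and hop length of 256 are assumptions for illustration only (they are not stated in this thread), and the function names are hypothetical.

```python
from fractions import Fraction

# Assumed values for illustration; not confirmed anywhere in this thread.
MEL_SR = 24_000   # sample rate the mel spectrogram was computed at (assumption)
HOP = 256         # hop length in samples between mel frames (assumption)


def samples_per_frame(mel_sr: int, hop_length: int, target_sr: int) -> Fraction:
    """Output samples the vocoder must generate for each mel frame."""
    return Fraction(hop_length * target_sr, mel_sr)


def mel_bandwidth_hz(mel_sr: int) -> float:
    """Nyquist limit of the source audio: the mel holds no content above this."""
    return mel_sr / 2


if __name__ == "__main__":
    print(f"mel content stops at {mel_bandwidth_hz(MEL_SR):.0f} Hz")
    for target_sr in (24_000, 44_100, 48_000):
        factor = samples_per_frame(MEL_SR, HOP, target_sr)
        kind = "integer" if factor.denominator == 1 else "non-integer"
        print(f"{target_sr} Hz output: {float(factor)} samples per frame ({kind})")
```

Under these assumptions, the mel spectrogram carries nothing above 12 kHz, so a 44 kHz vocoder would have to generate the upper band itself rather than recover it. The per-frame factor is also fractional for 44.1 kHz but a whole number for 48 kHz, which may matter for vocoders built from integer-stride upsampling layers.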
-
Thank you for sharing this mind-blowing TTS!
I would love to know what it would take to increase the output bitrate and get higher-quality output.
I've read that this model was trained mainly on audiobooks, which, as we probably know, tend not to have very good audio quality (generally lower bitrate). Is that the bottleneck?