Better audio quality with larger resnet #65

Open
cschaefer26 opened this issue May 28, 2021 · 0 comments

cschaefer26 commented May 28, 2021

Hi, great repo!

I found that the audio quality improves considerably with a slightly larger ResNet, as suggested in https://arxiv.org/pdf/2005.05106.pdf. The shaky, metallic artefacts are reduced a lot.

Here is a comparison of your pretrained LJSpeech model with a model I am still training (for TTS I used https://github.com/as-ideas/ForwardTacotron):

Original (6400 epochs):
https://drive.google.com/file/d/1LOIB9B7LDX9g-kVu_p1anGJgJ5vjE27s/view?usp=sharing

Larger ResNet (2000 epochs):
https://drive.google.com/file/d/19_d2SQU1xZi-o90MJ8NcKhIS6AFwliH-/view?usp=sharing

If you are interested, I could open a PR that makes the layers more flexible.
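
To give a rough idea of what a configurable layer count changes, here is a minimal sketch that computes the receptive field of one residual stack as a function of its depth. It is not taken from this repo's code: the function name `residual_stack_receptive_field` is hypothetical, and the kernel size of 3 and the dilation pattern of 3^i are assumptions based on the usual MelGAN-style residual stack.

```python
def residual_stack_receptive_field(num_layers, kernel_size=3, dilation_base=3):
    """Receptive field (in input frames) of a stack of dilated 1-D convolutions.

    Assumes layer i uses dilation dilation_base ** i and that any
    1x1 convolutions in the block do not widen the receptive field.
    """
    receptive_field = 1
    for layer in range(num_layers):
        dilation = dilation_base ** layer
        # Each dilated conv adds (kernel_size - 1) * dilation frames of context.
        receptive_field += (kernel_size - 1) * dilation
    return receptive_field


# Compare a 3-layer stack with a 4-layer stack (assumed depths).
for depth in (3, 4):
    print(depth, residual_stack_receptive_field(depth))
```

Under these assumptions, going from 3 to 4 layers per stack triples the context each stack sees (27 vs. 81 frames), which is one plausible reason a slightly deeper ResNet could sound smoother.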
