Speech length #89

Answered by neonbjb
kadattack asked this question in Q&A
Jun 3, 2022 · 2 comments · 2 replies

Hey, thanks!

Ack on the long sentence problem. It is mostly caused by my training setup, which could not accommodate long speech segments, so the models get weaker on long inputs. Still, it should do a pretty good job up to 15 seconds, which is more than enough time to speak even fairly long sentences. For extra-long sentences, I recommend splitting the text at commas, semicolons, or dashes.
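A rough sketch of that splitting step, in case it helps. This is not part of the repo: `split_long_sentence` and `max_chars` are made-up names, and the character budget is only an assumed stand-in for the 15 second limit (how many characters fit in 15 seconds depends on the voice and speaking rate, so tune it yourself). It cuts only at commas, semicolons, or dashes, and keeps the original text of each piece intact.

```python
import re

# Candidate break points: right after a comma, semicolon, em/en dash,
# or a hyphen that is surrounded by whitespace (e.g. "word - word").
_BREAK = re.compile(r"[,;\u2014\u2013]|\s-{1,2}(?=\s)")


def split_long_sentence(text, max_chars=200):
    """Greedily split text at punctuation so chunks stay under max_chars.

    max_chars is a hypothetical budget, not a value from the project.
    A single clause longer than max_chars is left as one oversized chunk.
    """
    text = text.strip()
    chunks = []
    start = 0          # where the current chunk begins
    last_break = None  # most recent break point seen inside the current chunk

    for match in _BREAK.finditer(text):
        end = match.end()
        # If extending to this break would overflow, cut at the previous one.
        if end - start > max_chars and last_break is not None and last_break > start:
            chunks.append(text[start:last_break].strip())
            start = last_break
        last_break = end

    # The remainder may still overflow; cut once more at the last break if so.
    if len(text) - start > max_chars and last_break is not None and last_break > start:
        chunks.append(text[start:last_break].strip())
        start = last_break

    tail = text[start:].strip()
    if tail:
        chunks.append(tail)
    return chunks


if __name__ == "__main__":
    for piece in split_long_sentence("aaaa, bbbb, cccc; dddd", max_chars=12):
        print(piece)
```

Each returned piece would then be synthesized on its own and the resulting audio clips joined afterwards.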

You can precompile voice latents using get_conditioning_latents.py: https://github.com/neonbjb/tortoise-tts#generating-conditioning-latents-from-voices. Expect to shave off about half a second by doing this.

Answer selected by kadattack
3 participants