Usefulness of conditioning latents #84

Answered by neonbjb
planetrocke asked this question in Q&A
This time is predominantly spent loading the model files (~4GB of them) from disk. If you use tortoise programmatically, this should only happen once, when you first instantiate TextToSpeech. After that, each call to tts() skips the loading step.
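A minimal sketch of the load-once pattern described above. It assumes the `tortoise.api.TextToSpeech` class and its `tts(text, voice_samples=...)` method; `get_tts`, `timed`, and `synthesize_many` are hypothetical helper names introduced only for illustration, and the exact keyword arguments may differ between tortoise versions.

```python
import time
from functools import lru_cache


@lru_cache(maxsize=1)
def get_tts():
    """Build the TextToSpeech object once and hand back the same one afterwards."""
    # Imported lazily so this module can be loaded without tortoise installed.
    from tortoise.api import TextToSpeech
    return TextToSpeech()  # the slow step: reads the model files from disk


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def synthesize_many(texts, voice_samples):
    """Hypothetical helper: one model load, many tts() calls."""
    tts = get_tts()  # only the first call pays the loading cost
    return [tts.tts(text, voice_samples=voice_samples) for text in texts]
```

Wrapping `get_tts` in `timed` twice is one way to check that only the first call is slow. If the second call is also slow, the model is probably being reloaded somewhere, for example by starting a new process for each clip.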

Generating the conditioning latents is relatively fast. It should only take a few milliseconds once everything is in memory. If that's not the case, there's a bug here.
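If you still want to avoid recomputing the latents for a voice you use repeatedly, something like the following might work. It is a sketch under assumptions: it relies on `TextToSpeech.get_conditioning_latents(voice_samples)` and on `tts()` accepting a `conditioning_latents=` keyword, and `latents_for` and `speak` are hypothetical names, not part of tortoise.

```python
def latents_for(tts, voice_name, voice_samples, cache):
    """Return conditioning latents for a voice, computing them at most once."""
    if voice_name not in cache:
        # Assumed API: get_conditioning_latents(voice_samples) on TextToSpeech.
        cache[voice_name] = tts.get_conditioning_latents(voice_samples)
    return cache[voice_name]


def speak(tts, text, voice_name, voice_samples, cache):
    """Generate speech using cached latents instead of the raw voice samples."""
    latents = latents_for(tts, voice_name, voice_samples, cache)
    return tts.tts(text, conditioning_latents=latents)
```

Here `cache` is a plain dict owned by the caller, so it lives exactly as long as the `TextToSpeech` instance it is used with. Given that latent generation should already be fast once the models are in memory, the saving is likely to be small next to the model load itself.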

Answer selected by planetrocke