Training Model #4024

blessziamah · 2024-10-13T17:44:31Z

Is it possible to train a new model with a dataset that contains only voice recordings, with no transcripts?

samarasimhapeyala · 2024-10-28T09:11:25Z

Training a text-to-speech model typically requires both audio recordings and their transcripts, because the model learns the mapping from written text to spoken audio. With recordings alone, the model has no text to condition on, so it cannot learn what to synthesize. You can generate the missing transcripts with an ASR model such as Whisper, ideally spot-checking them for recognition errors before training.
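As a rough sketch of that transcription step, something like the following could work. It assumes the `openai-whisper` package is installed and uses its `load_model` and `transcribe` calls. The helper names (`build_metadata`, `whisper_transcriber`), the `wavs/` folder, and the pipe-separated `file_id|transcript` layout are my own assumptions for illustration, loosely modelled on LJSpeech-style metadata; adjust the output to whatever your training recipe expects.

```python
from pathlib import Path
from typing import Callable, Iterable


def build_metadata(
    wav_paths: Iterable[Path],
    transcribe: Callable[[Path], str],
    out_file: Path,
) -> int:
    """Write one `file_id|transcript` line per clip; return the number of lines written."""
    written = 0
    with out_file.open("w", encoding="utf-8", newline="") as f:
        for wav in sorted(wav_paths):
            # "|" is the column separator, so remove it from the text,
            # then collapse runs of whitespace into single spaces.
            text = " ".join(transcribe(wav).replace("|", " ").split())
            if not text:
                continue  # skip clips where the ASR model returned nothing
            f.write(f"{wav.stem}|{text}\n")
            written += 1
    return written


def whisper_transcriber(model_name: str = "base") -> Callable[[Path], str]:
    """Return a function that transcribes one audio file with Whisper."""
    import whisper  # imported lazily; requires `pip install openai-whisper`

    model = whisper.load_model(model_name)
    return lambda wav: model.transcribe(str(wav))["text"]


if __name__ == "__main__":
    # Hypothetical layout: all clips live in ./wavs next to this script.
    clips = Path("wavs").glob("*.wav")
    count = build_metadata(clips, whisper_transcriber("base"), Path("metadata.csv"))
    print(f"wrote {count} transcripts")
```

Passing the transcriber in as a function keeps the file-writing logic independent of Whisper, so you can swap in a different ASR model or correct transcripts by hand without touching the rest.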

Please mark this as answered if it resolves your question.