Stop Audio cut off at the end/is there a way to add time buffer? #416

Open
jtfl28 opened this issue Apr 27, 2023 · 6 comments

Comments


jtfl28 commented Apr 27, 2023

Almost every clip I produce ends abruptly, about a second early. Most of the time the last word is not completed, so simply adding blank space between the sentences won't work.

Is there any way to avoid this? Thanks in advance for the help!

Owner

neonbjb commented Apr 27, 2023

This is at least partially caused by the conditioning voice. For some reason some voices exhibit this more than others. I would try using different conditioning clips or fiddling with the one you have.

Contributor

n8bot commented Apr 28, 2023

I have found that ensuring there is a period at the very end of the prompt can help with this. Interesting to know that it is voice-dependent. Good tip.
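The trailing-period tip is easy to automate before a prompt is sent for synthesis. The helper below is a minimal sketch; the function name `ensure_terminal_period` is hypothetical and not part of tortoise, and it only handles plain `.`, `!` and `?` endings.

```python
def ensure_terminal_period(prompt: str) -> str:
    """Return the prompt with closing punctuation, adding a period if missing."""
    stripped = prompt.rstrip()
    if not stripped:
        return stripped
    # Leave existing sentence-ending punctuation alone.
    if stripped[-1] in ".!?":
        return stripped
    return stripped + "."


print(ensure_terminal_period("The quick brown fox"))
```

Whether this helps will still depend on the conditioning voice, as noted above.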

Contributor

n8bot commented Apr 30, 2023

I have an open pull request to add more high quality voices to tortoise, with many audio clips in each voice. They can also be re-arranged to evoke specific emotions.

They seem good at not cutting off the end. I suspect that some voices cut off because their clips do not have a soft transition at the end. A falloff that is barely audible to us can look like a sudden and dramatic drop in the audio signal to the computer. Using samples from professional voice-over clips seems to alleviate the issue.

So, when making new voices, it might be important to add a fade in and out to each and every audio clip, even if the fade lasts only a few ms.
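To make the fade suggestion concrete, here is a minimal sketch of a linear fade-in and fade-out applied to mono samples held in a plain Python list. The function name `apply_fades` and the 10 ms default are assumptions for illustration, not anything taken from tortoise or from the pull request.

```python
def apply_fades(samples, sample_rate, fade_ms=10.0):
    """Return a copy of mono samples with a linear fade-in and fade-out."""
    n_fade = int(sample_rate * fade_ms / 1000)
    # Never let the two fades overlap on very short clips.
    n_fade = min(n_fade, len(samples) // 2)
    faded = list(samples)
    for i in range(n_fade):
        gain = i / n_fade        # ramps from 0.0 toward 1.0
        faded[i] *= gain         # fade in from the start
        faded[-1 - i] *= gain    # fade out, mirrored from the end
    return faded


clip = [1.0] * 1000
faded = apply_fades(clip, sample_rate=1000, fade_ms=10)
print(faded[0], faded[5], faded[500], faded[-1])
```

If the clips are already loaded with pydub, its `AudioSegment.fade_in()` and `AudioSegment.fade_out()` methods (durations in milliseconds) should do the same job without handling samples by hand.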

#425


ziyaad30 commented May 4, 2023

I tried that a while ago and it did not work, so I had to use the code below:

from pydub import AudioSegment

# 500 ms of trailing silence (duration is in milliseconds, so this is half a second)
silence = AudioSegment.silent(duration=500)
sound = AudioSegment.from_wav(file)
final_sound = sound + silence
final_sound.export(f'outputs/silenced_{fname}.wav', format="wav")

This inserts the full audio, so it seems to me that the program itself is cutting the audio off.
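For readers without pydub, the same padding step can be sketched with only the standard-library `wave` module. This is an illustration under assumptions: the function name `append_silence` is hypothetical, the input is assumed to be signed PCM (16-bit or wider), and the thread does not say where in the pipeline the padding was applied.

```python
import wave


def append_silence(src_path, dst_path, silence_ms=500):
    """Copy a PCM WAV file and append trailing silence."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())
    n_silent = int(params.framerate * silence_ms / 1000)
    # One frame holds one sample per channel. Zero bytes are silence for
    # signed PCM; 8-bit WAV is unsigned and would need 0x80 instead.
    silence = b"\x00" * (n_silent * params.nchannels * params.sampwidth)
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(frames + silence)
```

A call such as `append_silence("clip.wav", "clip_padded.wav")` would write a copy with half a second of silence at the end; both file names here are placeholders.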

@spottenn
Contributor

I'm considering fixing this bug. This may be a bug with saving the sound to a WAV file. Where did you put your code in order to get it to work and not cut off the end? How do I replicate your results?

@worldwidewebcap

Just write something like "End" after the last word of each sentence in the prompt. That way the placeholder word (like "End") gets cut short instead of your intended last word. This makes it easy to edit and cut out the placeholder word later.
This is my workaround, anyway. Works for me.
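The placeholder workaround can be applied to a prompt automatically. The sketch below is one possible reading of it: the function name `add_placeholder`, the regex-based sentence split, and the choice to write the placeholder as its own short sentence ("End.") are all assumptions, not something the thread specifies.

```python
import re


def add_placeholder(text, placeholder="End"):
    """Append a throwaway word after each sentence in the prompt."""
    # Split after ., ! or ? followed by whitespace. This is a simple
    # heuristic and will mis-split abbreviations such as "Dr. Smith".
    parts = re.split(r"(?<=[.!?])\s+", text)
    sentences = [s.strip() for s in parts if s.strip()]
    return " ".join(f"{s} {placeholder}." for s in sentences)


print(add_placeholder("Hello there. How are you?"))
# Hello there. End. How are you? End.
```

The placeholder audio still has to be trimmed from each generated clip afterwards, which this sketch does not attempt.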
