Strange sound at the end of voice #129

Open
Andiami-Yusaka opened this issue Oct 23, 2023 · 2 comments

Comments

@Andiami-Yusaka

Hi there,
I converted the text below using one sample voice and the settings below. The quality of the final voice is great; however, there is a strange noise at the end of the sentence. It only appears in some sentences. I would appreciate help setting the proper configuration to resolve this issue.

Text:
Additionally, research findings on spatial puzzles were updated and further research was conducted. Documentation for the final end goal of the interactive shop interface was also started.

Setting
text_split
candidates=1
output_dir=results
seed=50
quiet=no
vocoder=BigVGAN_Base
models_dir=
disable_redaction=no
batch_size=
diff_checkpoint=
ar_checkpoint=
speed=original_tortoise
multi_output_regenerate
ooutput=result
device=
low_vram=no
no_cache=no
clvp_checkpoint=
preset=standard
tuning=cond_free
gvoicefixer=yes

Voice
https://github.com/152334H/tortoise-tts-fast/assets/129772750/782853af-664d-47ef-8b1c-64f2f0b6a684

@78Alpha

78Alpha commented Oct 29, 2023

Seems pretty standard. After a few dozen model trainings, stuff like that is always present. Sometimes it lasts a full second.

It could be that whatever dataset is being used is being split mid-sentence, leaving a pop.

@JeffPlsFix

JeffPlsFix commented Dec 4, 2023

I ran into the same issue, and I get these weird noises in roughly 1 in 10 clips.

From some quick testing, I believe it is voicefixer that creates those artifacts.
Disabling voicefixer seems to eliminate the weird sound, but of course the overall quality becomes worse.

Instead, I just trimmed 0.1 seconds off the end of each wav file, which eliminates the strange noise without cutting any actual speech, since there are usually a few milliseconds of leftover audio at the end of each clip.

I used scipy for this:
https://gist.github.com/JeffPlsFix/f4c54f68e8a9b3d4c8093dccd7ad0664
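
For anyone who does not want to open the gist, here is a rough sketch of that kind of trim. It is not the gist's actual contents: the names `trim_tail` and `trim_wav_file` are hypothetical, and it assumes the clips are WAV files that `scipy.io.wavfile` can read.

```python
def trim_tail(samples, sample_rate, trim_seconds=0.1):
    """Return `samples` with the last `trim_seconds` of audio removed.

    Works on any sliceable sequence of frames, including the NumPy arrays
    returned by scipy.io.wavfile.read (mono or multi-channel).
    """
    n_trim = int(round(sample_rate * trim_seconds))
    if n_trim <= 0:
        return samples
    if n_trim >= len(samples):
        raise ValueError("clip is shorter than the requested trim")
    return samples[:len(samples) - n_trim]


def trim_wav_file(in_path, out_path, trim_seconds=0.1):
    """Read a WAV file, drop its last `trim_seconds`, and write the result."""
    # Imported here so trim_tail itself has no third-party dependency.
    from scipy.io import wavfile

    sample_rate, samples = wavfile.read(in_path)
    wavfile.write(out_path, sample_rate,
                  trim_tail(samples, sample_rate, trim_seconds))
```

Something like `trim_wav_file("results/clip.wav", "results/clip_trimmed.wav")` would then cut the last 0.1 seconds; the paths are only placeholders. The 0.1 second default is the value that worked for me, so it may need adjusting if a clip ends with speech right up to the last sample.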
