Extreme background noises on almost all generations and mixed speakers #99

Open
AkumaNoTsubasa opened this issue Mar 11, 2024 · 0 comments

Comments

@AkumaNoTsubasa

Hello,

I am really glad to have finally found a UI for Suno Bark. It makes generating things on the fly much easier; my Python knowledge is so bare-bones that I am happy when I get a single line of text spoken. But I have some major issues:

  1. About 80% of the text I generate has massive background noise or is nothing but noise.
  2. Repeatedly, regardless of whether I use plain input or SSML with only a single speaker defined, the generation ends up switching between 2-5 voices.
  3. The chosen model often respects only the language of the premade Suno voices, not the actual chosen speaker. I often get the female voice even though I chose a male one.
  4. The length of the generation is random. It often generates 3-8 seconds of silence at the beginning and sometimes also 3-4 seconds in the middle of a line of text. It seems to try to keep the sound files at 10-15 seconds in length.

I am using an AMD Ryzen 7 5800X 8-Core Processor @ 3.80 GHz and an RTX 3070 Ti.
