Voice Cloning through a two step styling process? #140

kaushal-gawri9899 · 2024-09-24T11:58:17Z

Hey, is it possible to support voice cloning through a two-step encoding process? Specifically, can we inject a speaker embedding before encoding, so that it is used at encoding time instead of depending solely on the style prompt? I'd like to control the styling in two steps: provide the required speaker embedding to the encoder for tone coloring/voice cloning, and handle the rest of the styling through the prompt (ignoring who the speaker is).
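For illustration only, here is a minimal NumPy sketch of one way the encoder-side idea could look: project the speaker embedding to the model width and prepend it as an extra position in front of the style-prompt hidden states, so that cross-attention can see both. This is not code from the repository; the function name, the `proj` matrix, and all shapes are assumptions, and in a real model the projection would be a trained layer.

```python
import numpy as np

def prepend_speaker_token(style_hidden, style_mask, speaker_emb, proj):
    """Prepend one projected speaker vector to the style-prompt states.

    style_hidden: (batch, seq_len, d_model) encoder output for the style prompt
    style_mask:   (batch, seq_len) 1 for real tokens, 0 for padding
    speaker_emb:  (batch, d_speaker) embedding of the reference voice
    proj:         (d_speaker, d_model) hypothetical learned projection
    """
    # Map the speaker embedding into the encoder's hidden size.
    speaker_token = (speaker_emb @ proj)[:, None, :]  # (batch, 1, d_model)
    hidden = np.concatenate([speaker_token, style_hidden], axis=1)
    # The new position is always a real (unmasked) token.
    ones = np.ones((style_mask.shape[0], 1), dtype=style_mask.dtype)
    mask = np.concatenate([ones, style_mask], axis=1)
    return hidden, mask

# Toy shapes, chosen arbitrarily for the example.
rng = np.random.default_rng(0)
style_hidden = rng.normal(size=(2, 5, 8))
style_mask = np.ones((2, 5), dtype=np.int64)
speaker_emb = rng.normal(size=(2, 4))
proj = rng.normal(size=(4, 8))

hidden, mask = prepend_speaker_token(style_hidden, style_mask, speaker_emb, proj)
print(hidden.shape, mask.shape)
```

With this layout the style prompt itself stays untouched, so the text description can keep handling everything except speaker identity.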

apresence · 2024-09-26T07:23:53Z

If I understand your request correctly, I'm working on effectively the same thing. Your method looks much more involved, so you might get better results with it. I'm cleaning up the code and will submit a PR once it's ready (I'm a full-time programmer with a day job, which means ~60 hrs/wk, so I'm finding the time when I can). I've already submitted a PR to prep some changes for it; see #139.

kaushal-gawri9899 · 2024-09-26T13:40:27Z

I guess it's similar, but based on the PR, I'm under the impression that you're trying to propagate the speaker representations through "input_values" in the encoder, right? I'm using a different approach: I train the model to take the speaker's reference voice into account in the decoder (causal LM), which is why I tweaked the architecture as described above.
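To make the contrast with the encoder-side route concrete, below is a rough sketch of one possible decoder-side conditioning scheme: a projected speaker vector is added to every decoder input embedding before the causal LM layers. The thread does not say how the decoder is actually modified, so this is an assumption throughout; `add_speaker_to_decoder_inputs` and `speaker_proj` are hypothetical names, and a real model would learn the projection during training.

```python
import numpy as np

def add_speaker_to_decoder_inputs(decoder_embeds, speaker_emb, speaker_proj):
    """Add a projected speaker vector to each decoder time step.

    decoder_embeds: (batch, steps, d_model) embeddings of the audio tokens
    speaker_emb:    (batch, d_speaker) embedding of the reference voice
    speaker_proj:   (d_speaker, d_model) hypothetical learned projection
    """
    # One bias vector per example, broadcast across all time steps.
    speaker_bias = (speaker_emb @ speaker_proj)[:, None, :]  # (batch, 1, d_model)
    return decoder_embeds + speaker_bias

# Toy shapes, chosen arbitrarily for the example.
rng = np.random.default_rng(1)
decoder_embeds = rng.normal(size=(2, 7, 8))
ref_speaker_emb = rng.normal(size=(2, 4))
speaker_proj = rng.normal(size=(4, 8))

conditioned = add_speaker_to_decoder_inputs(decoder_embeds, ref_speaker_emb, speaker_proj)
print(conditioned.shape)
```

Unlike the encoder-side variant, the sequence length does not change here, so attention masks and the style-prompt path stay as they are.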
