
Feature Request: MaryTTS Compatibility #57

Open
ther3zz opened this issue Feb 13, 2024 · 3 comments


ther3zz commented Feb 13, 2024

Hello,

Would it be possible to add MaryTTS compatibility to this project (similar to what coqui-tts has)?

The specific intent here is to provide compatibility with Home Assistant.



ther3zz commented Jul 4, 2024

@daswer123 Could you please estimate the complexity of adding this endpoint to your product, and point out where it should go? I can try to implement it with DeepSeek Coder in PyCharm.

Regarding the last endpoint in this API: it is the only one that I and the Home Assistant community need. Home Assistant is a smart home product with a handy voice assistant; it requires STT, an LLM, function calling, and TTS, preferably behind an OpenAI-compatible API. STT: whisper.cpp. LLM: llama.cpp. TTS: ??? (alltalk_tts has a lot of overhead, LocalAI is buggy, Silero is deprecated).

/process?INPUT_TEXT=..text..&INPUT_TYPE=TEXT&LOCALE=[locale]&VOICE=[name]&OUTPUT_TYPE=AUDIO&AUDIO=WAVE_FILE - processes the text and returns a WAV file. We can probably ignore INPUT_TYPE, OUTPUT_TYPE and AUDIO, as I've never seen any program use a different setting.
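A MaryTTS-compatible endpoint along those lines could be a thin shim in front of the existing synthesis code. A minimal sketch, assuming the server exposes a FastAPI app (as xtts-api-server does); `synthesize_wav` here is a hypothetical stand-in for the server's real synthesis call:

```python
# Hypothetical MaryTTS-compatible shim for an XTTS-style server.
# `synthesize_wav` is a placeholder for whatever function the server
# already uses to render text to WAV bytes.
from fastapi import FastAPI, Query, Response

app = FastAPI()


def synthesize_wav(text: str, locale: str, voice: str) -> bytes:
    """Placeholder: call into the existing XTTS synthesis code here."""
    raise NotImplementedError


@app.get("/process")
def process(
    INPUT_TEXT: str = Query(...),
    LOCALE: str = Query("en_US"),
    VOICE: str = Query("default"),
    # MaryTTS clients always send these three; accept and ignore them.
    INPUT_TYPE: str = Query("TEXT"),
    OUTPUT_TYPE: str = Query("AUDIO"),
    AUDIO: str = Query("WAVE_FILE"),
) -> Response:
    wav = synthesize_wav(INPUT_TEXT, LOCALE, VOICE)
    return Response(content=wav, media_type="audio/wav")
```

The uppercase parameter names deliberately match the query keys in the URL above, so a MaryTTS client would only need the shim's host and port.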

@neowisard So there's a different project (openedai-speech) which works well with this Home Assistant HACS integration: the openai_tts fork.

If you want to be able to type in your own model/voice values, take a look at this openai_tts PR.
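For context, the OpenAI-compatible route these integrations speak is POST /v1/audio/speech. A minimal sketch of such a request; the host, port, model and voice names are example values, not anything this thread confirms:

```python
# Sketch of a request against an OpenAI-compatible /v1/audio/speech
# endpoint (the API shape openedai-speech and openai_tts rely on).
# Host/port and the model/voice names are example values only.
import requests

resp = requests.post(
    "http://localhost:8000/v1/audio/speech",
    json={
        "model": "tts-1",                 # model name exposed by the server
        "input": "The garage door is open.",
        "voice": "alloy",                 # voice name exposed by the server
    },
    timeout=60,
)
resp.raise_for_status()
with open("speech.mp3", "wb") as f:       # OpenAI's API returns MP3 by default
    f.write(resp.content)
```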


neowisard commented Jul 4, 2024

Thanks!
Just to be clear, I have tested both APIs, and despite the fact that they use almost identical engines and models, xtts-api-server with DeepSpeed (12-13 s) on my Tesla P40 is noticeably faster than openedai-speech with DeepSpeed enabled (18-21 s).
[Screenshots from 2024-07-05: benchmark timings for both servers]

Both also have some memory leak (on vGPU).
I just forked and tweaked it for myself; it is in my repo.
