Cartesia Audio Cutting #100
Comments
hey @abdulrahmanmajid, thanks for reporting this. This is currently being worked on. |
Thanks for letting me know, @prateeksachan. May I know the ETA? |
This should get done this week itself. |
Hey @prateeksachan, the issue I mentioned earlier is happening on Azure TTS as well. I did a fresh install, but the problem persists. Sometimes it works perfectly, but other times it becomes super choppy, breaking on every single word as if it doesn't have enough time to finish speaking before it gets cut off by the next word. Could you take a look and see if you can fix it? Or let me know where the issue might be, and I'll try to fix it and create a PR. |
Hi @abdulrahmanmajid can you please share the payload you're using? I'll take a look. |
Sorry for the late reply, @prateeksachan. I've been testing with two repos and two different payloads. This is the payload I used on a fresh installation on my VM:
And this is the second payload, which I use for the custom repo:
Also, I had a question: how exactly does backchanneling work? When the user speaks and the LLM stays silent, do we stream words from constants? Are those words relevant to the conversation? Are they in the voice we choose? Second, are they pulled from S3 or from local files? Also, could you explain how use_fillers works? I can't hear the agent using fillers during the conversation when this is turned on. The ambient audio wasn't working until someone recently created a PR to fix it. |
Both have the same issue: sometimes it speaks perfectly, then it randomly starts chopping up, as if you were playing a YouTube video on super slow Wi-Fi. Could this be an issue with the VM? The VM has 40 Gbps speeds, though. |
So I've tested it with a different VM and the issue is still there. There's an issue with the transcriber as well, where it stops listening or starts transcribing gibberish even when you stay silent. As for the audio cutting issue, I've attached a video. I can't attach the audio recording from Twilio, but the same issue is in the recording as well. |
can you try with
|
OK, that pretty much fixed it, but it still happens occasionally. What about the other question I had? "How exactly does backchanneling work? When the user speaks and the LLM stays silent, do we stream words from constants? Are those words relevant to the conversation? Are they in the voice we choose? Second, are they pulled from S3 or from local files? Also, could you explain how use_fillers works? I can't hear the agent using fillers during the conversation when this is turned on." |
Cartesia TTS is cutting off every single word; there is no continuous, smooth speech.
Please share the correct payload for using Cartesia.
Is this correct?