Application/Real-time API Does Not Wait for Speaker to Finish Speaking #52

Open
madhubandru opened this issue Nov 12, 2024 · 1 comment


The application currently treats short pauses (around 1-2 seconds) in user speech as the end of the speaker's input. As a result, it responds prematurely to incomplete sentences or partial questions, which leads to incomplete responses or unnecessary follow-up questions.

To improve user experience, we need to extend the wait duration to allow users to complete their thoughts before the application processes the input. This should account for natural pauses in speech to ensure the application only responds once the speaker has truly finished.
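
One possible direction, sketched under assumptions: if the app talks to the OpenAI (or Azure OpenAI) Realtime API and relies on server-side voice activity detection, the pause length that ends a turn is, as far as I understand the API, controlled by `silence_duration_ms` in the session's `turn_detection` settings. The helper name `build_session_update` and the default values below are hypothetical, and the field names should be checked against the API version in use.

```python
import json


def build_session_update(
    silence_duration_ms: int = 2500,
    threshold: float = 0.5,
    prefix_padding_ms: int = 300,
) -> str:
    """Build a `session.update` event that lengthens the end-of-turn pause.

    Assumption: the backend uses server VAD turn detection. All default
    values here are illustrative, not recommendations.
    """
    if silence_duration_ms <= 0:
        raise ValueError("silence_duration_ms must be positive")

    event = {
        "type": "session.update",
        "session": {
            "turn_detection": {
                "type": "server_vad",
                # Audio level sensitivity of the speech detector.
                "threshold": threshold,
                # Audio kept from just before speech was detected.
                "prefix_padding_ms": prefix_padding_ms,
                # Silence required before the turn is considered finished.
                "silence_duration_ms": silence_duration_ms,
            }
        },
    }
    return json.dumps(event)


if __name__ == "__main__":
    # The resulting JSON string would be sent over the existing WebSocket.
    print(build_session_update(silence_duration_ms=2500))
```

Reading the value from an environment variable or app setting would make the wait period configurable per deployment without a code change.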

Request:

  • Implement or adjust a configurable delay/wait period after detecting speech pauses.
  • Ensure that brief pauses do not trigger the end of input, and responses only initiate when a user has likely finished speaking.
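
To make the two requests above concrete, here is a minimal sketch of a client-side end-of-turn check with a configurable wait period. It assumes the application already has some per-frame speech/no-speech signal (for example from a VAD); the class name `EndOfTurnDetector` and the 2.5 second default are hypothetical and not taken from the project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EndOfTurnDetector:
    """Report end of turn only after a configurable stretch of silence."""

    min_silence_s: float = 2.5  # hypothetical default; tune per use case
    _last_speech_at: Optional[float] = None
    _turn_ended: bool = False

    def update(self, is_speech: bool, now: float) -> bool:
        """Feed one observation; return True once when the turn ends.

        `now` is a timestamp in seconds (e.g. from time.monotonic()).
        """
        if is_speech:
            # Any speech resets the silence timer, so brief pauses
            # followed by more speech never end the turn.
            self._last_speech_at = now
            self._turn_ended = False
            return False

        # Ignore silence before the user has said anything, and do not
        # report the same end of turn twice.
        if self._last_speech_at is None or self._turn_ended:
            return False

        if now - self._last_speech_at >= self.min_silence_s:
            self._turn_ended = True
            return True
        return False


if __name__ == "__main__":
    detector = EndOfTurnDetector(min_silence_s=2.5)
    frames = [(True, 0.0), (False, 1.5), (True, 2.0), (False, 4.0), (False, 4.5)]
    for is_speech, t in frames:
        print(t, detector.update(is_speech, t))
```

In this sketch the 1.5 second pause does not end the turn because speech resumes at 2.0 seconds; only the later silence that reaches the configured threshold does.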

Expected Outcome:
The application should respond only after confirming that the user has completed their input, accommodating natural pauses without interruptions.

Any suggestions or discussions on this topic that could provide solutions or enhancements would be greatly appreciated. Thank you in advance!


madhubandru commented Nov 12, 2024

Hi, @pamelafox @pablocastro can you please provide some direction on this issue? Thank you!
