feat: update conversation when agent finishes speaking or is interrupted #34
Purpose
Resolves an issue where we recorded multiple instances of the user speaking and the agent responding. This happened because we always updated the db whenever the function to generate a response was called, even if the user immediately cut off the agent and never actually heard any part of the agent's response.
I resolved this by removing the call to the final processing function from the stream and update chat function when that function is called in a voice call. Instead, the final processing function is called from two event hooks triggered by livekit: one for when the agent finishes speaking (`agent_speech_committed`) and one for when the agent is interrupted (`agent_speech_interrupted`).

I found that using these two events together reliably records all messages that make sense to record, because the interruption event only fires when the agent is cut off while it is actually speaking. Its message object contains only the text the agent actually said rather than the full message. The event never fires when you pause for a moment (so the backend begins generating a response) but do not pause long enough for the agent to actually start talking.
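The hook wiring described above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `FakeAgent`, `ChatMessage`, `final_processing`, and the in-memory `recorded` list are hypothetical stand-ins for the livekit agent object, its message type, the final processing function, and the db. Only the two event names come from the description.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ChatMessage:
    """Hypothetical stand-in for the message object passed to the hooks."""
    role: str
    content: str


class FakeAgent:
    """Minimal stand-in event emitter; NOT the livekit agent class."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[ChatMessage], None]]] = {}

    def on(self, event: str) -> Callable:
        # Returns a decorator that registers a handler for the named event.
        def register(handler: Callable[[ChatMessage], None]):
            self._handlers.setdefault(event, []).append(handler)
            return handler
        return register

    def emit(self, event: str, msg: ChatMessage) -> None:
        for handler in self._handlers.get(event, []):
            handler(msg)


recorded: List[dict] = []  # stands in for the db


def final_processing(msg: ChatMessage, interrupted: bool) -> None:
    """Hypothetical final processing step: persist what the agent said."""
    recorded.append(
        {"role": msg.role, "text": msg.content, "interrupted": interrupted}
    )


def register_hooks(agent: FakeAgent) -> None:
    # Record the full message once the agent finishes speaking.
    @agent.on("agent_speech_committed")
    def _on_committed(msg: ChatMessage) -> None:
        final_processing(msg, interrupted=False)

    # Record only the spoken portion when the agent is cut off mid-speech.
    @agent.on("agent_speech_interrupted")
    def _on_interrupted(msg: ChatMessage) -> None:
        final_processing(msg, interrupted=True)


agent = FakeAgent()
register_hooks(agent)
agent.emit("agent_speech_committed",
           ChatMessage("assistant", "Sure, here is the full answer."))
agent.emit("agent_speech_interrupted",
           ChatMessage("assistant", "Well, the first thing"))
```

In this sketch nothing is written when a response is generated but never spoken, because neither event is emitted in that case.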
Known Issues
There is a known issue where the first message the agent says (Hey, great to meet you, ! How's it going?) is recorded twice for an unknown reason. Possibly both events are triggered for this message even if you don't actually interrupt the agent.
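If the duplicate does come from both events firing for the same message, one possible stopgap is to skip a write that exactly repeats the previous one. The sketch below is an assumption about how such a guard could look and is not part of this PR; `DedupingRecorder` and its in-memory `saved` list are hypothetical names standing in for the db write path.

```python
from typing import List, Optional, Tuple


class DedupingRecorder:
    """Hypothetical guard that drops a record identical to the previous one."""

    def __init__(self) -> None:
        self.saved: List[Tuple[str, str]] = []  # stands in for the db
        self._last: Optional[Tuple[str, str]] = None

    def record(self, role: str, text: str) -> bool:
        """Save the message unless it repeats the last one; return True if saved."""
        key = (role, text)
        if key == self._last:
            return False
        self._last = key
        self.saved.append(key)
        return True


recorder = DedupingRecorder()
first = recorder.record("assistant", "Hello there")
second = recorder.record("assistant", "Hello there")  # duplicate, skipped
third = recorder.record("user", "Hi")
```

A guard like this would also drop a legitimately repeated consecutive message, so it only masks the symptom until the underlying cause is understood.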