Description:
I am attempting to create a custom WebSocket-based AudioInterface for the ElevenLabs Conversational API. The goal is to process audio chunks and pass them to the input_callback function. However, when sending audio chunks to the process_audio_chunk method, they do not seem to be forwarded to the input_callback.
Despite buffering and finalizing the audio input, the input_callback does not receive any data, and the expected behavior of transmitting the full audio chunk does not occur. There are no error messages, but logging suggests that the audio is being buffered correctly.
Steps to Reproduce:
Implement the WebSocketAudioInterface as shown in the code snippet below.
Start the interface with start(input_callback), passing a valid callback function.
Send audio chunks via process_audio_chunk(audio_chunk).
Call finalize_audio_input() to process and send the buffered audio.
Observe that input_callback is not triggered with the expected audio data.
Expected Behavior:
The input_callback should receive the buffered audio when finalize_audio_input() is called.
Actual Behavior:
The input_callback never receives the audio data, even when finalize_audio_input() is explicitly called.
def start(self, input_callback):
    self.input_callback = input_callback
    self.audio_buffer.clear()
    self.output_thread = threading.Thread(target=self._output_thread, daemon=True)
    self.output_thread.start()
    socketio.emit('interface_ready', {'status': 'ready'}, room=self.sid)

def process_audio_chunk(self, audio_chunk: bytes):
    if not self.input_callback:
        print(f"[WARNING] No input_callback registered for {self.sid}")
        return
    if not isinstance(audio_chunk, (bytes, bytearray)):
        print(f"[ERROR] Invalid audio chunk type: {type(audio_chunk)}")
        return
    print(f"[INPUT] Processing {len(audio_chunk)} bytes of audio from {self.sid}")
    self.audio_buffer.extend(audio_chunk)

def finalize_audio_input(self):
    if not self.input_callback:
        print(f"[WARNING] No input_callback registered for {self.sid}")
        return
    if not self.audio_buffer:
        print(f"[INFO] No audio to process for {self.sid}")
        return
    print(f"[INPUT] Finalizing audio input of {len(self.audio_buffer)} bytes for {self.sid}")
    self.input_callback(bytes(self.audio_buffer))  # This does not seem to trigger
    self.audio_buffer.clear()
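To check the buffering logic on its own, here is a minimal sketch that strips out Socket.IO, the output thread, and the SDK base class. `BufferingInterface` and `received` are hypothetical names used only for illustration; they are not part of the ElevenLabs SDK. If a check like this passes locally, the buffering code itself is probably not at fault, and the way the callback gets registered is the next thing to inspect.

```python
class BufferingInterface:
    """Stand-in for the methods above, without Socket.IO or the SDK base class."""

    def __init__(self, sid):
        self.sid = sid
        self.input_callback = None
        self.audio_buffer = bytearray()

    def start(self, input_callback):
        # Register the callback and reset any stale audio.
        self.input_callback = input_callback
        self.audio_buffer.clear()

    def process_audio_chunk(self, audio_chunk):
        # Chunks are only buffered once a callback has been registered.
        if self.input_callback is None:
            return
        self.audio_buffer.extend(audio_chunk)

    def finalize_audio_input(self):
        # Forward everything buffered so far as one bytes object.
        if self.input_callback is None or not self.audio_buffer:
            return
        self.input_callback(bytes(self.audio_buffer))
        self.audio_buffer.clear()


received = []  # hypothetical sink standing in for the real input_callback
iface = BufferingInterface("session-12345")
iface.start(received.append)
iface.process_audio_chunk(b"\x00" * 512)
iface.process_audio_chunk(b"\x00" * 512)
iface.finalize_audio_input()
print(len(received), len(received[0]))  # prints: 1 1024
```

The sketch calls start() and finalize_audio_input() on the same object, which is the condition the real code also needs to satisfy.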
Additional Context:
Related Issues: None found in the repository.
Possible Workaround: Manually calling input_callback externally works, but this defeats the purpose of using the built-in process_audio_chunk method.
Logs:
[INPUT] Processing 512 bytes of audio from session-12345
[INPUT] Buffered 512 bytes from session-12345 (total buffered: 1024)
[INPUT] Finalizing audio input of 1024 bytes for session-12345
[WARNING] No input_callback registered for session-12345 # Unexpected
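One thing the final warning could indicate (an assumption, not something confirmed in this thread) is that finalize_audio_input() runs on a different object than the one that received start(), for example if the interface is re-created per request. The sketch below shows one way to make that visible in logs by printing `id(self)` alongside the session id. `TracedInterface`, `interfaces`, and `get_interface` are hypothetical names for illustration only.

```python
class TracedInterface:
    """Minimal stand-in that reports which object a log line came from."""

    def __init__(self, sid):
        self.sid = sid
        self.input_callback = None

    def start(self, input_callback):
        self.input_callback = input_callback

    def describe(self):
        # id(self) distinguishes two objects that share the same sid.
        state = "set" if self.input_callback else "missing"
        return f"sid={self.sid} instance={id(self):#x} callback={state}"


interfaces = {}  # hypothetical per-session registry


def get_interface(sid):
    # Reuse the existing object for a session instead of building a new one.
    if sid not in interfaces:
        interfaces[sid] = TracedInterface(sid)
    return interfaces[sid]


started = get_interface("session-12345")
started.start(lambda audio: None)

same = get_interface("session-12345")     # same object, callback still set
fresh = TracedInterface("session-12345")  # accidental re-creation: callback missing

print(started.describe())
print(same.describe())
print(fresh.describe())
```

If the instance value logged in start() differs from the one logged in finalize_audio_input(), the callback was registered on another object.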
Any guidance on resolving this issue would be appreciated. Thank you!
Hi @Al-aminI, thanks for opening the issue. I'm taking a look into this now. Out of curiosity, could you give me a rough idea of what you're trying to do when you process the audio?
Hi @AngeloGiacco, thanks for the reply; I really appreciate your prompt response. I was trying to send input audio chunks from the client as bytes via the input callback, and then retrieve the output audio chunks via the output callback. I was able to retrieve audio output from the output callback, but was unable to send audio bytes in through the input callback.
I was trying to implement a WebSocket audio interface. Just as the default audio interface uses system audio through PyAudio, I was trying to replicate the same thing over a WebSocket so that I can integrate it with a client/frontend.
It is something you could also provide alongside the default audio interface, since most use cases will require a socket connecting the frontend to the Conversational AI API.
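As a rough illustration of the interface described here, the sketch below forwards each client chunk to input_callback as soon as it arrives and drains agent audio on a background thread. It assumes the SDK expects start, stop, output, and interrupt on an AudioInterface, which is worth verifying against the installed version. `StreamingSocketInterface`, `on_client_audio`, and the `send` hook (a stand-in for something like socketio.emit) are hypothetical, and the class deliberately does not subclass the SDK so that it runs on its own.

```python
import queue
import threading


class StreamingSocketInterface:
    """Hypothetical sketch: per-chunk input forwarding, queued output."""

    def __init__(self, sid, send):
        self.sid = sid
        self.send = send  # hypothetical transport hook, called as send(sid, audio)
        self.input_callback = None
        self.output_queue = queue.Queue()
        self.should_stop = threading.Event()
        self.output_thread = None

    def start(self, input_callback):
        self.input_callback = input_callback
        self.should_stop.clear()
        self.output_thread = threading.Thread(target=self._drain_output, daemon=True)
        self.output_thread.start()

    def stop(self):
        self.should_stop.set()
        if self.output_thread is not None:
            self.output_thread.join()
        self.input_callback = None

    def on_client_audio(self, audio_chunk):
        # Forward immediately instead of buffering until a finalize step.
        if self.input_callback is not None:
            self.input_callback(bytes(audio_chunk))

    def output(self, audio):
        self.output_queue.put(audio)

    def interrupt(self):
        # Drop any agent audio that has not been sent yet.
        try:
            while True:
                self.output_queue.get_nowait()
                self.output_queue.task_done()
        except queue.Empty:
            pass

    def _drain_output(self):
        while not self.should_stop.is_set():
            try:
                audio = self.output_queue.get(timeout=0.1)
            except queue.Empty:
                continue
            self.send(self.sid, audio)
            self.output_queue.task_done()


sent, heard = [], []
iface = StreamingSocketInterface("session-12345", lambda sid, audio: sent.append((sid, audio)))
iface.start(heard.append)
iface.on_client_audio(b"\x01" * 512)  # client -> input_callback
iface.output(b"\x02" * 256)           # agent -> client
iface.output_queue.join()             # wait until the queued audio has been sent
iface.stop()
```

Forwarding per chunk avoids a separate finalize step; whether that fits depends on how the client segments its audio.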
Thank you
Code example
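import queue
import threading

# Import path assumed from the elevenlabs Python SDK; verify against your installed version.
from elevenlabs.conversational_ai.conversation import AudioInterface


class WebSocketAudioInterface(AudioInterface):
    def __init__(self, sid):
        self.sid = sid
        self.input_callback = None
        self.audio_buffer = bytearray()
        self.output_queue = queue.Queue()
        self.should_stop = threading.Event()
        self.output_thread = None
        print(f"Created WebSocketAudioInterface for session {sid}")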