
User Audio Input transcription issue #624

Open
12 tasks
kamjony opened this issue Dec 20, 2024 · 9 comments
Labels
p:openai_realtime_dart openai_realtime_dart package t:bug Something isn't working

Comments


kamjony commented Dec 20, 2024

System Info

Dart

Related Components

  • doc-loaders
  • doc-transformers
  • prompts
  • llms
  • chat-models
  • output-parsers
  • chains
  • memory
  • stores
  • embeddings
  • retrievers
  • agents

Reproduction

await client.connect();
await client.updateSession(
    instructions: gptInstructions,
    modalities: [Modality.audio],
    voice: Voice.shimmer,
    inputAudioFormat: AudioFormat.pcm16,
    outputAudioFormat: AudioFormat.pcm16,
    inputAudioTranscription: InputAudioTranscriptionConfig(
      // enabled: true,
      model: 'whisper-1',
    )
);
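
For reference, a `session.update` call like the one above should produce roughly the following wire payload (a sketch based on the Realtime API spec; note there is no `enabled` field under `input_audio_transcription`):

```json
{
  "type": "session.update",
  "session": {
    "modalities": ["audio"],
    "voice": "shimmer",
    "input_audio_format": "pcm16",
    "output_audio_format": "pcm16",
    "input_audio_transcription": {
      "model": "whisper-1"
    }
  }
}
```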

Expected behavior

The event `"type": "conversation.item.input_audio_transcription.completed"` is never fired. But if I use the ChatGPT playground, I can see this event being fired. I need the transcription of the user's audio. How can I achieve this?

@kamjony kamjony added the t:bug Something isn't working label Dec 20, 2024
@github-project-automation github-project-automation bot moved this to 📋 Backlog in LangChain.dart Dec 20, 2024
@davidmigloz davidmigloz added the p:openai_realtime_dart openai_realtime_dart package label Dec 21, 2024
@davidmigloz
Owner

There seems to be an issue with the input_audio_transcription param:

{
  "type": "error",
  "event_id": "event_AgoCpMJ7LkKOPOCfcNLn9",
  "error": {
    "type": "invalid_request_error",
    "code": "unknown_parameter",
    "message": "Unknown parameter: 'session.input_audio_transcription.enabled'.",
    "param": "session.input_audio_transcription.enabled",
    "event_id": "evt_nAh8N2BDAnkzVHMeH"
  }
}

I'll look further into it.

@davidmigloz
Owner

davidmigloz commented Dec 21, 2024

I'll remove the `enabled` parameter, which is no longer required. Apart from that, the request appears to match the spec.

It seems more people are facing this issue:
https://community.openai.com/t/realtime-api-session-update-doesnt-change-input-audio-format/967077

Most issues with missing transcriptions have to do with the input audio being passed:

  • Write it to a file and listen to it; you may spot some errors.
  • Make sure the sample rate is 24000 Hz, as the API requires this.
  • Make sure the audio doesn't sound distorted, cut out, sped up or down, or pitched up or down.
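
To make the audio easy to inspect, a small Python sketch (assuming raw little-endian mono PCM16 at 24 kHz, as the API expects) can dump the buffer to a WAV file you can open in any player:

```python
import wave

def write_pcm16_wav(pcm_bytes: bytes, path: str, sample_rate: int = 24000) -> None:
    """Write raw little-endian mono PCM16 bytes to a WAV file for inspection."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)          # Realtime API expects mono
        wf.setsampwidth(2)          # 16-bit samples = 2 bytes
        wf.setframerate(sample_rate)
        wf.writeframes(pcm_bytes)

# Example: one second of silence (24000 frames x 2 bytes)
write_pcm16_wav(b"\x00\x00" * 24000, "debug_input.wav")
```

If the file sounds wrong when played back (chipmunked, slowed, or noisy), the capture side is likely using a different sample rate or sample format than declared.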

Let me know if you manage to solve it.

@kamjony
Author

kamjony commented Dec 23, 2024

@davidmigloz I did everything you suggested above but was still never receiving `"type": "conversation.item.input_audio_transcription.completed"`.
So I wrote my own client and consumed the Realtime API directly, to check whether the problem was actually in the package. After several hours of debugging, I found that if you send the full audio in `conversation.item.create`, OpenAI does not generate a transcription. But if you append the PCM audio using `input_audio_buffer.append` and then commit it with `"type": "input_audio_buffer.commit"`, the transcription is generated. I will try this with the package client soon and report back.
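
The working flow described above corresponds to this client event sequence (a sketch based on the Realtime API docs; the `audio` value is base64-encoded PCM16, and the final event is emitted by the server):

```
// 1. Stream audio chunks into the input buffer
{"type": "input_audio_buffer.append", "audio": "<base64 pcm16 chunk>"}

// 2. Commit the buffer — this is what triggers input audio transcription
{"type": "input_audio_buffer.commit"}

// 3. Ask the model to respond, whenever you choose to
{"type": "response.create"}

// 4. Server event that should eventually arrive:
{"type": "conversation.item.input_audio_transcription.completed", "transcript": "..."}
```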

@davidmigloz
Owner

Interesting, thanks for sharing! Let me know the results when you try it with the openai_realtime_dart client.

@dominikmucklow

Has anyone been able to resolve this issue? We are receiving the audio just fine but cannot locate a transcription in any of the event logs.

@kamjony
Author

kamjony commented Jan 7, 2025

@davidmigloz @dominikmucklow With this package, I was unable to get a transcription, as there is no separate function to commit the audio before generating a GPT response. So I manually implemented the ChatGPT Realtime client following their docs. Now I get transcriptions and have the freedom to generate a response whenever I want.

@davidmigloz
Owner

Thanks for the feedback @kamjony! I'll review their latest spec; maybe they added that method after my initial implementation.

@dominikmucklow

@kamjony Thanks for the info, and glad to know that it can work. Manual implementation is over my head, so I'll keep following updates to this package. @davidmigloz keep up the good work! Here's a lunch on me.

@davidmigloz
Owner

Thanks! I'll try to look into it this weekend.

Projects
Status: 📋 Backlog
Development

No branches or pull requests

3 participants