I’m currently using Buzz for transcribing interviews with multiple speakers. However, I’ve noticed that the transcription doesn’t differentiate between different voices or speakers in the audio. Is speaker diarization (speaker identification) available or on the roadmap as a feature?
Additionally, I noticed the "prompt" feature, but it doesn't seem to affect speaker recognition. Could you clarify its purpose and if it might relate to this?
Thanks in advance for your help!
Yes, this is one of the ideas that has been requested previously. It is on the list of things that would be nice to have in the future.
It also seems that https://github.com/MahmoudAshraf97/whisper-diarization works quite well, so it could be implemented in Buzz at some point in the future.
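For context, the general approach projects like that take is to run speaker diarization separately and then align the speaker turns with Whisper's timestamped segments. Below is a minimal sketch of that idea, not Buzz's code: it assumes `openai-whisper` and `pyannote.audio` are installed and that you have a Hugging Face access token; the exact diarization model name may differ.

```python
# Sketch: combine Whisper transcription with pyannote speaker diarization.
# Assumes openai-whisper and pyannote.audio are installed; the model name
# and token handling are placeholders, not what Buzz ships.
import whisper
from pyannote.audio import Pipeline

audio_path = "interview.wav"

# 1. Transcribe with Whisper to get timestamped segments.
model = whisper.load_model("base")
result = model.transcribe(audio_path)

# 2. Run diarization to get "who spoke when" turns.
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    use_auth_token="HF_TOKEN",  # hypothetical placeholder token
)
diarization = pipeline(audio_path)

# 3. Label each transcript segment with the speaker whose
#    diarization turn overlaps it the most.
def speaker_for(start: float, end: float) -> str:
    best, best_overlap = "UNKNOWN", 0.0
    for turn, _, speaker in diarization.itertracks(yield_label=True):
        overlap = min(end, turn.end) - max(start, turn.start)
        if overlap > best_overlap:
            best, best_overlap = speaker, overlap
    return best

for seg in result["segments"]:
    print(f"[{speaker_for(seg['start'], seg['end'])}] {seg['text'].strip()}")
```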
The prompt feature of the Whisper models is described here: https://cookbook.openai.com/examples/whisper_prompting_guide
In my testing it has not shown especially meaningful results, but others may get better results. Feel free to share feedback on your prompting results, as it may be useful to others.
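For reference, in the plain `openai-whisper` API the prompt corresponds to the `initial_prompt` argument of `transcribe`. It biases vocabulary and style (e.g., the spelling of names and jargon) and does not separate speakers, which is likely why it has no effect on diarization. A minimal sketch:

```python
# Sketch: prompting Whisper directly via initial_prompt.
# The prompt nudges vocabulary/spelling; it does not identify speakers.
import whisper

model = whisper.load_model("base")
result = model.transcribe(
    "interview.wav",
    initial_prompt="Interview between Dr. Alice Nguyen and Bob Okafor.",  # hypothetical names
)
print(result["text"])
```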