speak to text , and Text to Text for language trans "extensions" #203

Open
gedw99 opened this issue Oct 29, 2024 · 1 comment

gedw99 commented Oct 29, 2024

Whisper can be wrapped in Go easily, and then the system can do speech-to-text.

Working demo here (NOT integrated with broadcast-box yet):

https://github.com/gedw99/galene-stt

The Makefile in that repo works everywhere, and its "dep-test" target will run and do an audio-to-text transcription...
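
As a rough illustration of the wrapping idea (this is not the code in galene-stt, just a minimal sketch), a Go program can shell out to a speech-to-text command-line tool and read the transcript from its standard output. The `Transcriber` and `Runner` names are hypothetical, and the binary name `whisper-cli`, the `-m`/`-f` flags, and the file names are assumptions about a whisper.cpp-style build that would need adjusting for whatever tool is actually installed.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// Runner executes a command and returns its standard output.
// It is a hypothetical seam so the wrapper can be tried without a real binary.
type Runner func(name string, args ...string) ([]byte, error)

// execRunner shells out for real via os/exec.
func execRunner(name string, args ...string) ([]byte, error) {
	return exec.Command(name, args...).Output()
}

// Transcriber wraps a speech-to-text command-line tool.
type Transcriber struct {
	Binary string   // assumed path to the STT binary
	Args   []string // assumed flags; the audio path is appended last
	Run    Runner   // use execRunner to call the real tool
}

// Transcribe runs the tool on one audio file and returns the trimmed text.
func (t Transcriber) Transcribe(audioPath string) (string, error) {
	// Copy the flags so the caller's slice is never modified.
	args := append(append([]string{}, t.Args...), audioPath)
	out, err := t.Run(t.Binary, args...)
	if err != nil {
		return "", fmt.Errorf("transcribe %s: %w", audioPath, err)
	}
	return strings.TrimSpace(string(out)), nil
}

func main() {
	// A fake runner stands in for the real binary in this sketch.
	fake := func(name string, args ...string) ([]byte, error) {
		return []byte("  hello world\n"), nil
	}
	t := Transcriber{
		Binary: "whisper-cli",                     // assumption
		Args:   []string{"-m", "model.bin", "-f"}, // assumption
		Run:    fake,
	}
	text, err := t.Transcribe("sample.wav")
	if err != nil {
		panic(err)
	}
	fmt.Println(text)
}
```

Swapping `fake` for `execRunner` would invoke the installed tool instead. Keeping the runner injectable means the wrapper can be tested on machines without a model or binary present.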


Text-to-text (language translation) might also be useful as another "extension".

Just raising to see if there is support for integration or not.

@gedw99 gedw99 changed the title speak to text , and Text to Text for language trans speak to text , and Text to Text for language trans "extensions" Oct 29, 2024
@ChaseCares
Collaborator

Hello! This is an interesting idea. I can't speak to whether adding extensions would be feasible, or whether it is something Sean would like to add. However, I do have reservations about adding speech recognition. I use and rely on STT, and I have first-hand experience of how inaccurate the transcription can be. I don't believe it would be accurate enough to be useful; in my experience, transcriptions often require heavy editing to match what was spoken.

With that said, if it were implemented, I think it would be a good idea to add warnings that the text may be inaccurate or misleading. This is important because a user who relies exclusively on the text to understand what is going on would have no way to verify its accuracy.

I am optimistic about the technology; it has gotten significantly better recently. I would be very interested to hear other people's opinions about it.

Thanks for the suggestion!
