Below are some configuration settings related to Speech-to-Text.
You may also wish to see:
- 🌟 Features / 🦻 Speech-to-Text for a higher-level introduction to the Speech-to-Text features
- 📖 Usage / 🦻 Speech-to-Text section for more details on how to use the bot for Speech-to-Text in a room
Controls how voice messages sent by 👥 user are handled.
The following configuration values are recognized:
-
(default)
transcribe_and_generate_text
: the bot will turn 👥 user voice messages into text and then generate text messages via 💬 Text Generation. This is the default setting to allow for Seamless voice interaction. -
ignore
: the bot will ignore all audio messages -
only_transcribe
: the bot will turn 👥 user voice messages into text, but will not proceed with 💬 Text Generation. Switching to this may be useful in some cases, as in Transcribe-only mode.
Example: !bai config room speech-to-text set-flow-type ignore
(this can also be set globally, see 🛠️ Room Settings)
Controls how the transcribed text of voice messages is sent to the chat when Flow Type = only_transcribe
.
The following configuration values are recognized:
-
(default)
text
: the transcribed text is sent as a regular message. This is more convenient if you'd like to forward the transcribed message to other rooms. -
notice
: the transcribed text is sent as a notice message. This provides better compatibility with other bots in the room, as they are less likely to interact with messages of type notice.
Example: !bai config room speech-to-text set-msg-type-for-non-threaded-only-transcribed-messages notice
(this can also be set globally, see 🛠️ Room Settings)
Lets you specify the language of the input voice messages, to avoid using auto-detection.
Supplying the input language using a 2-letter code (e.g. ja
) as per ISO-639-1 may improve accuracy & latency.
In the above example screenshot, even without a language specified, the voice was understood correctly as Bulgarian, but was produced in latin, not Cyrillic, which is wrong.
If different 👥 user are using different languages, do not specify a language.
💡 Certain models (like OpenAI's Whisper) may perform auto-translation if you specify a language, but you're speaking another one. You may abuse this side-effect for performing voice-to-text translation, but be aware that not all models behave this way.
Example (setting it to Japanese): !bai config room speech-to-text set-language ja
(this can also be set globally, see 🛠️ Room Settings)