🦻 Speech-to-Text

Below are some configuration settings related to Speech-to-Text.

You may also wish to see:

🌟 Features / 🦻 Speech-to-Text for a higher-level introduction to the Speech-to-Text features
📖 Usage / 🦻 Speech-to-Text for more details on how to use the bot for Speech-to-Text in a room

🪄 Flow Type

Controls how voice messages sent by 👥 users are handled.

The following configuration values are recognized:

(default) transcribe_and_generate_text: the bot will turn 👥 users' voice messages into text and then generate text messages via 💬 Text Generation. This is the default because it allows for Seamless voice interaction.
ignore: the bot will ignore all audio messages.
only_transcribe: the bot will turn 👥 users' voice messages into text, but will not proceed with 💬 Text Generation. This is useful when you only need the transcript, as in Transcribe-only mode.

Example: !bai config room speech-to-text set-flow-type ignore (this can also be set globally, see 🛠️ Room Settings)
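The three flow types can be pictured as a small dispatch step. The following is a minimal Python sketch for illustration only, not the bot's actual implementation: `FlowType`, `VoiceOutcome`, `handle_voice_message` and the `transcribe` / `generate_text` callables are hypothetical names; only the three configuration values come from the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class FlowType(Enum):
    # The values mirror the configuration values documented above.
    TRANSCRIBE_AND_GENERATE_TEXT = "transcribe_and_generate_text"
    IGNORE = "ignore"
    ONLY_TRANSCRIBE = "only_transcribe"


@dataclass
class VoiceOutcome:
    # Hypothetical result type: what was produced for one voice message.
    transcript: Optional[str]
    reply: Optional[str]


def handle_voice_message(
    flow_type: FlowType,
    audio: bytes,
    transcribe: Callable[[bytes], str],
    generate_text: Callable[[str], str],
) -> VoiceOutcome:
    """Illustrative dispatch on the flow type (assumed logic)."""
    if flow_type is FlowType.IGNORE:
        # Audio is skipped entirely; no transcription is attempted.
        return VoiceOutcome(transcript=None, reply=None)

    transcript = transcribe(audio)
    if flow_type is FlowType.ONLY_TRANSCRIBE:
        # Stop after speech-to-text; no text generation.
        return VoiceOutcome(transcript=transcript, reply=None)

    # Default flow: transcribe, then feed the transcript to text generation.
    return VoiceOutcome(transcript=transcript, reply=generate_text(transcript))
```

In this sketch a stored configuration value can be parsed with `FlowType("only_transcribe")`, which raises `ValueError` for any value outside the three listed above.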

🪄 Message Type for non-threaded only-transcribed messages

Controls how the transcribed text of voice messages is sent to the chat when Flow Type is set to only_transcribe.

The following configuration values are recognized:

(default) text: the transcribed text is sent as a regular message. This is more convenient if you'd like to forward the transcribed message to other rooms.
notice: the transcribed text is sent as a notice message. This provides better compatibility with other bots in the room, as they are less likely to interact with messages of type notice.

Example: !bai config room speech-to-text set-msg-type-for-non-threaded-only-transcribed-messages notice (this can also be set globally, see 🛠️ Room Settings)
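To make the difference between the two values concrete, here is a minimal Python sketch of the Matrix message content each one would likely produce. It assumes that text and notice correspond to the standard Matrix `m.text` and `m.notice` message types; `build_transcript_content` is a hypothetical helper, not part of the bot.

```python
from typing import Dict

# Assumed mapping from the configuration value to the Matrix msgtype.
MSGTYPE_BY_CONFIG_VALUE: Dict[str, str] = {
    "text": "m.text",
    "notice": "m.notice",
}


def build_transcript_content(transcript: str, msg_type: str = "text") -> Dict[str, str]:
    """Build the content of an m.room.message event carrying a transcript."""
    if msg_type not in MSGTYPE_BY_CONFIG_VALUE:
        raise ValueError(f"unrecognized message type: {msg_type!r}")
    return {
        "msgtype": MSGTYPE_BY_CONFIG_VALUE[msg_type],
        "body": transcript,
    }
```

Other bots commonly skip events whose `msgtype` is `m.notice`, which is why the notice value tends to be the safer choice in rooms shared with other bots.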

🔤 Language

Lets you specify the language of the input voice messages, so that the model does not have to auto-detect it. Supplying the input language as a 2-letter ISO 639-1 code (e.g. ja) may improve accuracy and latency.

In the example screenshot above, the voice message was correctly understood as Bulgarian even without a language specified, but the transcript was produced in Latin script rather than Cyrillic, which is wrong.

If different 👥 users are using different languages, do not specify a language.

💡 Certain models (like OpenAI's Whisper) may perform auto-translation if you specify one language but speak another. You may exploit this side effect to perform voice-to-text translation, but be aware that not all models behave this way.

Example (setting it to Japanese): !bai config room speech-to-text set-language ja (this can also be set globally, see 🛠️ Room Settings)
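If you generate these commands from a script, it can help to check the shape of the language code first. The Python sketch below is illustrative only: `build_set_language_command` is a hypothetical helper, and it only checks that the code looks like a 2-letter ISO 639-1 code, not that the code is actually assigned or supported by the configured model.

```python
import re

# Command prefix taken from the example above.
SET_LANGUAGE_COMMAND = "!bai config room speech-to-text set-language"


def build_set_language_command(language_code: str) -> str:
    """Return the chat command for a 2-letter language code (shape check only)."""
    normalized = language_code.strip().lower()
    if not re.fullmatch(r"[a-z]{2}", normalized):
        raise ValueError(
            f"expected a 2-letter ISO 639-1 code such as 'ja', got {language_code!r}"
        )
    return f"{SET_LANGUAGE_COMMAND} {normalized}"
```

For instance, passing `"ja"` yields the Japanese example command shown above, while a 3-letter code such as `"jpn"` is rejected.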
