Add options for handling multilingual input #200

jsichi · 2024-04-08T09:02:13Z

This is an incomplete PR intended as a first step toward addressing issues loosely related to #184.

The multilingual_input option controls whether multiple languages should be expected in the input stream. If False (the backwards-compatible default), only one language is expected: either the one specified by the client or, if the client specified none, the first one heard. If True, the language can change over the course of the stream, and for transcription this produces a multilingual transcript. A notification is sent to the client whenever a language change is detected. If the pauses between utterances in different languages are too short, the transcript boundaries may be incorrect, i.e. the first sentence in the new language may be incorrectly transcribed in the previous language. This currently seems unavoidable because of the way the last work-in-progress segment gets reprocessed.
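To make the two modes concrete, here is a minimal sketch of the language-selection rule described above. It is not the code in this PR: the `LanguageTracker` class, its `update` method, and the choice to treat a client-specified language as the starting language in multilingual mode are all assumptions made for illustration.

```python
from typing import Optional, Tuple


class LanguageTracker:
    """Hypothetical helper (not part of this PR) that decides which
    language applies to each utterance."""

    def __init__(self, multilingual_input: bool = False,
                 client_language: Optional[str] = None):
        self.multilingual_input = multilingual_input
        # Assumption: a client-specified language is the starting language.
        self.current = client_language

    def update(self, detected: str) -> Tuple[str, bool]:
        """Return (language to use, whether a change notification is due)."""
        if self.current is None:
            # First language heard; assumed not to count as a "change".
            self.current = detected
            return self.current, False
        if self.multilingual_input and detected != self.current:
            # Multilingual mode: follow the newly detected language.
            self.current = detected
            return self.current, True
        # Single-language mode, or no change: keep the current language.
        return self.current, False
```

With `multilingual_input=False`, a later detection of a different language is ignored and the first language stays in effect. With `multilingual_input=True`, the tracker switches and flags that a notification should be sent.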

The lang_filter option allows the client to restrict the set of candidate languages to listen for. This may be useful regardless of the multilingual_input setting, e.g. at the beginning of the input, where the language may initially be detected incorrectly. If not set (the backwards-compatible default), all known languages are listened for.
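As a rough illustration of the filtering idea, the sketch below picks the most probable language from a set of per-language probabilities, optionally restricted to an allowed set. The `pick_language` function and the shape of `probs` are hypothetical and only stand in for whatever the language-detection step actually returns.

```python
from typing import Dict, Iterable, Optional


def pick_language(probs: Dict[str, float],
                  lang_filter: Optional[Iterable[str]] = None) -> str:
    """Return the most probable language, restricted to lang_filter if given.

    Hypothetical helper for illustration; not the code in this PR.
    """
    if lang_filter is not None:
        allowed = set(lang_filter)
        # Keep only the candidates the client asked to listen for.
        candidates = {lang: p for lang, p in probs.items() if lang in allowed}
        if not candidates:
            raise ValueError("lang_filter excludes every known language")
    else:
        # No filter set: all known languages remain candidates.
        candidates = probs
    return max(candidates, key=candidates.get)


# Example: an early, ambiguous detection where Dutch narrowly beats English.
probs = {"en": 0.40, "nl": 0.45, "de": 0.15}
print(pick_language(probs))                # unfiltered choice
print(pick_language(probs, ["en", "de"]))  # restricted to English/German
```

Restricting the candidates this way is what would keep a short or noisy opening utterance from being assigned to a language the client never expects.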

If there's interest in adding these, I can propagate them to the TensorRT code as well. I'm not sure how to add tests since that would require using a large multilingual model (we would also need to add some multilingual samples, which might be useful anyway).

AdolfVonKleist · 2024-09-26T06:42:52Z

How reliable is this in general, and how does it compare with having no special filtering or processing in place? Do you have any objective benchmarks? I'm interested in making use of a similar approach locally.

jsichi · 2024-09-26T15:17:14Z

It's been a while since I worked on this, but it was a noticeable improvement. I don't have any benchmark for you.

jsichi added 2 commits April 8, 2024 17:46

Add multilingual_input and lang_filter options

2c2967b

Fix whitespace.

20e1ddf

jsichi mentioned this pull request Apr 8, 2024

The concept of transcribed bilingualism. #184

Open
