-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat/lang_detection_plugin #19
base: dev
Are you sure you want to change the base?
Conversation
@coderabbitai review |
Actions performedReview triggered.
|
WalkthroughThe recent updates enhance the Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant CLI
participant STTServer
participant LangPlugin
participant MultiModelContainer
User->>CLI: Run ovos-stt-server with lang detection
CLI->>STTServer: Start server with --lang-engine and --multi options
STTServer->>LangPlugin: Load language detection plugin
STTServer->>MultiModelContainer: Initialize with multi-language support
User->>STTServer: Send audio for processing
STTServer->>LangPlugin: Detect language
LangPlugin->>STTServer: Return detected language
STTServer->>MultiModelContainer: Process audio with detected language model
MultiModelContainer->>User: Return transcription result
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Outside diff range, codebase verification and nitpick comments (2)
requirements/requirements.txt (1)
5-6
: Addition offlask
to requirements.
flask
has been added without a version specification. It's crucial to specify version constraints to avoid potential future compatibility issues.- flask + flask~=2.0README.md (1)
52-52
: Grammatical correction needed in example usage.The abbreviation "e.g." should be used with periods as indicated by LanguageTool.
- eg `ovos-stt-server --engine ovos-stt-plugin-fasterwhisper --lang-engine ovos-audio-transformer-plugin-fasterwhisper` + e.g., `ovos-stt-server --engine ovos-stt-plugin-fasterwhisper --lang-engine ovos-audio-transformer-plugin-fasterwhisper`
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (6)
- .github/workflows/publish_alpha.yml (1 hunks)
- README.md (1 hunks)
- ovos_stt_http_server/init.py (3 hunks)
- ovos_stt_http_server/main.py (2 hunks)
- requirements/requirements.txt (1 hunks)
- setup.py (1 hunks)
Files skipped from review due to trivial changes (2)
- .github/workflows/publish_alpha.yml
- setup.py
Additional context used
LanguageTool
README.md
[uncategorized] ~51-~51: The abbreviation “e.g.” (= for example) requires two periods.
Context: ... Note:ffmpeg
is required for Gradio eg `ovos-stt-server --engine ovos-stt-plug...(E_G)
Additional comments not posted (5)
requirements/requirements.txt (1)
1-1
: Version specification forovos-plugin-manager
updated.The version change from
~=0.0.18
to~=0.0, >=0.0.26a15
broadens the acceptable version range and specifies a minimum version. This should be verified to ensure compatibility with the new features introduced in the PR.README.md (1)
33-48
: Updated command line options forovos-stt-server
.The changes to the command line options, including the addition of
--lang-engine
and--multi
, align with the PR's objectives to enhance language detection capabilities. Ensure that these changes are reflected in the user documentation and help outputs.Verification successful
Updated command line options for
ovos-stt-server
have been successfully documented.The new command line options
--lang-engine
and--multi
are present in theREADME.md
file, ensuring that the documentation is consistent with the changes.
README.md
: Lines mentioning--lang-engine
and--multi
.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify that the new command line options are documented. # Test: Search for mentions of the new options in the documentation. rg --type markdown $'--lang-engine'Length of output: 450
Script:
#!/bin/bash # Description: Verify that the new command line options are documented. # Corrected Test: Search for mentions of the new options in the documentation. rg --type markdown '--lang-engine' rg --type markdown '--multi'Length of output: 916
Script:
#!/bin/bash # Description: Verify that the new command line options are documented. # Corrected Test: Search for mentions of the new options in the documentation. rg --type markdown -- '--lang-engine' rg --type markdown -- '--multi'Length of output: 317
ovos_stt_http_server/__main__.py (2)
24-30
: Addition of--lang-engine
and--multi
arguments.The new arguments are added correctly and are marked as required for
--lang-engine
and optional for--multi
. This aligns with the PR's objectives to support multilingual plugins and language detection.
44-46
: Modification instart_stt_server
function call.The function now includes
lang_engine
andmulti
arguments, which is necessary for the new features. Verify that these changes are compatible with the rest of the application, especially in how thestart_stt_server
function is defined and used elsewhere.Verification successful
Modification in
start_stt_server
function call verified.The
start_stt_server
function definition inovos_stt_http_server/__init__.py
includes thelang_engine
andmulti
arguments, and the only other call to this function inovos_stt_http_server/__main__.py
correctly handles these arguments.
ovos_stt_http_server/__init__.py
:def start_stt_server(engine: str, lang_engine: str = None, multi: bool = False, has_gradio: bool = False) -> (FastAPI, ModelContainer): app, engine = create_app(engine, lang_engine, multi, has_gradio) return app, engine
ovos_stt_http_server/__main__.py
:server, engine = start_stt_server(args.engine, lang_engine=args.lang_engine, multi=bool(args.multi), has_gradio=bool(args.gradio))Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the usage of `start_stt_server` with new arguments. # Test: Search for the function definition and other calls to ensure they handle the new arguments. rg --type python $'start_stt_server'Length of output: 66
Script:
#!/bin/bash # Description: Verify the usage of `start_stt_server` with new arguments. # Search for the function definition of `start_stt_server`. ast-grep --lang python --pattern $'def start_stt_server($_, $_, $_, $_):\n $$$' # Search for other calls to `start_stt_server` in the codebase. rg 'start_stt_server'Length of output: 368
Script:
#!/bin/bash # Description: Verify the usage of `start_stt_server` with new arguments. # Search for the full function definition of `start_stt_server` in `ovos_stt_http_server/__init__.py`. rg 'def start_stt_server' -A 10 ovos_stt_http_server/__init__.py # Search for other calls to `start_stt_server` in the codebase. rg 'start_stt_server' --context 5Length of output: 2875
ovos_stt_http_server/__init__.py (1)
113-117
: New/lang_detect
endpoint added.The endpoint for language detection has been implemented correctly. Ensure that this endpoint is included in the API documentation and that its functionality is covered by integration tests.
def __init__(self, plugin: str, lang_plugin: str = None, config: dict = None): | ||
plugin = load_stt_plugin(plugin) | ||
self.lang_plugin = None | ||
if not plugin: | ||
raise ValueError(f"Failed to load STT: {plugin}") | ||
if lang_plugin: | ||
lang_plugin = load_audio_transformer_plugin(lang_plugin) | ||
if not lang_plugin: | ||
raise ValueError(f"Failed to load lang detection plugin: {plugin}") | ||
assert issubclass(lang_plugin, AudioLanguageDetector) | ||
self.lang_plugin = lang_plugin() | ||
self.engine = plugin(config) | ||
|
||
def process_audio(self, audio: AudioData, lang: str): | ||
if lang == "auto": | ||
lang, prob = self.lang_plugin.detect(audio) | ||
return self.engine.execute(audio, language=lang) or "" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modifications in ModelContainer
and MultiModelContainer
.
The constructors have been modified to load language plugins and handle configurations. The process_audio
methods have been updated to use language detection when lang
is set to "auto". These changes are crucial for the new language detection feature and should be thoroughly tested.
Would you like me to help with writing unit tests for these changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JarbasAl, here are the unit tests for the changes in ModelContainer
and MultiModelContainer
:
import unittest
from unittest.mock import patch, MagicMock
from speech_recognition import AudioData
from ovos_plugin_manager.stt import load_stt_plugin
from ovos_plugin_manager.audio_transformers import load_audio_transformer_plugin, AudioLanguageDetector
class TestModelContainer(unittest.TestCase):
def setUp(self):
self.audio_data = AudioData(b'\x00\x01', 16000, 2)
@patch('ovos_plugin_manager.stt.load_stt_plugin')
@patch('ovos_plugin_manager.audio_transformers.load_audio_transformer_plugin')
def test_init(self, mock_load_audio, mock_load_stt):
mock_stt_plugin = MagicMock()
mock_lang_plugin = MagicMock()
mock_load_stt.return_value = mock_stt_plugin
mock_load_audio.return_value = mock_lang_plugin
# Test successful initialization
container = ModelContainer('dummy_stt', 'dummy_lang')
self.assertIsNotNone(container.engine)
self.assertIsNotNone(container.lang_plugin)
# Test initialization failure
mock_load_stt.return_value = None
with self.assertRaises(ValueError):
ModelContainer('dummy_stt', 'dummy_lang')
@patch('ovos_plugin_manager.audio_transformers.load_audio_transformer_plugin')
def test_process_audio_auto_lang(self, mock_load_audio):
mock_lang_plugin = MagicMock()
mock_lang_plugin.detect.return_value = ('en', 0.99)
mock_load_audio.return_value = mock_lang_plugin
container = ModelContainer('dummy_stt', 'dummy_lang')
container.lang_plugin = mock_lang_plugin
container.engine = MagicMock()
container.engine.execute.return_value = "processed audio"
result = container.process_audio(self.audio_data, 'auto')
self.assertEqual(result, "processed audio")
mock_lang_plugin.detect.assert_called_once_with(self.audio_data)
class TestMultiModelContainer(unittest.TestCase):
def setUp(self):
self.audio_data = AudioData(b'\x00\x01', 16000, 2)
@patch('ovos_plugin_manager.stt.load_stt_plugin')
@patch('ovos_plugin_manager.audio_transformers.load_audio_transformer_plugin')
def test_init(self, mock_load_audio, mock_load_stt):
mock_stt_plugin = MagicMock()
mock_lang_plugin = MagicMock()
mock_load_stt.return_value = mock_stt_plugin
mock_load_audio.return_value = mock_lang_plugin
# Test successful initialization
container = MultiModelContainer('dummy_stt', 'dummy_lang')
self.assertIsNotNone(container.plugin_class)
self.assertIsNotNone(container.lang_plugin)
# Test initialization failure
mock_load_stt.return_value = None
with self.assertRaises(ValueError):
MultiModelContainer('dummy_stt', 'dummy_lang')
@patch('ovos_plugin_manager.audio_transformers.load_audio_transformer_plugin')
def test_process_audio_auto_lang(self, mock_load_audio):
mock_lang_plugin = MagicMock()
mock_lang_plugin.detect.return_value = ('en', 0.99)
mock_load_audio.return_value = mock_lang_plugin
container = MultiModelContainer('dummy_stt', 'dummy_lang')
container.lang_plugin = mock_lang_plugin
container.engines = {'en': MagicMock()}
container.engines['en'].execute.return_value = "processed audio"
result = container.process_audio(self.audio_data, 'auto')
self.assertEqual(result, "processed audio")
mock_lang_plugin.detect.assert_called_once_with(self.audio_data.get_wav_data())
if __name__ == '__main__':
unittest.main()
These tests cover the initialization and processing logic in both ModelContainer
and MultiModelContainer
, ensuring that the language detection and STT processing work as expected when the language is set to "auto".
needs OpenVoiceOS/ovos-plugin-manager#220
Summary by CodeRabbit
New Features
ovos-stt-server
, including--lang-engine
for audio language detection and--multi
for multi-language plugin instances.title
,description
,info
, andbadge
./lang_detect
endpoint for language detection.Improvements
Bug Fixes