-
-
Notifications
You must be signed in to change notification settings - Fork 2
feat/lang_detection_plugin #19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@coderabbitai review |
Actions performedReview triggered.
|
WalkthroughThe recent updates enhance the Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant CLI
participant STTServer
participant LangPlugin
participant MultiModelContainer
User->>CLI: Run ovos-stt-server with lang detection
CLI->>STTServer: Start server with --lang-engine and --multi options
STTServer->>LangPlugin: Load language detection plugin
STTServer->>MultiModelContainer: Initialize with multi-language support
User->>STTServer: Send audio for processing
STTServer->>LangPlugin: Detect language
LangPlugin->>STTServer: Return detected language
STTServer->>MultiModelContainer: Process audio with detected language model
MultiModelContainer->>User: Return transcription result
Possibly related PRs
Suggested reviewers
Poem
Warning Rate limit exceeded@JarbasAl has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 11 minutes and 18 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Outside diff range, codebase verification and nitpick comments (2)
requirements/requirements.txt (1)
5-6
: Addition offlask
to requirements.
flask
has been added without a version specification. It's crucial to specify version constraints to avoid potential future compatibility issues.- flask + flask~=2.0README.md (1)
52-52
: Grammatical correction needed in example usage.The abbreviation "e.g." should be used with periods as indicated by LanguageTool.
- eg `ovos-stt-server --engine ovos-stt-plugin-fasterwhisper --lang-engine ovos-audio-transformer-plugin-fasterwhisper` + e.g., `ovos-stt-server --engine ovos-stt-plugin-fasterwhisper --lang-engine ovos-audio-transformer-plugin-fasterwhisper`
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (6)
- .github/workflows/publish_alpha.yml (1 hunks)
- README.md (1 hunks)
- ovos_stt_http_server/init.py (3 hunks)
- ovos_stt_http_server/main.py (2 hunks)
- requirements/requirements.txt (1 hunks)
- setup.py (1 hunks)
Files skipped from review due to trivial changes (2)
- .github/workflows/publish_alpha.yml
- setup.py
Additional context used
LanguageTool
README.md
[uncategorized] ~51-~51: The abbreviation “e.g.” (= for example) requires two periods.
Context: ... Note:ffmpeg
is required for Gradio eg `ovos-stt-server --engine ovos-stt-plug...(E_G)
Additional comments not posted (5)
requirements/requirements.txt (1)
1-1
: Version specification forovos-plugin-manager
updated.The version change from
~=0.0.18
to~=0.0, >=0.0.26a15
broadens the acceptable version range and specifies a minimum version. This should be verified to ensure compatibility with the new features introduced in the PR.README.md (1)
33-48
: Updated command line options forovos-stt-server
.The changes to the command line options, including the addition of
--lang-engine
and--multi
, align with the PR's objectives to enhance language detection capabilities. Ensure that these changes are reflected in the user documentation and help outputs.Verification successful
Updated command line options for
ovos-stt-server
have been successfully documented.The new command line options
--lang-engine
and--multi
are present in theREADME.md
file, ensuring that the documentation is consistent with the changes.
README.md
: Lines mentioning--lang-engine
and--multi
.Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify that the new command line options are documented. # Test: Search for mentions of the new options in the documentation. rg --type markdown $'--lang-engine'Length of output: 450
Script:
#!/bin/bash # Description: Verify that the new command line options are documented. # Corrected Test: Search for mentions of the new options in the documentation. rg --type markdown '--lang-engine' rg --type markdown '--multi'Length of output: 916
Script:
#!/bin/bash # Description: Verify that the new command line options are documented. # Corrected Test: Search for mentions of the new options in the documentation. rg --type markdown -- '--lang-engine' rg --type markdown -- '--multi'Length of output: 317
ovos_stt_http_server/__main__.py (2)
24-30
: Addition of--lang-engine
and--multi
arguments.The new arguments are added correctly and are marked as required for
--lang-engine
and optional for--multi
. This aligns with the PR's objectives to support multilingual plugins and language detection.
44-46
: Modification instart_stt_server
function call.The function now includes
lang_engine
andmulti
arguments, which is necessary for the new features. Verify that these changes are compatible with the rest of the application, especially in how thestart_stt_server
function is defined and used elsewhere.Verification successful
Modification in
start_stt_server
function call verified.The
start_stt_server
function definition inovos_stt_http_server/__init__.py
includes thelang_engine
andmulti
arguments, and the only other call to this function inovos_stt_http_server/__main__.py
correctly handles these arguments.
ovos_stt_http_server/__init__.py
:def start_stt_server(engine: str, lang_engine: str = None, multi: bool = False, has_gradio: bool = False) -> (FastAPI, ModelContainer): app, engine = create_app(engine, lang_engine, multi, has_gradio) return app, engine
ovos_stt_http_server/__main__.py
:server, engine = start_stt_server(args.engine, lang_engine=args.lang_engine, multi=bool(args.multi), has_gradio=bool(args.gradio))Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the usage of `start_stt_server` with new arguments. # Test: Search for the function definition and other calls to ensure they handle the new arguments. rg --type python $'start_stt_server'Length of output: 66
Script:
#!/bin/bash # Description: Verify the usage of `start_stt_server` with new arguments. # Search for the function definition of `start_stt_server`. ast-grep --lang python --pattern $'def start_stt_server($_, $_, $_, $_):\n $$$' # Search for other calls to `start_stt_server` in the codebase. rg 'start_stt_server'Length of output: 368
Script:
#!/bin/bash # Description: Verify the usage of `start_stt_server` with new arguments. # Search for the full function definition of `start_stt_server` in `ovos_stt_http_server/__init__.py`. rg 'def start_stt_server' -A 10 ovos_stt_http_server/__init__.py # Search for other calls to `start_stt_server` in the codebase. rg 'start_stt_server' --context 5Length of output: 2875
ovos_stt_http_server/__init__.py (1)
113-117
: New/lang_detect
endpoint added.The endpoint for language detection has been implemented correctly. Ensure that this endpoint is included in the API documentation and that its functionality is covered by integration tests.
Update requirements.txt feat/lang_detection_plugin
84505cb
to
f51e109
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🧹 Outside diff range and nitpick comments (8)
requirements/requirements.txt (1)
Line range hint
6-6
: Add version constraint for flaskThe flask dependency lacks version constraints, which could lead to compatibility issues or breaking changes when new versions are released.
-flask +flask~=2.3.3README.md (3)
33-47
: Consider reorganizing command line options for better readabilityThe command line options could be organized into logical groups to improve readability:
- Core options (engine, lang-engine, port, host, lang, multi)
- Web UI options (gradio, cache, title, description, info, badge)
Consider reorganizing the help text like this:
usage: ovos-stt-server [-h] [--engine ENGINE] [--port PORT] [--host HOST] Core Options: -h, --help show this help message and exit --engine ENGINE stt plugin to be used --lang-engine LANG_ENGINE audio language detection plugin to be used (optional) --port PORT port number --host HOST host --lang LANG default language supported by plugin (default comes from mycroft.conf) --multi Load a plugin instance per language (force lang support, loads multiple plugins into memory) Web UI Options: --gradio Enable Gradio Web UI --cache Cache models for Gradio demo --title TITLE Title for webUI --description DESCRIPTION Text description to print in UI --info INFO Text to display at end of UI --badge BADGE URL of visitor badge
51-51
: Add more usage examples for new featuresWhile the example demonstrates the
--lang-engine
option, it would be helpful to include examples showing:
- Usage of the
--multi
flag- Language auto-detection with "auto" setting
- Web UI configuration
Consider adding examples like:
# Basic usage with language engine ovos-stt-server --engine ovos-stt-plugin-fasterwhisper --lang-engine ovos-audio-transformer-plugin-fasterwhisper # Multi-model usage for multiple languages ovos-stt-server --engine ovos-stt-plugin-fasterwhisper --multi # Web UI with customization ovos-stt-server --engine ovos-stt-plugin-fasterwhisper --gradio --title "My STT Server" --description "Custom STT service"
51-51
: Fix formatting: Replace "eg" with "e.g."For consistency and proper formatting, replace all instances of "eg" with "e.g." throughout the document.
-eg `ovos-stt-server --engine ovos-stt-plugin-fasterwhisper --lang-engine ovos-audio-transformer-plugin-fasterwhisper` +e.g., `ovos-stt-server --engine ovos-stt-plugin-fasterwhisper --lang-engine ovos-audio-transformer-plugin-fasterwhisper` -eg +e.g.,Also applies to: 14-14, 20-20
ovos_stt_http_server/__main__.py (4)
31-32
: Improve help text clarity for the--multi
flagThe current help text "Load a plugin instance per language (force lang support)" could be clearer about its purpose and implications.
Consider this improved help text:
- parser.add_argument("--multi", help="Load a plugin instance per language (force lang support)", + parser.add_argument("--multi", help="Enable multi-language support by loading separate plugin instances for each language", action="store_true")
29-30
: Consider adding language code validationThe language code from configuration or command line should be validated to ensure it follows a standard format (e.g., ISO 639-1 or BCP 47).
Consider adding validation:
def validate_lang_code(lang_code: str) -> str: # Basic validation for language codes like "en-us" if not re.match(r'^[a-z]{2,3}(-[a-z]{2,4})?$', lang_code.lower()): raise argparse.ArgumentTypeError(f"Invalid language code format: {lang_code}") return lang_code.lower() parser.add_argument("--lang", help="default language supported by plugin (e.g., en-us)", type=validate_lang_code, default=Configuration().get("lang", "en-us"))
46-48
: Remove redundant bool conversionThe
bool()
conversion is unnecessary forargs.multi
sinceargparse
'sstore_true
action already returns a boolean value.- server, engine = start_stt_server(args.engine, lang_engine=args.lang_engine, - multi=bool(args.multi), - has_gradio=bool(args.gradio)) + server, engine = start_stt_server(args.engine, lang_engine=args.lang_engine, + multi=args.multi, + has_gradio=args.gradio)
26-32
: Consider caching language detection resultsSince language detection is computationally expensive, consider implementing a caching mechanism for the language detection results, especially when processing multiple audio segments from the same source or speaker.
This could be implemented as:
- An in-memory cache with TTL
- A Redis cache for distributed setups
- A local file-based cache for development
Would you like me to provide a detailed implementation suggestion for any of these approaches?
Also applies to: 46-48
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (6)
.github/workflows/publish_alpha.yml
(1 hunks)README.md
(1 hunks)ovos_stt_http_server/__init__.py
(2 hunks)ovos_stt_http_server/__main__.py
(2 hunks)requirements/requirements.txt
(1 hunks)setup.py
(1 hunks)
✅ Files skipped from review due to trivial changes (2)
- .github/workflows/publish_alpha.yml
- setup.py
🧰 Additional context used
🪛 LanguageTool
README.md
[uncategorized] ~50-~50: The abbreviation “e.g.” (= for example) requires two periods.
Context: ... Note: ffmpeg
is required for Gradio eg `ovos-stt-server --engine ovos-stt-plug...
(E_G)
🔇 Additional comments (3)
requirements/requirements.txt (1)
Line range hint 5-5
: Version specification concern for ovos-utils
Referring to the previous review comment about specifying an upper version bound for ovos-utils to prevent potential compatibility issues.
ovos_stt_http_server/__main__.py (2)
14-20
: LGTM! Clean import organization
The addition of Configuration
from ovos_config
is appropriate for centralizing configuration management.
46-48
: Verify start_stt_server interface changes
Let's verify that the changes to start_stt_server
signature are properly implemented across the codebase.
✅ Verification successful
Interface changes to start_stt_server
are correctly implemented
The function signature in ovos_stt_http_server/__init__.py
matches exactly with its usage in ovos_stt_http_server/__main__.py
. The function is defined with parameters (engine: str, lang_engine: str = None, multi: bool = False, has_gradio: bool = False)
and is called with corresponding arguments. No other usages were found in the codebase.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for start_stt_server function definition and usages
echo "Searching for start_stt_server definition:"
ast-grep --pattern 'def start_stt_server($$$)'
echo -e "\nSearching for start_stt_server calls:"
rg "start_stt_server\(" --type python
Length of output: 305
Script:
#!/bin/bash
# Search for start_stt_server definition and usages with correct file extensions
echo "Searching for start_stt_server definition:"
rg "def start_stt_server" -A 3
echo -e "\nSearching for start_stt_server calls:"
rg "start_stt_server\("
Length of output: 756
companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (1)
ovos_stt_http_server/__init__.py (1)
24-24
: Remove commented codeThe commented LOG level setting appears to be unused. Consider removing it if it's not needed, or document why it's kept as a reference.
companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models
companion to OpenVoiceOS/ovos-plugin-manager#289 allow lang detection without initing 2 models
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
ovos_stt_http_server/__init__.py
(2 hunks)
🔇 Additional comments (5)
ovos_stt_http_server/__init__.py (5)
28-43
: LGTM: Constructor implementation looks good!
The initialization logic properly handles both STT and language detection plugins, with appropriate error handling and validation.
71-72
:
Add fallback mechanism for language detection
The method should fall back to engine detection when lang_plugin
is None, similar to ModelContainer.
Apply this diff:
def detect_language(self, audio, valid_langs: Optional[Union[Set[str], List[str]]] = None) -> Tuple[str, float]:
+ if self.lang_plugin is None:
+ # Use the first available engine for detection
+ engine = next(iter(self.engines.values())) if self.engines else self.get_engine("en")
+ return engine.detect_language(audio, valid_langs)
return self.lang_plugin.detect(audio, valid_langs)
Likely invalid or redundant comment.
50-51
:
Implement language detection in process_audio
The method doesn't handle the "auto" language case as described in the PR objectives. When lang="auto"
, it should perform language detection before STT processing.
Apply this diff:
def process_audio(self, audio: AudioData, lang: str = "auto"):
+ if lang == "auto":
+ if not self.lang_plugin:
+ raise ValueError("Language detection plugin is not available")
+ lang, _ = self.detect_language(audio)
return self.engine.execute(audio, language=lang) or ""
Likely invalid or redundant comment.
128-133
:
Improve error handling and parameter parsing in language detection endpoint
The endpoint needs better handling of the valid_langs parameter and error handling for invalid audio data.
Apply this diff:
@app.post("/lang_detect")
async def get_lang(request: Request):
- valid = request.query_params.get("valid_langs", [])
+ valid = request.query_params.get("valid_langs", "").split(",")
+ valid = [lang.strip() for lang in valid if lang.strip()]
audio_bytes = await request.body()
+ if not audio_bytes:
+ return {"error": "No audio data provided"}, 400
+ try:
+ lang, prob = model.detect_language(audio_bytes, valid_langs=valid or None)
+ return {"lang": lang, "conf": prob}
+ except Exception as e:
+ return {"error": str(e)}, 400
- lang, prob = model.detect_language(audio_bytes, valid_langs=valid)
- return {"lang": lang, "conf": prob}
Likely invalid or redundant comment.
121-125
: 🛠️ Refactor suggestion
Optimize configuration handling and improve error handling in STT endpoint
The endpoint creates a new Configuration instance on every request and needs better error handling.
Apply this diff:
+ config = Configuration() # Move to module level
+
@app.post("/stt")
async def get_stt(request: Request):
- lang = str(request.query_params.get("lang", Configuration().get("lang", "auto"))).lower()
+ lang = str(request.query_params.get("lang", config.get("lang", "auto"))).lower()
audio_bytes = await request.body()
+ if not audio_bytes:
+ return {"error": "No audio data provided"}, 400
audio = bytes2audiodata(audio_bytes)
+ try:
if lang == "auto":
lang, prob = model.detect_language(audio_bytes)
- return model.process_audio(audio, lang)
+ result = model.process_audio(audio, lang)
+ return {"text": result}
+ except Exception as e:
+ return {"error": str(e)}, 400
Likely invalid or redundant comment.
def process_audio(self, audio: AudioData, lang: str): | ||
engine = self.get_engine(lang) | ||
return engine.execute(audio, language=lang) or "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add language detection support in process_audio
Similar to ModelContainer, this method should handle "auto" language detection.
Apply this diff:
def process_audio(self, audio: AudioData, lang: str):
+ if lang == "auto":
+ if not self.lang_plugin:
+ raise ValueError("Language detection plugin is not available")
+ lang, _ = self.detect_language(audio)
engine = self.get_engine(lang)
return engine.execute(audio, language=lang) or ""
Committable suggestion skipped: line range outside the PR's diff.
needs OpenVoiceOS/ovos-plugin-manager#220
Summary by CodeRabbit
New Features
ovos-stt-server
, including--lang-engine
for audio language detection and--multi
for multi-language plugin instances./lang_detect
endpoint for language detection.Improvements
Dependency Updates
ovos-plugin-manager
in the requirements.Bug Fixes