-
-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat/domain_engine #31
Conversation
WalkthroughThe changes in this pull request enhance the Changes
Possibly related PRs
Suggested labels
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (4)
ovos_padatious/domain_engine.py (2)
128-131
: Avoid reusing variable names to prevent confusionIn the
calc_intent
method, the parameterdomain
is reassigned within the method usingdomain = domain or self.domain_engine.calc_intent(query).name
. Reusing the variable namedomain
can lead to confusion and reduce code readability. Consider using a different variable name for the local assignment, such asresolved_domain
.Apply this diff to rename the local variable:
- domain: str = domain or self.domain_engine.calc_intent(query).name + resolved_domain = domain or self.domain_engine.calc_intent(query).name + if resolved_domain in self.domains: + return self.domains[resolved_domain].calc_intent(query) - if domain in self.domains: - return self.domains[domain].calc_intent(query)
147-154
: Clarify variable naming in loops to enhance readabilityIn the
calc_intents
method, the parameterdomain
and the loop variabledomain
represent different entities—one is a string, and the other is aMatchData
object. This reuse of the variable namedomain
may cause confusion. Consider renaming the loop variable to something likedomain_match
for clarity.Apply this diff to rename the loop variable:
matches = [] domains = self.calc_domains(query)[:top_k_domains] - for domain in domains: - if domain.name in self.domains: - matches += self.domains[domain.name].calc_intents(query) + for domain_match in domains: + if domain_match.name in self.domains: + matches += self.domains[domain_match.name].calc_intents(query)tests/test_domain.py (1)
5-5
: Remove placeholder comment in import statementLine 5 contains a placeholder comment that suggests replacing
'your_module'
with the actual module name. Since the module is correctly imported, the comment can be removed to clean up the code.Apply this diff to remove the comment:
- from ovos_padatious.domain_engine import DomainIntentEngine # Replace 'your_module' with the actual module name + from ovos_padatious.domain_engine import DomainIntentEngineovos_padatious/intent_container.py (1)
59-59
: Update docstring to reflect optionalcache_dir
parameterSince
cache_dir
is now optional and defaults to a computed path if not provided, the docstring should be updated to indicate thatcache_dir
is optional.Apply this diff to update the docstring:
""" Creates an IntentContainer object used to load and match intents Args: - cache_dir (str): Directory for caching the neural network models and intent/entity files. + cache_dir (Optional[str]): Directory for caching the neural network models and intent/entity files. If not provided, a default path is used. """
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
ovos_padatious/domain_engine.py
(1 hunks)ovos_padatious/intent_container.py
(3 hunks)ovos_padatious/intent_manager.py
(1 hunks)tests/test_domain.py
(1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.0)
tests/test_domain.py
3-3: ovos_padatious.intent_container.IntentContainer
imported but unused
Remove unused import: ovos_padatious.intent_container.IntentContainer
(F401)
🔇 Additional comments (2)
ovos_padatious/intent_manager.py (1)
25-28
: Addition of intent_names
property is appropriate
The new intent_names
property correctly provides a list of intent names from both trained intents and intents pending training. This enhances accessibility and aligns with best practices.
ovos_padatious/intent_container.py (1)
70-73
: The addition of intent_names
property enhances encapsulation
The new intent_names
property provides external access to the intent names managed by the IntentContainer
, promoting encapsulation and improving usability.
d15f923
to
582e12d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (2)
ovos_padatious/domain_container.py (1)
135-135
: Avoid reassigning function parameters with type annotationsIn line 135, the parameter
domain
is reassigned with a type annotation:domain: str = domain or self.domain_engine.calc_intent(query).nameReassigning a function parameter can lead to confusion and is generally discouraged. Additionally, re-annotating the variable
domain
is unnecessary since it is already typed in the function signature. Consider assigning the value to a new local variable without re-annotation:-domain: str = domain or self.domain_engine.calc_intent(query).name +current_domain = domain or self.domain_engine.calc_intent(query).nametests/test_domain.py (1)
4-4
: Remove placeholder comment in import statementIn line 4, there is a placeholder comment that is no longer needed:
from ovos_padatious.domain_container import DomainIntentContainer # Replace 'your_module' with the actual module nameSince the module import is already correct, remove the placeholder comment to clean up the code:
-from ovos_padatious.domain_container import DomainIntentContainer # Replace 'your_module' with the actual module name +from ovos_padatious.domain_container import DomainIntentContainer
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
ovos_padatious/domain_container.py
(1 hunks)ovos_padatious/intent_container.py
(3 hunks)ovos_padatious/intent_manager.py
(1 hunks)ovos_padatious/opm.py
(11 hunks)tests/test_domain.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- ovos_padatious/intent_manager.py
- ovos_padatious/intent_container.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
ovos_padatious/opm.py (1)
117-127
: Consider adding validation for engine_class parameterWhile the implementation is correct, it would be beneficial to validate that the provided engine_class is a valid PadatiousEngine type.
self.engine_class = engine_class or IntentContainer +if not issubclass(self.engine_class, (IntentContainer, DomainIntentContainer)): + raise ValueError(f"Invalid engine_class: {engine_class}. Must be IntentContainer or DomainIntentContainer")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
ovos_padatious/opm.py
(11 hunks)
🔇 Additional comments (7)
ovos_padatious/opm.py (7)
36-36
: Correct the import statement for DomainIntentContainer
The import statement is incorrect as noted in previous reviews.
-from ovos_padatious.domain_container import DomainIntentContainer
+from ovos_padatious.domain_engine import DomainIntentContainer
41-43
: LGTM! Well-defined type alias.
The PadatiousEngine
type alias improves code readability and maintainability by clearly defining the allowed engine types.
Line range hint 98-116
: LGTM! Well-implemented engine class selection.
The implementation provides flexibility in engine selection while maintaining backward compatibility through config.
241-246
: LGTM! Proper handling of intent removal for both container types.
The implementation correctly differentiates between container types and uses the appropriate removal method.
Line range hint 274-292
: LGTM! Well-structured registration logic.
The implementation properly handles both container types and correctly manages skill_id.
304-306
: Ensure skill_id
is properly obtained and handled
The skill_id handling could be improved to prevent None values.
skill_id = message.data.get("skill_id") or message.context.get("skill_id")
+if not skill_id:
+ LOG.error("Skill ID is missing. Cannot register intent without a valid skill ID.")
+ return
self._skill2intent[skill_id].append(message.data['name'])
422-422
: LGTM! Accurate type annotation.
The type annotation correctly represents the possible container types.
match the skill first, and then the intent
match the skill first, and then the intent
e103c75
to
15f91fd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (5)
ovos_padatious/domain_container.py (2)
24-27
: Consider using type hints for instance variables.The instance variables initialized in
__init__
should have type hints for better code maintainability and IDE support.- self.cache_dir = cache_dir - self.domain_engine = IntentContainer(cache_dir=cache_dir) - self.domains: Dict[str, IntentContainer] = {} - self.training_data: Dict[str, List[str]] = defaultdict(list) - self.must_train = True + self.cache_dir: Optional[str] = cache_dir + self.domain_engine: IntentContainer = IntentContainer(cache_dir=cache_dir) + self.domains: Dict[str, IntentContainer] = {} + self.training_data: Dict[str, List[str]] = defaultdict(list) + self.must_train: bool = True
163-171
: Add docstring to train method.The
train
method lacks a docstring explaining its purpose and behavior.def train(self): + """ + Train the domain engine and all domain-specific intent containers. + + This method trains both the top-level domain engine and all individual + domain containers. It sets must_train to False after successful training. + """ for domain, samples in self.training_data.items(): LOG.debug(f"Training domain: {domain}") self.domain_engine.add_intent(domain, samples)tests/test_domain.py (2)
4-4
: Remove outdated comment.The comment
# Replace 'your_module' with the actual module name
is no longer needed as the correct module is already imported.-from ovos_padatious.domain_container import DomainIntentContainer # Replace 'your_module' with the actual module name +from ovos_padatious.domain_container import DomainIntentContainer
89-111
: Consider moving test data to a separate fixture file.The test data dictionary is quite large and could be moved to a separate fixture file to improve test readability and maintainability.
Consider creating a
conftest.py
ortest_data.py
file to store the training data dictionary.ovos_padatious/intent_container.py (1)
74-77
: Add return type hint and docstring to intent_names property.The new property lacks a return type hint and proper documentation.
@property - def intent_names(self): + def intent_names(self) -> List[str]: + """ + Get the list of registered intent names. + + Returns: + List[str]: A list of intent names managed by this container. + """ return self.intents.intent_names
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
ovos_padatious/domain_container.py
(1 hunks)ovos_padatious/intent_container.py
(3 hunks)ovos_padatious/intent_manager.py
(1 hunks)ovos_padatious/opm.py
(11 hunks)tests/test_domain.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- ovos_padatious/intent_manager.py
🧰 Additional context used
🪛 Ruff (0.8.2)
ovos_padatious/opm.py
43-43: Redefinition of unused PadatiousIntent
from line 33
Remove definition: PadatiousIntent
(F811)
45-45: Redefinition of unused PadatiousIntentContainer
from line 32
(F811)
🔇 Additional comments (1)
ovos_padatious/opm.py (1)
486-488
:⚠️ Potential issueAdd null check for skill_id.
The skill_id could be None, which might lead to issues when managing intents.
skill_id = message.data.get("skill_id") or message.context.get("skill_id") + if not skill_id: + LOG.error("Skill ID is missing. Cannot register intent without a valid skill ID.") + return self._skill2intent[skill_id].append(message.data['name'])Likely invalid or redundant comment.
ovos_padatious/opm.py
Outdated
from ovos_padatious import IntentContainer | ||
from ovos_padatious.domain_container import DomainIntentContainer | ||
from ovos_padatious.match_data import MatchData as PadatiousIntent | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove duplicate imports and fix import order.
There are duplicate imports of IntentContainer
and MatchData
, and the import order could be improved.
-from ovos_padatious import IntentContainer
-from ovos_padatious.domain_container import DomainIntentContainer
-from ovos_padatious.match_data import MatchData as PadatiousIntent
-
-PadatiousIntentContainer = IntentContainer # backwards compat
Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.8.2)
43-43: Redefinition of unused PadatiousIntent
from line 33
Remove definition: PadatiousIntent
(F811)
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (9)
ovos_padatious/domain_container.py (6)
14-30
: Add input validation in the constructor.Consider adding validation for the
cache_dir
path and ensuring it's writable when provided.def __init__(self, cache_dir: Optional[str] = None, disable_padaos: bool = False): + if cache_dir and not os.access(os.path.dirname(cache_dir), os.W_OK): + raise ValueError(f"Cache directory {cache_dir} is not writable") self.cache_dir = cache_dir self.disable_padaos = disable_padaos
31-44
: Add debug logging for domain operations.Consider adding debug logs to track domain removal operations.
def remove_domain(self, domain_name: str): + LOG.debug(f"Removing domain: {domain_name}") if domain_name in self.training_data: self.training_data.pop(domain_name) + LOG.debug(f"Removed training data for domain: {domain_name}") if domain_name in self.domains: self.domains.pop(domain_name) + LOG.debug(f"Removed domain container for: {domain_name}") if domain_name in self.domain_engine.intent_names: self.domain_engine.remove_intent(domain_name) + LOG.debug(f"Removed domain from engine: {domain_name}")
45-71
: Add input validation for intent registration.Consider validating input parameters and handling edge cases.
def add_domain_intent(self, domain_name: str, intent_name: str, intent_samples: List[str]): + if not domain_name or not intent_name: + raise ValueError("Domain name and intent name must not be empty") + if not intent_samples: + raise ValueError("Intent samples list must not be empty") if domain_name not in self.domains: self.domains[domain_name] = IntentContainer(cache_dir=self.cache_dir, disable_padaos=self.disable_padaos)
72-96
: Extract common validation logic.Consider creating a helper method for common validation logic used in both intent and entity management.
+def _validate_domain_object(domain_name: str, object_name: str, samples: List[str]) -> None: + if not domain_name or not object_name: + raise ValueError("Domain name and object name must not be empty") + if not samples: + raise ValueError("Samples list must not be empty") + def add_domain_entity(self, domain_name: str, entity_name: str, entity_samples: List[str]): + self._validate_domain_object(domain_name, entity_name, entity_samples) if domain_name not in self.domains: self.domains[domain_name] = IntentContainer(cache_dir=self.cache_dir, disable_padaos=self.disable_padaos)
97-125
: Consider caching domain calculation results.The domain calculation methods could benefit from caching frequently accessed results.
+from functools import lru_cache + +@lru_cache(maxsize=100) def calc_domain(self, query: str) -> MatchData: if self.must_train: self.train() return self.domain_engine.calc_intent(query)
167-175
: Add training progress tracking.Consider adding progress tracking for better monitoring of the training process.
def train(self): + total_domains = len(self.training_data) + trained_domains = 0 for domain, samples in self.training_data.items(): LOG.debug(f"Training domain: {domain}") self.domain_engine.add_intent(domain, samples) + trained_domains += 1 + LOG.debug(f"Training progress: {trained_domains}/{total_domains} domains")ovos_padatious/intent_container.py (1)
74-76
: Add property documentation.Consider adding docstring to the
intent_names
property.@property def intent_names(self): + """ + Get the list of registered intent names. + + Returns: + List[str]: List of registered intent names. + """ return self.intents.intent_namesovos_padatious/opm.py (2)
436-440
: Enhance logging for skill detachment.Consider adding more detailed logging for skill detachment operations.
skill_id = message.data.get("skill_id") or message.context.get("skill_id") if not skill_id: LOG.warning("Skill ID is missing. Detaching all anonymous intents") skill_id = "anonymous_skill" + LOG.debug(f"Detaching all intents for skill: {skill_id}") for i in self._skill2intent[skill_id]: + LOG.debug(f"Detaching intent: {i}") self.__detach_intent(i)
451-454
: Add validation for registration parameters.Consider validating registration parameters before processing.
skill_id = message.data.get("skill_id") or message.context.get("skill_id") if not skill_id: LOG.warning("Skill ID is missing. Registering under 'anonymous_skill'") skill_id = "anonymous_skill" + if not name: + raise ValueError("Object name must not be empty") + if not samples: + raise ValueError("Samples must not be empty")Also applies to: 479-483
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
ovos_padatious/domain_container.py
(1 hunks)ovos_padatious/intent_container.py
(3 hunks)ovos_padatious/opm.py
(12 hunks)
🔇 Additional comments (6)
ovos_padatious/domain_container.py (1)
139-139
:⚠️ Potential issueFix type hint reassignment.
The type hint for
domain
is being reassigned in the same line where it's used.- domain: str = domain or self.domain_engine.calc_intent(query).name + domain_name: str = domain or self.domain_engine.calc_intent(query).nameLikely invalid or redundant comment.
ovos_padatious/intent_container.py (1)
59-60
: 🛠️ Refactor suggestionUse
os.path.join
for path construction.String concatenation for paths can lead to issues across different operating systems.
- cache_dir = cache_dir or f"{xdg_data_home()}/{get_xdg_base()}/intent_cache" + cache_dir = cache_dir or os.path.join(xdg_data_home(), get_xdg_base(), "intent_cache")Likely invalid or redundant comment.
ovos_padatious/opm.py (4)
412-420
: LGTM! Good error handling.The implementation properly handles intent removal with appropriate error handling for different container types.
495-509
: LGTM! Well-structured intent registration.The implementation properly handles intent registration for different container types while maintaining skill-to-intent mappings.
521-524
: LGTM! Consistent with intent registration.The entity registration implementation follows the same pattern as intent registration, maintaining consistency.
617-617
: LGTM! Improved type safety.The type hint update properly reflects the possible container types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
ovos_padatious/__init__.py (2)
16-16
: Add DomainIntentContainer to all to properly expose it.The import is correctly added but should be explicitly exposed in the module's public API.
Add the following before the imports:
+__all__ = ['IntentContainer', 'DomainIntentContainer', 'MatchData']
This will:
- Silence the static analysis warning
- Explicitly document the public API
- Control what gets imported with
from ovos_padatious import *
🧰 Tools
🪛 Ruff (0.8.2)
16-16:
.domain_container.DomainIntentContainer
imported but unused; consider removing, adding to__all__
, or using a redundant alias(F401)
16-16
: Document the usage of DomainIntentContainer.Since this introduces a new way to handle intents with domain awareness, it would be helpful to:
- Add docstrings explaining when to use DomainIntentContainer vs IntentContainer
- Update the README with examples of domain-aware intent recognition
- Document how this implements the "match skill first, then intent" approach
Would you like me to help draft the documentation updates?
🧰 Tools
🪛 Ruff (0.8.2)
16-16:
.domain_container.DomainIntentContainer
imported but unused; consider removing, adding to__all__
, or using a redundant alias(F401)
ovos_padatious/opm.py (1)
Line range hint
257-282
: Add type validation for engine_class parameter.While the type hint helps with static type checking, adding runtime validation would prevent potential issues with invalid engine classes.
def __init__(self, bus: Optional[Union[MessageBusClient, FakeBus]] = None, config: Optional[Dict] = None, engine_class: Optional[PadatiousEngine] = IntentContainer): + if engine_class and not issubclass(engine_class, (IntentContainer, DomainIntentContainer)): + raise ValueError("engine_class must be IntentContainer or DomainIntentContainer") super().__init__(bus, config)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
ovos_padatious/__init__.py
(1 hunks)ovos_padatious/intent_manager.py
(1 hunks)ovos_padatious/opm.py
(13 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- ovos_padatious/intent_manager.py
🧰 Additional context used
🪛 Ruff (0.8.2)
ovos_padatious/__init__.py
16-16: .domain_container.DomainIntentContainer
imported but unused; consider removing, adding to __all__
, or using a redundant alias
(F401)
🔇 Additional comments (7)
ovos_padatious/opm.py (7)
Line range hint
19-47
: LGTM! Well-structured type definitions and imports.The new type definition
PadatiousEngine
properly constrains the engine class types, and the backward compatibility alias helps maintain API stability.
412-420
: LGTM! Robust error handling for intent removal.The try-except block and container type check ensure graceful handling of intent removal failures.
436-440
: LGTM! Proper handling of missing skill IDs.The code gracefully handles missing skill IDs with appropriate fallback and warning messages.
479-483
: LGTM! Clean separation of registration logic.The conditional registration properly handles both container types while maintaining code clarity.
495-509
: LGTM! Comprehensive intent registration handling.The code properly handles skill IDs, provides appropriate warnings, and correctly routes registration based on container type.
521-524
: LGTM! Consistent entity registration handling.The entity registration follows the same pattern as intent registration, maintaining code consistency.
617-617
: LGTM! Updated type hint for container flexibility.The type hint correctly reflects support for both container types.
match the skill first, and then the intent
Summary by CodeRabbit
New Features
DomainIntentContainer
class for domain-aware intent recognition.intent_names
property to bothIntentContainer
andIntentManager
classes for easier access to intent names.PadatiousPipeline
to accept different intent container types.Bug Fixes
Tests
DomainIntentContainer
to ensure correct intent registration and calculations.Chores
DomainIntentContainer
.