-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rename formatters submodule and classes to converters #101
Conversation
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThe pull request introduces significant changes to the documentation and structure of the Changes
Assessment against linked issues
Possibly related PRs
Suggested reviewers
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #101 +/- ##
===========================================
+ Coverage 96.94% 98.04% +1.10%
===========================================
Files 10 10
Lines 458 460 +2
===========================================
+ Hits 444 451 +7
+ Misses 14 9 -5 ☔ View full report in Codecov by Sentry. |
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (23)
tests/test_converters/test_edtf.py (3)
17-24
: Clarify EDTF support for dates with missing years.The TODO comment on line 22 raises an important question about EDTF support for dates with missing years. This should be clarified to ensure proper test coverage.
Would you like me to:
- Research EDTF specification to confirm support for this format?
- Open a GitHub issue to track this question?
29-33
: Expand range parsing test coverage.The current test only covers a basic year range. Consider adding more range scenarios.
Add these test cases:
def test_parse_range(self): # Current test assert EDTFDateConverter().parse("1800/1900") == UndateInterval( Undate(1800), Undate(1900) ) # Additional cases assert EDTFDateConverter().parse("1991-05/1991-06") == UndateInterval( Undate(1991, 5), Undate(1991, 6) ) assert EDTFDateConverter().parse("1991-05-01/1991-05-31") == UndateInterval( Undate(1991, 5, 1), Undate(1991, 5, 31) ) # Test with unknown components assert EDTFDateConverter().parse("19XX/20XX") == UndateInterval( Undate("19XX"), Undate("20XX") )
34-47
: Address TODO and enhance string conversion tests.The TODO on line 47 about overriding missing digits needs to be implemented. Also, consider adding tests for interval string conversion.
Would you like me to help implement the missing digit replacement tests?
Add these test cases:
def test_to_string_intervals(self): interval = UndateInterval(Undate(1800), Undate(1900)) assert EDTFDateConverter().to_string(interval) == "1800/1900" interval = UndateInterval(Undate(1991, 5), Undate(1991, 6)) assert EDTFDateConverter().to_string(interval) == "1991-05/1991-06"tests/test_converters/test_base.py (5)
8-18
: Consider enhancing converter interface verification.While the test correctly verifies the presence and mapping of ISO8601DateFormat, consider adding assertions to verify that the returned converter class adheres to the expected interface.
Add these assertions:
def test_available_converters(self): available_converters = BaseDateConverter.available_converters() assert isinstance(available_converters, dict) from undate.converters.iso8601 import ISO8601DateFormat assert ISO8601DateFormat.name in available_converters assert available_converters[ISO8601DateFormat.name] == ISO8601DateFormat + # Verify converter interface + converter_class = available_converters[ISO8601DateFormat.name] + assert hasattr(converter_class, 'parse') + assert hasattr(converter_class, 'to_string') + assert hasattr(converter_class, 'name')
19-23
: Enhance error message for duplicate converter detection.The current error message doesn't help identify which converters are duplicated.
Consider this enhancement:
def test_converters_are_unique(self): + available = BaseDateConverter.available_converters() + converter_names = [conv.name for conv in BaseDateConverter.__subclasses__()] + duplicates = [name for name in converter_names if converter_names.count(name) > 1] assert len(BaseDateConverter.available_converters()) == len( BaseDateConverter.__subclasses__() - ), "Formatter names have to be unique." + ), f"Converter names must be unique. Duplicates found: {duplicates}"
24-31
: Consider adding edge cases for abstract method tests.The tests verify the abstract nature of the methods but could be more comprehensive.
Consider adding these test cases:
def test_parse_not_implemented(self): with pytest.raises(NotImplementedError): BaseDateConverter().parse("foo bar baz") + with pytest.raises(NotImplementedError): + BaseDateConverter().parse(None) + with pytest.raises(NotImplementedError): + BaseDateConverter().parse("") def test_parse_to_string(self): with pytest.raises(NotImplementedError): BaseDateConverter().to_string(1991) + with pytest.raises(NotImplementedError): + BaseDateConverter().to_string(None) + with pytest.raises(NotImplementedError): + BaseDateConverter().to_string("")
33-51
: Consider verifying exact import count.The test verifies caching behavior but could be more specific about expected imports.
Consider enhancing the verification:
def test_import_converters_import_only_once(caplog): BaseDateConverter.import_converters.cache_clear() with caplog.at_level(logging.DEBUG): import_count = BaseDateConverter.import_converters() - assert import_count >= 1 + # Get actual subclass count for verification + expected_count = len(BaseDateConverter.__subclasses__()) + assert import_count == expected_count, f"Expected {expected_count} converters to be imported"
53-64
: Enhance duplicate converter test verification.While the test correctly verifies the duplicate name scenario, it could be more explicit about the error condition.
Consider this enhancement:
@pytest.mark.last def test_converters_unique_error(): - # confirm that unique converter check fails when it should + # Store initial converter count + initial_count = len(BaseDateConverter.__subclasses__()) class ISO8601DateFormat2(BaseDateConverter): name = "ISO8601" # duplicates existing formatter - assert len(BaseDateConverter.available_converters()) != len( - BaseDateConverter.__subclasses__() - ) + # Verify subclass was added but not included in available converters + assert len(BaseDateConverter.__subclasses__()) == initial_count + 1 + available = BaseDateConverter.available_converters() + assert len(available) == initial_count + assert "ISO8601" in availablesrc/undate/converters/edtf/converter.py (4)
15-21
: LGTM! Class renaming and structure look good.The class renaming aligns with the broader objective of transitioning from formatters to converters. The addition of the
name
class attribute improves class identification.Consider adding a class-level docstring to describe the purpose and capabilities of this converter, especially since it's part of a public API:
class EDTFDateConverter(BaseDateConverter): + """ + Converter for Extended Date/Time Format (EDTF) dates. + + Handles parsing and serialization of dates in EDTF format, supporting both + single dates and date intervals with various levels of precision. + """ #: converter name: EDTF name: str = "EDTF"
Line range hint
23-33
: Enhance error message with input value context.The error handling is good, but the error message could be more helpful by including the problematic input value.
Consider this enhancement:
try: parsetree = edtf_parser.parse(value) return self.transformer.transform(parsetree) except UnexpectedCharacters as err: - raise ValueError("Parsing failed due to UnexpectedCharacters: %s" % err) + raise ValueError(f"Failed to parse '{value}' as EDTF: {err}")
Line range hint
50-54
: Consider documenting the open interval vs unknown date distinction.The comment raises an important distinction between open intervals and unknown dates, but this behavior should be documented for API users.
Consider adding this information to the docstring:
def to_string(self, undate: Union[Undate, UndateInterval]) -> str: """ Convert an :class:`~undate.undate.Undate` or :class:`~undate.undate.UndateInterval` to EDTF format. + + Note: + For intervals, there's a distinction between open intervals (represented + as "..") and unknown dates (represented as empty string) in the EDTF + specification. """
Line range hint
71-71
: Track TODOs in issue tracker.There are multiple TODOs about handling uncertain/approximate dates. These should be tracked in the issue system for better visibility and follow-up.
Would you like me to create GitHub issues to track these TODOs for uncertain/approximate date handling? This would help ensure these improvements aren't forgotten.
Also applies to: 77-77, 84-84
src/undate/converters/base.py (6)
1-22
: Consider enhancing documentation with usage examples.The module documentation is comprehensive and well-structured. To make it even more helpful, consider adding a simple example demonstrating how parsing and conversion work together in a concrete use case.
""" :class:`undate.converters.BaseDateConverter` provides a base class for implementing date converters, which can provide support for parsing and generating dates in different formats and also converting dates between different calendars. To add support for a new date format or calendar conversion: - Create a new file under ``undate/converters/`` - For converters with sufficient complexity, you may want to create a submodule; see ``undate.converters.edtf`` for an example. - Extend ``BaseDateConverter`` and implement ``parse`` and ``to_string`` methods as desired/appropriate for your converter - Add unit tests for the new converter in ``tests/test_converters/`` - Optionally, you may want to create a notebook to demonstrate the use and value of the new converter. The new subclass should be loaded automatically and included in the converters returned by :meth:`BaseDateConverter.available_converters` +Example: + >>> from undate.converters import MyDateConverter + >>> converter = MyDateConverter() + >>> # Parse a date string into an Undate object + >>> undate_obj = converter.parse("2024-01-01") + >>> # Convert back to string representation + >>> date_str = converter.to_string(undate_obj) ------------------- """
33-39
: Consider making the class and name attribute abstract.Since this is a base class that requires subclasses to implement specific behavior, consider using
abc.ABC
and making thename
attribute abstract to enforce implementation by subclasses.+from abc import ABC, abstractmethod -class BaseDateConverter: +class BaseDateConverter(ABC): """Base class for parsing, formatting, and converting dates to handle specific formats and different calendars.""" #: Converter name. Subclasses must define a unique name. - name: str = "Base Converter" + @property + @abstractmethod + def name(self) -> str: + """Return the unique name of the converter.""" + pass
40-48
: Consider using string literal type hints to avoid circular imports.The comment about circular imports can be addressed by using string literal type hints, which are evaluated at runtime.
- def parse(self, value: str): + @abstractmethod + def parse(self, value: str) -> 'Union[Undate, UndateInterval]': """ Parse a string and return an :class:`~undate.undate.Undate` or :class:`~undate.undate.UndateInterval`. Must be implemented by subclasses. """ - # can't add type hint here because of circular import - # should return an undate or undate interval raise NotImplementedError
50-59
: Consider using string literal type hints for the parameter.Similar to the parse method, use string literal type hints for the parameter type to avoid circular import issues.
- def to_string(self, undate) -> str: + @abstractmethod + def to_string(self, undate: 'Union[Undate, UndateInterval]') -> str: """ Convert an :class:`~undate.undate.Undate` or :class:`~undate.undate.UndateInterval` to string. Must be implemented by subclasses. """ - # undate param should be of type Union[Undate, UndateInterval] but can't add type hint here because of circular import - # convert an undate or interval to string representation for this format raise NotImplementedError
62-87
: Rename unused loop variables.The loop has unused variables that should be prefixed with underscore as per Python conventions.
- for importer, modname, ispkg in pkgutil.iter_modules( + for _importer, modname, _ispkg in pkgutil.iter_modules( converter_path, converter_prefix ):🧰 Tools
🪛 Ruff
78-78: Loop control variable
importer
not used within loop bodyRename unused
importer
to_importer
(B007)
78-78: Loop control variable
ispkg
not used within loop bodyRename unused
ispkg
to_ispkg
(B007)
88-95
: Consider using TypeVar to improve type safety.Instead of using
# type: ignore
, consider using a TypeVar to properly type the subclasses.+from typing import TypeVar +T = TypeVar('T', bound='BaseDateConverter') @classmethod - def available_converters(cls) -> Dict[str, Type["BaseDateConverter"]]: + def available_converters(cls: Type[T]) -> Dict[str, Type[T]]: """ Dictionary of available converters keyed on name. """ # ensure undate converters are imported cls.import_converters() - return {c.name: c for c in cls.__subclasses__()} # type: ignore + return {c.name: c for c in cls.__subclasses__()}src/undate/converters/iso8601.py (1)
57-60
: Document ISO8601 format limitations in docstringThe implementation notes that ISO8601 doesn't support open-ended ranges, but this limitation isn't documented in the public API.
""" Convert an :class:`~undate.undate.Undate` or :class:`~undate.undate.UndateInterval` to ISO8601 string format. + + Note: + ISO8601 does not officially support open-ended ranges or uncertain dates. + Such values may not be strictly compliant with the standard. """src/undate/undate.py (4)
138-144
: Consider updating DEFAULT_FORMAT constant name.Since we're moving from formatters to converters, consider renaming
DEFAULT_FORMAT
toDEFAULT_CONVERTER
for consistency.- DEFAULT_FORMAT: str = "ISO8601" + DEFAULT_CONVERTER: str = "ISO8601" def __init__(self, ...): if converter is None: - converter_cls = BaseDateConverter.available_converters()[self.DEFAULT_FORMAT] + converter_cls = BaseDateConverter.available_converters()[self.DEFAULT_CONVERTER]
186-192
: Update error message to reflect converter terminology.The error message still uses "format" terminology. Consider updating it to align with the new converter terminology.
- raise ValueError(f"Unsupported format '{format}'") + raise ValueError(f"Unsupported converter '{format}'")
176-182
: Update parse method documentation and error message.The method implementation is correct, but the docstring and error message should be updated to reflect the converter terminology.
- """parse a string to an undate or undate interval using the specified format; - for now, only supports named converters""" + """parse a string to an undate or undate interval using the specified converter; + for now, only supports named converters""" converter_cls = BaseDateConverter.available_converters().get(format, None) if converter_cls: return converter_cls().parse(date_string) - raise ValueError(f"Unsupported format '{format}'") + raise ValueError(f"Unsupported converter '{format}'")
464-468
: Update UndateInterval format method documentation and error message.For consistency with the Undate class changes, update the docstring and error message to use converter terminology.
- """format this undate interval as a string using the specified format; - for now, only supports named converters""" + """format this undate interval as a string using the specified converter; + for now, only supports named converters""" converter_cls = BaseDateConverter.available_converters().get(format, None) if converter_cls: return converter_cls().to_string(self) - raise ValueError(f"Unsupported format '{format}'") + raise ValueError(f"Unsupported converter '{format}'")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (22)
docs/conf.py
(1 hunks)docs/index.rst
(1 hunks)docs/undate.rst
(0 hunks)docs/undate/converters.rst
(1 hunks)docs/undate/core.rst
(1 hunks)docs/undate/index.rst
(1 hunks)src/undate/converters/__init__.py
(1 hunks)src/undate/converters/base.py
(1 hunks)src/undate/converters/edtf/__init__.py
(1 hunks)src/undate/converters/edtf/converter.py
(2 hunks)src/undate/converters/iso8601.py
(3 hunks)src/undate/dateformat/__init__.py
(0 hunks)src/undate/dateformat/base.py
(0 hunks)src/undate/dateformat/edtf/__init__.py
(0 hunks)src/undate/undate.py
(7 hunks)tests/test_converters/edtf/test_edtf_parser.py
(1 hunks)tests/test_converters/edtf/test_edtf_transformer.py
(1 hunks)tests/test_converters/test_base.py
(1 hunks)tests/test_converters/test_edtf.py
(1 hunks)tests/test_converters/test_iso8601.py
(1 hunks)tests/test_dateformat/test_base.py
(0 hunks)tests/test_dateformat/test_edtf.py
(0 hunks)
💤 Files with no reviewable changes (6)
- docs/undate.rst
- src/undate/dateformat/init.py
- src/undate/dateformat/base.py
- src/undate/dateformat/edtf/init.py
- tests/test_dateformat/test_base.py
- tests/test_dateformat/test_edtf.py
✅ Files skipped from review due to trivial changes (10)
- docs/conf.py
- docs/index.rst
- docs/undate/converters.rst
- docs/undate/core.rst
- docs/undate/index.rst
- src/undate/converters/init.py
- src/undate/converters/edtf/init.py
- tests/test_converters/edtf/test_edtf_parser.py
- tests/test_converters/edtf/test_edtf_transformer.py
- tests/test_converters/test_iso8601.py
🧰 Additional context used
🪛 Ruff
src/undate/converters/base.py
78-78: Loop control variable importer
not used within loop body
Rename unused importer
to _importer
(B007)
78-78: Loop control variable ispkg
not used within loop body
Rename unused ispkg
to _ispkg
(B007)
🔇 Additional comments (13)
tests/test_converters/test_edtf.py (2)
1-6
: LGTM! Clean imports and class structure.
The imports are minimal and the test class follows pytest naming conventions.
7-16
: Verify support for partial dates without year.
The commented test case on line 15 suggests that parsing dates without years might be a desired feature. Let's verify if this is supported in the converter implementation.
Consider adding these test cases for better coverage:
# Edge cases for unknown components
assert EDTFDateConverter().parse("XXXX") == Undate("XXXX")
assert EDTFDateConverter().parse("20XX-XX") == Undate(20, "XX", "XX")
# Boundary year values
assert EDTFDateConverter().parse("0000") == Undate(0)
assert EDTFDateConverter().parse("9999") == Undate(9999)
tests/test_converters/test_base.py (1)
1-7
: LGTM! Clean test structure and appropriate imports.
The test file follows pytest conventions and includes necessary imports.
src/undate/converters/edtf/converter.py (2)
5-7
: LGTM! Import changes align with module renaming.
The updated imports correctly reflect the transition from formatters to converters.
43-46
: LGTM! Method documentation is clear and concise.
The docstring clearly describes the method's purpose and expected types.
src/undate/converters/base.py (1)
24-30
: LGTM!
The imports are well-organized and appropriate for the module's needs.
src/undate/converters/iso8601.py (3)
23-28
: LGTM! Documentation improvements are clear and helpful.
The docstring now clearly specifies the supported formats and return types.
Line range hint 1-92
: Verify test coverage for edge cases
The implementation handles various edge cases (unknown years, partial dates, intervals). Let's ensure adequate test coverage.
#!/bin/bash
# Check test coverage for edge cases
rg -l "test.*unknown.*year|test.*partial.*date|test.*interval" tests/test_converters/
29-31
: Address TODOs and type safety concerns
Several TODOs and type safety issues need attention:
- Handling of full ISO format dates with time components
- Invalid format handling
- Type errors in join operation
Let's check for existing time component handling in tests:
Would you like help implementing:
- Time component validation
- Format validation
- Type-safe string joining?
Also applies to: 91-92
src/undate/undate.py (4)
8-8
: LGTM: Import change aligns with module renaming.
The new import of BaseDateConverter
correctly reflects the transition from formatters to converters.
25-27
: LGTM: Class attribute properly renamed.
The attribute rename from formatter
to converter
is consistent with the module's new purpose.
44-45
: LGTM: Constructor parameter updated.
The parameter rename from formatter
to converter
maintains consistency with the class attribute change.
167-167
: LGTM: String conversion updated.
The __str__
method correctly uses the new converter interface.
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
resolves #100 and #86
Summary by CodeRabbit
New Features
undate
library, enhancing organization and accessibility.BaseDateConverter
class for date conversion functionality.EDTFDateConverter
andISO8601DateFormat
classes for specific date formats.Bug Fixes
Documentation
Tests
BaseDateConverter
andEDTFDateConverter
to validate functionality and error handling.