-
Notifications
You must be signed in to change notification settings - Fork 1
Implement UnInt and UnDelta for uncertain date durations #129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Based on previous experiments with uncertainties and portion libraries and feedback from @taylor-arnold
""" WalkthroughThe changes introduce two new classes, Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Undate
participant GregorianDateConverter
participant UnInt
participant UnDelta
User->>Undate: Call duration()
alt Month or year is unknown
Undate->>GregorianDateConverter: Query max_day() for all possible year/month combos
GregorianDateConverter-->>Undate: Return possible max days
Undate->>UnInt: Create uncertain integer range of days
Undate->>UnDelta: Wrap UnInt in uncertain timedelta
Undate-->>User: Return UnDelta (uncertain duration)
else Month and year are known
Undate->>GregorianDateConverter: Query max_day()
GregorianDateConverter-->>Undate: Return max day
Undate-->>User: Return Timedelta (certain duration)
end
Possibly related PRs
Poem
""" 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (5)
src/undate/date.py (3)
108-121
: Add return annotation & tighten param typing forUnInt.__sub__
__sub__
currently has no return annotation and acceptsobject
.
Consider:- def __sub__(self, other): + def __sub__(self, other: object) -> "UnInt":This keeps typing consistent with
__add__
and helps mypy/pyright catch misuse.
75-84
: Missing return-type on__contains__
__contains__
should be annotated as-> bool
for clarity and lint happiness.
153-158
: SimplifyUnDelta.__eq__
and reconsider equality semanticsStatic-analysis flagged that the method can be collapsed to
return other is self
.
If the intent is identity equality only, adopt the simpler form:- if other is self: - return True - return False + return other is selfHowever, you may want value-based equality (
self.days == other.days
) for testability and to avoid surprises (two freshly created identicalUnDelta
s are never equal right now).🧰 Tools
🪛 Ruff (0.8.2)
155-157: Return the condition
other is self
directlyReplace with
return other is self
(SIM103)
src/undate/converters/calendars/gregorian.py (2)
16-19
: Hard-coding leap/non-leap years is brittle
NON_LEAP_YEAR
andLEAP_YEAR
are used solely to feedcalendar.monthrange
.
You can avoid magic constants by callingcalendar.isleap(year)
or simply pass any year—monthrange
already handles leap logic correctly.Example:
from calendar import monthrange, isleap reference_year = year if year else (2020 if isleap(2020) else 2021) _, max_day = monthrange(reference_year, month)Alternatively, leave the constants but document why 2024/2022 were chosen.
42-45
: Consider returningUnInt
for unknown‐month max-day instead of scalarThere’s a TODO about returning an
IntervalRange
; withUnInt
now available, you could immediately returnUnInt(28, 31)
(or propagate that toUndate.duration
) for full consistency with the new uncertainty model.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
src/undate/converters/calendars/gregorian.py
(2 hunks)src/undate/date.py
(2 hunks)src/undate/undate.py
(3 hunks)tests/test_date.py
(2 hunks)tests/test_undate.py
(2 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
src/undate/date.py
155-157: Return the condition other is self
directly
Replace with return other is self
(SIM103)
🔇 Additional comments (20)
tests/test_undate.py (1)
407-424
: Great coverage for uncertain month/year durationsThese assertions verify both the returned type (
UnDelta
) and the exactUnInt
range, ensuring the new uncertainty logic behaves as designed. 👍tests/test_date.py (14)
2-13
: Appropriate imports for the new functionality.The imports correctly include pytest for testing and the necessary components from undate.date, including the newly implemented UnInt and UnDelta classes.
93-103
: Well-structured initialization tests for UnInt.The test cases verify that UnInt initializes properly with both positional and keyword arguments, covering a common use case for February days which can be either 28 or 29 days.
104-109
: Good validation testing for UnInt.The test properly verifies that the class validates its inputs by ensuring the lower value must be less than the upper value.
110-122
: Comprehensive containment testing.The test verifies the
in
operator behavior with integers, other UnInt instances, and non-numeric types, ensuring robust containment semantics.
123-126
: Iteration implementation test is clear.This test confirms that UnInt objects are iterable and yield all integers in their range, which is important for usability.
127-136
: Thorough arithmetic testing for addition.The test covers all scenarios: adding integers, adding other UnInt objects, and handling type errors appropriately.
137-151
: Comprehensive subtract operation tests with clear examples.The tests include detailed examples of subtraction between UnInt objects, explaining the expected behavior with comments. The error handling for invalid operations is also correctly tested.
153-166
: UnDelta initialization tests cover important cases.The test validates that UnDelta properly initializes with day ranges, stores them as UnInt, and handles the case of February which can have either 28 or 29 days in a leap year context.
167-172
: Test ensures proper handling of multiple values.This test confirms that UnDelta correctly handles multiple input values and puts them in the right order (min to max) when creating the internal UnInt object.
173-176
: Validation test for UnDelta.Good test to verify that UnDelta requires at least two values to represent uncertainty, which prevents creating a degenerate uncertain delta with just one value.
177-180
: String representation test.The test confirms that UnDelta has a useful string representation, which is important for debugging and inspecting values.
181-190
: Equality semantics test for uncertain deltas.The test correctly verifies that two separate but identical UnDelta instances are not considered equal, with a clear explanation in the comments. This is important for maintaining the semantics of uncertain values.
190-203
: Comprehensive less-than comparison tests.The tests cover comparisons between UnDelta and other UnDelta instances, as well as with Timedelta constants. The comments nicely explain the expected behavior in each case.
204-217
: Greater-than comparison tests are thorough.These tests mirror the less-than tests and verify the greater-than semantics work correctly, with clear explanations of the expected behavior for different comparison scenarios.
src/undate/undate.py (5)
23-23
: Updated import to include new uncertainty classes.The import has been properly updated to include UnDelta which is now used in the duration method.
442-449
: Clear method signature and documentation update.The return type annotation and docstring have been updated to accurately reflect that the method can now return either a Timedelta or an UnDelta, providing clear guidance for users of this method.
460-474
: Improved handling of unknown years for month durations.The code now handles both leap years and non-leap years when calculating month durations for dates with unknown years, which is more accurate than the previous approach.
475-493
: Robust implementation of uncertain duration calculation.This implementation appropriately collects all possible maximum day values for different month-year combinations and returns an UnDelta when there's uncertainty. The code elegantly handles the transition from the previous simpler logic to this more comprehensive approach.
501-502
: Consider calendar-specific maximum month handling.The FIXME comment about calendar dependency is correct. Different calendars have different maximum month lengths.
Could you implement a calendar-specific approach to determining the maximum month length, rather than using a fixed 31-day maximum? This would improve accuracy for different calendar systems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
tests/test_date.py (2)
123-136
: Consider avoiding Yoda conditions for improved readabilityThe tests for greater-than comparisons are thorough, but the code uses "Yoda conditions" (putting the literal first in comparisons) which reduces readability.
Consider rewriting these comparisons for improved readability:
- assert 13 > ten_twelve - assert not 12 > ten_twelve - assert not 9 > ten_twelve + assert ten_twelve < 13 + assert not ten_twelve < 12 + assert not ten_twelve < 9🧰 Tools
🪛 Ruff (0.8.2)
126-126: Yoda condition detected
Rewrite as
ten_twelve < 13
(SIM300)
127-127: Yoda condition detected
Rewrite as
ten_twelve < 12
(SIM300)
128-128: Yoda condition detected
Rewrite as
ten_twelve < 9
(SIM300)
135-135: Pointless comparison. Did you mean to assign a value? Otherwise, prepend
assert
or remove it.(B015)
137-150
: Consider avoiding Yoda conditions for improved readabilitySimilar to the previous comment, the less-than comparison tests use Yoda conditions.
Consider rewriting these comparisons:
- assert 9 < ten_twelve - assert not 12 < ten_twelve - assert not 13 < ten_twelve + assert ten_twelve > 9 + assert not ten_twelve > 12 + assert not ten_twelve > 13🧰 Tools
🪛 Ruff (0.8.2)
140-140: Yoda condition detected
Rewrite as
ten_twelve > 9
(SIM300)
141-141: Yoda condition detected
Rewrite as
ten_twelve > 12
(SIM300)
142-142: Yoda condition detected
Rewrite as
ten_twelve > 13
(SIM300)
149-149: Pointless comparison. Did you mean to assign a value? Otherwise, prepend
assert
or remove it.(B015)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/undate/date.py
(2 hunks)tests/test_date.py
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/undate/date.py
🧰 Additional context used
🪛 Ruff (0.8.2)
tests/test_date.py
126-126: Yoda condition detected
Rewrite as ten_twelve < 13
(SIM300)
127-127: Yoda condition detected
Rewrite as ten_twelve < 12
(SIM300)
128-128: Yoda condition detected
Rewrite as ten_twelve < 9
(SIM300)
135-135: Pointless comparison. Did you mean to assign a value? Otherwise, prepend assert
or remove it.
(B015)
140-140: Yoda condition detected
Rewrite as ten_twelve > 9
(SIM300)
141-141: Yoda condition detected
Rewrite as ten_twelve > 12
(SIM300)
142-142: Yoda condition detected
Rewrite as ten_twelve > 13
(SIM300)
149-149: Pointless comparison. Did you mean to assign a value? Otherwise, prepend assert
or remove it.
(B015)
🔇 Additional comments (13)
tests/test_date.py (13)
2-13
: LGTM: Import statements appropriately updatedThe import statements have been appropriately updated to include pytest and the new classes/constants (
UnInt
andUnDelta
) that will be tested.
93-103
: LGTM: UnInt initialization tests are thoroughGood test coverage for UnInt initialization with both positional and keyword arguments. The tests verify the correct assignment of lower and upper bounds properties.
104-109
: LGTM: Good validation testingProper testing of the validation logic ensuring that the lower bound must be less than the upper bound, with appropriate error message verification.
110-122
: LGTM: Thorough containment testingGood test coverage for the containment operator, testing integers, UnInt instances, and other types.
151-154
: LGTM: Good iteration testingThe test verifies that UnInt can be iterated over and provides the correct range of values.
155-164
: LGTM: Thorough addition operation testingGood tests for addition with integers and UnInt instances, including proper error handling for unsupported types.
165-179
: LGTM: Well-documented subtraction testsExcellent tests for subtraction with integers and UnInt instances, with helpful comments explaining the logic and expected results.
182-200
: LGTM: Good initialization testing for UnDeltaThorough testing of UnDelta initialization, including handling of multiple values and out-of-order inputs. The note about portion interval comparison is valuable for future consideration.
201-204
: LGTM: Good validation testingProper testing of validation logic ensuring that at least two values are provided for UnDelta initialization.
205-208
: LGTM: String representation testGood test for the custom string representation of UnDelta.
209-217
: LGTM: Equality semantics are clearly testedGood tests for equality semantics, verifying that instances with identical ranges are not equal unless they're the same object.
218-231
: LGTM: Thorough less-than comparison testsComprehensive tests for less-than comparisons with both UnDelta instances and Timedelta constants. The note about future updates to Timedelta is valuable.
232-246
: LGTM: Thorough greater-than comparison testsComprehensive tests for greater-than comparisons with both UnDelta instances and Timedelta constants. The comments about ambiguous vs. unambiguous comparisons provide good context.
@taylor-arnold thanks for your feedback and thoughts on my previous attempts to implement uncertain durations (#116 #128). I've now implemented an I think it would be good to add a notebook showing how |
@taylor-arnold thanks also for your comment about use cases : #116 (comment) It reminded me of a project of mine that might work as a test case... At one point I was working on creating a dataset of the Hale/Eliot letters based on the digitized materials from the archives. Emily Hale's letters to Eliot were destroyed, but he references them sometimes so I was starting to build a shadow dataset of known letters based on those references. The dates are relative to his letters, I don't remember how precise, but might be something specific to experiment with. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/test_date.py (1)
110-122
: Consider expanding containment tests.The containment tests cover basic scenarios well, but could benefit from adding edge cases or more complex nested UnInt containment tests.
Consider adding tests for:
# Edge cases for UnInt containment assert UnInt(28, 31) not in UnInt(29, 30) # Larger range not in smaller assert UnInt(28, 31) not in UnInt(27, 32) # Overlap but not contained
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/test_date.py
(2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
tests/test_date.py (1)
src/undate/date.py (4)
Timedelta
(11-31)UnDelta
(125-169)UnInt
(35-121)days
(29-31)
🪛 Ruff (0.8.2)
tests/test_date.py
135-135: Pointless comparison. Did you mean to assign a value? Otherwise, prepend assert
or remove it.
(B015)
149-149: Pointless comparison. Did you mean to assign a value? Otherwise, prepend assert
or remove it.
(B015)
🔇 Additional comments (7)
tests/test_date.py (7)
2-13
: Well-organized import structure.The imports are cleanly organized and correctly include pytest as well as all the necessary classes and constants from undate.date for testing the new functionality.
93-109
: Comprehensive initialization tests for UnInt.The initialization tests thoroughly verify both positional and keyword argument initialization patterns. The validation test properly ensures that the constraint of lower < upper is enforced with a clear error message.
123-150
: Thorough comparison operator tests.The greater-than and less-than comparison tests are well-implemented with comprehensive test cases for both integer and UnInt comparisons. The error handling for unsupported types is also properly tested.
The static analysis hint about "pointless comparison" on lines 135 and 149 is a false positive since these lines are correctly using pytest.raises to test for exceptions.
🧰 Tools
🪛 Ruff (0.8.2)
135-135: Pointless comparison. Did you mean to assign a value? Otherwise, prepend
assert
or remove it.(B015)
149-149: Pointless comparison. Did you mean to assign a value? Otherwise, prepend
assert
or remove it.(B015)
151-179
: Good coverage of arithmetic operations.The tests for iteration, addition, and subtraction are comprehensive and include good explanatory comments, especially in the subtraction tests that clarify the expected behavior when handling uncertain ranges.
181-200
: Well-designed UnDelta initialization tests.The tests verify correct behavior for different initialization scenarios, including out-of-order arguments. The comment on lines 190-193 suggests potential concerns about the default comparison behavior.
The comment mentions "default portion interval comparison may not be what we want here". Consider clarifying whether this is a TODO item or if the current implementation satisfies requirements despite the comment.
201-217
: Good handling of equality semantics.The tests properly verify that UnDelta instances with identical values are not considered equal unless they're the same object. This is an important distinction for uncertain values where identity matters more than value equivalence.
218-246
: The comparison operator tests indicate future improvements needed.The tests for less-than and greater-than comparisons are thorough but contain comments about limitations in the current implementation regarding Timedelta comparisons.
The comments on lines 225-226 and 238-241 indicate that Timedelta needs to be updated to support reverse comparisons with UnDelta. Is there a plan to address this in a future PR? This limitation should be documented clearly if it's not going to be addressed immediately.
The tests correctly verify both unambiguous cases (where one range is clearly greater/less than another) and ambiguous cases (where ranges overlap).
if not self.lower < self.upper: | ||
raise ValueError( | ||
f"Lower value ({self.lower}) must be less than upper ({self.upper})" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to allow UnInts for exact values, meaning to allow an UnInt(1,1)
? I'm thinking of cases where maybe someone would iterate over an UnInt making the error margin smaller and smaller until it reaches an exact number. Not sure if that's a use case or if this would need to be allowed here, but I thought I raise the question :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for asking, this is why I need someone else thinking about it with me!
I think in that case we would want to return an actual integer but without a use case it's hard for me to think about how we would implement something like that....
Right now the only way the code implements an UnInt
is through the uncertain time delta UnDelta
object - there are more places I need to integrate it, so maybe as we use it more we'll have more clarity on what functionality we might need.
src/undate/undate.py
Outdated
return UnDelta(*possible_max_days) | ||
|
||
# otherwise, calculate timedelta normally | ||
max_day = list(possible_max_days)[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in case that if self.earliest.month is not None and self.latest.month is not None:
is not true, I think this here would throw an list index out of range error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for catching; I wonder if I got the nesting wrong... I'll try to figure out a unit test case to trigger that error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jdamerow tl;dr is that I got the nesting wrong.
The Date
object allows month to be None, but undate earliest/latest are always calculated as full-precision dates. Not sure if there's something we can do to improve this ambiguity on the Date object + precision.
I added some notes about ambiguous year duration, but want to handle that in a separate branch - it's tricky because it depends on the calendar!
Co-authored-by: Julia Damerow <[email protected]>
Added TODOs for ambiguous year duration, will be handled separately
Based on previous experiments with uncertainties and portion libraries and feedback from @taylor-arnold
Changes in this PR:
UnInt
dataclass for range of uncertain integer valuesUnDelta
dataclass for uncertain timedelta (datedelta); usesUnInt
for duration in daysUnDate
class to return uncertain delta for some durations (not yet used in all cases where it should be)Summary by CodeRabbit
New Features
Bug Fixes
Tests