Skip to content

Implement UnInt and UnDelta for uncertain date durations #129

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: develop
Choose a base branch
from

Conversation

rlskoeser
Copy link
Member

@rlskoeser rlskoeser commented May 9, 2025

Based on previous experiments with uncertainties and portion libraries and feedback from @taylor-arnold

Changes in this PR:

  • UnInt dataclass for range of uncertain integer values
  • UnDelta dataclass for uncertain timedelta (datedelta); uses UnInt for duration in days
  • update UnDate class to return uncertain delta for some durations (not yet used in all cases where it should be)
  • unit tests for associated classes and methods

Summary by CodeRabbit

  • New Features

    • Introduced support for uncertain durations and integer ranges, allowing representation of ambiguous date and time intervals.
    • Added new classes to handle uncertain integer ranges and durations.
  • Bug Fixes

    • Improved duration calculations for partially known or ambiguous months and years, now accurately reflecting leap year and month variability.
  • Tests

    • Added comprehensive tests for uncertain integer and duration classes.
    • Enhanced tests for partially known date durations to verify correct handling of uncertainty.

rlskoeser added 2 commits May 9, 2025 09:55
Based on previous experiments with uncertainties and portion libraries
and feedback from @taylor-arnold
Copy link
Contributor

coderabbitai bot commented May 9, 2025

Walkthrough

The changes introduce two new classes, UnInt and UnDelta, to represent uncertain integer ranges and durations, respectively. The Undate.duration method is updated to return these types when the duration is ambiguous, particularly for unknown months or years. Tests are added and updated to verify the new behavior and types, and minor clarifications are made to calendar-related constants.

Changes

File(s) Change Summary
src/undate/converters/calendars/gregorian.py Added LEAP_YEAR class attribute and clarified the docstring for NON_LEAP_YEAR in GregorianDateConverter. Added a comment in max_day about possible future changes.
src/undate/date.py Introduced new classes UnInt (for uncertain integer ranges) and UnDelta (for uncertain timedeltas), with arithmetic, comparison, and containment operations. No changes to existing classes or functions.
src/undate/undate.py Modified Undate.duration to return Timedelta or UnDelta based on ambiguity in months/years, aggregating possible durations for unknown components and returning an uncertain range when needed. Expanded docstring to clarify return types.
tests/test_date.py Added comprehensive tests for UnInt and UnDelta, including initialization, arithmetic, containment, comparison, and error handling. Updated imports accordingly. No changes to existing tests.
tests/test_undate.py Expanded test_partiallyknown_duration to check for UnDelta and UnInt types when month or year is unknown, verifying correct uncertain ranges for days in month. No changes to other tests or signatures.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Undate
    participant GregorianDateConverter
    participant UnDelta
    participant UnInt

    User->>Undate: Call duration()
    alt Month or year is unknown
        Undate->>GregorianDateConverter: Query max_day() for all possible years/months
        GregorianDateConverter-->>Undate: Return possible max days
        Undate->>UnInt: Create range of possible days
        Undate->>UnDelta: Wrap UnInt in UnDelta
        Undate-->>User: Return UnDelta
    else Month and year are known
        Undate->>GregorianDateConverter: Query max_day()
        GregorianDateConverter-->>Undate: Return max day
        Undate-->>User: Return Timedelta
    end
Loading

Poem

In fields of time, uncertainty grows,
With UnInt and UnDelta, the range now shows.
Months unknown, years unclear,
Rabbits hop through dates without fear!
Ambiguous days, yet tests all pass—
🐇 A leap for calendars, in Python class!

Tip

⚡️ Faster reviews with caching
  • CodeRabbit now supports caching for code and dependencies, helping speed up reviews. This means quicker feedback, reduced wait times, and a smoother review experience overall. Cached data is encrypted and stored securely. This feature will be automatically enabled for all accounts on May 16th. To opt out, configure Review - Disable Cache at either the organization or repository level. If you prefer to disable all data retention across your organization, simply turn off the Data Retention setting under your Organization Settings.

Enjoy the performance boost—your workflow just got faster.

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
src/undate/date.py (3)

108-121: Add return annotation & tighten param typing for UnInt.__sub__

__sub__ currently has no return annotation and accepts object.
Consider:

-    def __sub__(self, other):
+    def __sub__(self, other: object) -> "UnInt":

This keeps typing consistent with __add__ and helps mypy/pyright catch misuse.


75-84: Missing return-type on __contains__

__contains__ should be annotated as -> bool for clarity and lint happiness.


153-158: Simplify UnDelta.__eq__ and reconsider equality semantics

Static-analysis flagged that the method can be collapsed to return other is self.
If the intent is identity equality only, adopt the simpler form:

-        if other is self:
-            return True
-        return False
+        return other is self

However, you may want value-based equality (self.days == other.days) for testability and to avoid surprises (two freshly created identical UnDeltas are never equal right now).

🧰 Tools
🪛 Ruff (0.8.2)

155-157: Return the condition other is self directly

Replace with return other is self

(SIM103)

src/undate/converters/calendars/gregorian.py (2)

16-19: Hard-coding leap/non-leap years is brittle

NON_LEAP_YEAR and LEAP_YEAR are used solely to feed calendar.monthrange.
You can avoid magic constants by calling calendar.isleap(year) or simply pass any year—monthrange already handles leap logic correctly.

Example:

from calendar import monthrange, isleap

reference_year = year if year else (2020 if isleap(2020) else 2021)
_, max_day = monthrange(reference_year, month)

Alternatively, leave the constants but document why 2024/2022 were chosen.


42-45: Consider returning UnInt for unknown‐month max-day instead of scalar

There’s a TODO about returning an IntervalRange; with UnInt now available, you could immediately return UnInt(28, 31) (or propagate that to Undate.duration) for full consistency with the new uncertainty model.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e664a81 and 4e5b243.

📒 Files selected for processing (5)
  • src/undate/converters/calendars/gregorian.py (2 hunks)
  • src/undate/date.py (2 hunks)
  • src/undate/undate.py (3 hunks)
  • tests/test_date.py (2 hunks)
  • tests/test_undate.py (2 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
src/undate/date.py

155-157: Return the condition other is self directly

Replace with return other is self

(SIM103)

🔇 Additional comments (20)
tests/test_undate.py (1)

407-424: Great coverage for uncertain month/year durations

These assertions verify both the returned type (UnDelta) and the exact UnInt range, ensuring the new uncertainty logic behaves as designed. 👍

tests/test_date.py (14)

2-13: Appropriate imports for the new functionality.

The imports correctly include pytest for testing and the necessary components from undate.date, including the newly implemented UnInt and UnDelta classes.


93-103: Well-structured initialization tests for UnInt.

The test cases verify that UnInt initializes properly with both positional and keyword arguments, covering a common use case for February days which can be either 28 or 29 days.


104-109: Good validation testing for UnInt.

The test properly verifies that the class validates its inputs by ensuring the lower value must be less than the upper value.


110-122: Comprehensive containment testing.

The test verifies the in operator behavior with integers, other UnInt instances, and non-numeric types, ensuring robust containment semantics.


123-126: Iteration implementation test is clear.

This test confirms that UnInt objects are iterable and yield all integers in their range, which is important for usability.


127-136: Thorough arithmetic testing for addition.

The test covers all scenarios: adding integers, adding other UnInt objects, and handling type errors appropriately.


137-151: Comprehensive subtract operation tests with clear examples.

The tests include detailed examples of subtraction between UnInt objects, explaining the expected behavior with comments. The error handling for invalid operations is also correctly tested.


153-166: UnDelta initialization tests cover important cases.

The test validates that UnDelta properly initializes with day ranges, stores them as UnInt, and handles the case of February which can have either 28 or 29 days in a leap year context.


167-172: Test ensures proper handling of multiple values.

This test confirms that UnDelta correctly handles multiple input values and puts them in the right order (min to max) when creating the internal UnInt object.


173-176: Validation test for UnDelta.

Good test to verify that UnDelta requires at least two values to represent uncertainty, which prevents creating a degenerate uncertain delta with just one value.


177-180: String representation test.

The test confirms that UnDelta has a useful string representation, which is important for debugging and inspecting values.


181-190: Equality semantics test for uncertain deltas.

The test correctly verifies that two separate but identical UnDelta instances are not considered equal, with a clear explanation in the comments. This is important for maintaining the semantics of uncertain values.


190-203: Comprehensive less-than comparison tests.

The tests cover comparisons between UnDelta and other UnDelta instances, as well as with Timedelta constants. The comments nicely explain the expected behavior in each case.


204-217: Greater-than comparison tests are thorough.

These tests mirror the less-than tests and verify the greater-than semantics work correctly, with clear explanations of the expected behavior for different comparison scenarios.

src/undate/undate.py (5)

23-23: Updated import to include new uncertainty classes.

The import has been properly updated to include UnDelta which is now used in the duration method.


442-449: Clear method signature and documentation update.

The return type annotation and docstring have been updated to accurately reflect that the method can now return either a Timedelta or an UnDelta, providing clear guidance for users of this method.


460-474: Improved handling of unknown years for month durations.

The code now handles both leap years and non-leap years when calculating month durations for dates with unknown years, which is more accurate than the previous approach.


475-493: Robust implementation of uncertain duration calculation.

This implementation appropriately collects all possible maximum day values for different month-year combinations and returns an UnDelta when there's uncertainty. The code elegantly handles the transition from the previous simpler logic to this more comprehensive approach.


501-502: Consider calendar-specific maximum month handling.

The FIXME comment about calendar dependency is correct. Different calendars have different maximum month lengths.

Could you implement a calendar-specific approach to determining the maximum month length, rather than using a fixed 31-day maximum? This would improve accuracy for different calendar systems.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
tests/test_date.py (2)

123-136: Consider avoiding Yoda conditions for improved readability

The tests for greater-than comparisons are thorough, but the code uses "Yoda conditions" (putting the literal first in comparisons) which reduces readability.

Consider rewriting these comparisons for improved readability:

-        assert 13 > ten_twelve
-        assert not 12 > ten_twelve
-        assert not 9 > ten_twelve
+        assert ten_twelve < 13
+        assert not ten_twelve < 12
+        assert not ten_twelve < 9
🧰 Tools
🪛 Ruff (0.8.2)

126-126: Yoda condition detected

Rewrite as ten_twelve < 13

(SIM300)


127-127: Yoda condition detected

Rewrite as ten_twelve < 12

(SIM300)


128-128: Yoda condition detected

Rewrite as ten_twelve < 9

(SIM300)


135-135: Pointless comparison. Did you mean to assign a value? Otherwise, prepend assert or remove it.

(B015)


137-150: Consider avoiding Yoda conditions for improved readability

Similar to the previous comment, the less-than comparison tests use Yoda conditions.

Consider rewriting these comparisons:

-        assert 9 < ten_twelve
-        assert not 12 < ten_twelve
-        assert not 13 < ten_twelve
+        assert ten_twelve > 9
+        assert not ten_twelve > 12
+        assert not ten_twelve > 13
🧰 Tools
🪛 Ruff (0.8.2)

140-140: Yoda condition detected

Rewrite as ten_twelve > 9

(SIM300)


141-141: Yoda condition detected

Rewrite as ten_twelve > 12

(SIM300)


142-142: Yoda condition detected

Rewrite as ten_twelve > 13

(SIM300)


149-149: Pointless comparison. Did you mean to assign a value? Otherwise, prepend assert or remove it.

(B015)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4e5b243 and 20a9f8d.

📒 Files selected for processing (2)
  • src/undate/date.py (2 hunks)
  • tests/test_date.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/undate/date.py
🧰 Additional context used
🪛 Ruff (0.8.2)
tests/test_date.py

126-126: Yoda condition detected

Rewrite as ten_twelve < 13

(SIM300)


127-127: Yoda condition detected

Rewrite as ten_twelve < 12

(SIM300)


128-128: Yoda condition detected

Rewrite as ten_twelve < 9

(SIM300)


135-135: Pointless comparison. Did you mean to assign a value? Otherwise, prepend assert or remove it.

(B015)


140-140: Yoda condition detected

Rewrite as ten_twelve > 9

(SIM300)


141-141: Yoda condition detected

Rewrite as ten_twelve > 12

(SIM300)


142-142: Yoda condition detected

Rewrite as ten_twelve > 13

(SIM300)


149-149: Pointless comparison. Did you mean to assign a value? Otherwise, prepend assert or remove it.

(B015)

🔇 Additional comments (13)
tests/test_date.py (13)

2-13: LGTM: Import statements appropriately updated

The import statements have been appropriately updated to include pytest and the new classes/constants (UnInt and UnDelta) that will be tested.


93-103: LGTM: UnInt initialization tests are thorough

Good test coverage for UnInt initialization with both positional and keyword arguments. The tests verify the correct assignment of lower and upper bounds properties.


104-109: LGTM: Good validation testing

Proper testing of the validation logic ensuring that the lower bound must be less than the upper bound, with appropriate error message verification.


110-122: LGTM: Thorough containment testing

Good test coverage for the containment operator, testing integers, UnInt instances, and other types.


151-154: LGTM: Good iteration testing

The test verifies that UnInt can be iterated over and provides the correct range of values.


155-164: LGTM: Thorough addition operation testing

Good tests for addition with integers and UnInt instances, including proper error handling for unsupported types.


165-179: LGTM: Well-documented subtraction tests

Excellent tests for subtraction with integers and UnInt instances, with helpful comments explaining the logic and expected results.


182-200: LGTM: Good initialization testing for UnDelta

Thorough testing of UnDelta initialization, including handling of multiple values and out-of-order inputs. The note about portion interval comparison is valuable for future consideration.


201-204: LGTM: Good validation testing

Proper testing of validation logic ensuring that at least two values are provided for UnDelta initialization.


205-208: LGTM: String representation test

Good test for the custom string representation of UnDelta.


209-217: LGTM: Equality semantics are clearly tested

Good tests for equality semantics, verifying that instances with identical ranges are not equal unless they're the same object.


218-231: LGTM: Thorough less-than comparison tests

Comprehensive tests for less-than comparisons with both UnDelta instances and Timedelta constants. The note about future updates to Timedelta is valuable.


232-246: LGTM: Thorough greater-than comparison tests

Comprehensive tests for greater-than comparisons with both UnDelta instances and Timedelta constants. The comments about ambiguous vs. unambiguous comparisons provide good context.

@rlskoeser
Copy link
Member Author

rlskoeser commented May 9, 2025

@taylor-arnold thanks for your feedback and thoughts on my previous attempts to implement uncertain durations (#116 #128). I've now implemented an UnInt and a revised UnDelta with support for basic comparisons (>,<) and basic math (addition, subtraction). It's not yet everywhere in undate that we should have ambiguous durations, but I'm hopeful it's a good start. Would appreciate your input whenever you get a chance to look.

I think it would be good to add a notebook showing how UnInt and UnDelta objects operate and what the assumptions are, but hoping to get a basic version merged in first.

@rlskoeser
Copy link
Member Author

@taylor-arnold thanks also for your comment about use cases : #116 (comment)

It reminded me of a project of mine that might work as a test case... At one point I was working on creating a dataset of the Hale/Eliot letters based on the digitized materials from the archives. Emily Hale's letters to Eliot were destroyed, but he references them sometimes so I was starting to build a shadow dataset of known letters based on those references. The dates are relative to his letters, I don't remember how precise, but might be something specific to experiment with.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant