Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support partially known dates #36

Merged
merged 14 commits into from
Oct 23, 2023
Merged

Conversation

rlskoeser
Copy link
Member

@rlskoeser rlskoeser commented Nov 10, 2022

ref #3 and #57

@rlskoeser rlskoeser marked this pull request as draft November 10, 2022 21:49
@rlskoeser rlskoeser changed the base branch from main to develop November 15, 2022 20:18
@codecov
Copy link

codecov bot commented Mar 9, 2023

Codecov Report

❗ No coverage uploaded for pull request base (develop@5ec72aa). Click here to learn what that means.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             develop      #36   +/-   ##
==========================================
  Coverage           ?   98.65%           
==========================================
  Files              ?        4           
  Lines              ?      223           
  Branches           ?        0           
==========================================
  Hits               ?      220           
  Misses             ?        3           
  Partials           ?        0           

@rlskoeser rlskoeser marked this pull request as ready for review March 9, 2023 02:07
@rlskoeser
Copy link
Member Author

@ColeDCrawford I think you and I worked on the early stages of this months ago. Thought maybe you our @jdamerow could look over the more recent changes so we could get this merged.

I've added logic to track known granularity of the date with the degrees of precision that we support now, and updated string method and duration for Undate objects with partially known dates. It occurs to me as I write this up that I didn't do anything about duration for UndateInterval — but I'm not sure there's a meaningful way to calculate it, so maybe it's just not implemented.

You'll see there are some TODOs — I think there are decisions still to be made that will be easier to make once we add support for more formats.

@rlskoeser
Copy link
Member Author

rlskoeser commented Jul 14, 2023

I've added an example notebook comparing duration calculation in undate and in my Shakespeare and Company Project based on the exported values (durations are calculated by the code), as I had proposed doing in #57

This was a helpful exercise because I found and corrected an error in the logic for dates with unknown years that wrap from one year to the next.

I also found there's a difference in logic between the two implementations, which is whether you count both the start and end date or only one of them (or half of each). It seems ok to me to leave it as is for now, but might be something that should be configurable.

You can review the notebook in this branch.

update: I enabled reviewnb for this repository, so the notebook can be reviewed here: https://app.reviewnb.com/dh-tech/undate-python/pull/36/

Copy link
Collaborator

@ColeDCrawford ColeDCrawford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to merge to me.

Should this be flagged as an issue for future follow-up?

I also found there's a difference in logic between the two implementations, which is whether you count both the start and end date or only one of them (or half of each). It seems ok to me to leave it as is for now, but might be something that should be configurable.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running through the notebook was great, thanks for adding that as an example!

sphinx_rtd_theme
m2r2
# pin sphinx because 7.0 currently not compatible with rtd theme
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like RTD is now: sphinx >=5,<8 so we could try to upgrade Sphinx at some point. Not a blocker for this though

@@ -136,4 +264,12 @@ def test_duration(self):
month_noyear_duration = UndateInterval(
Undate(None, 12, 1), Undate(None, 1, 1)
).duration()
assert month_noyear_duration.days == 31
assert month_noyear_duration.days == 32
# this seems wrong, but we currently count both start and dates
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like something we should create an issue to address?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

created #63

Comment on lines +21 to +24
# TODO: should not allow initializing year/day without month;
# should we infer unknown month? or raise an exception?
# assert str(Undate(2022, day="2X")) == "2022-XX-2X" # currently returns 2022-2X
# assert str(Undate(2022, day=7)) == "2022-XX-07" @ currently returns 2022-07
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see a situation with a corrupted text where we legitimately only know the year and day of the month, but we can't do much with the additional day info in this case. I don't think it should error; it's also unlikely we can infer the month in this situation. Is it possible to bump to the known granularity (year) for calculation purposes, and leave the day in a string rendering of the undate? Can't remember

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

created #64 based on your comment here

@rlskoeser
Copy link
Member Author

@ColeDCrawford thank you for the review and comments! Glad the notebook was helpful. I think you're right, we have at least two new issues coming out of this: how to calculate duration, how to handle missing information. I'll create them and ask for input on slack, and then I'll go ahead and merge this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants