Add support for converting from Hijri calendar to undate and undate interval #107
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
from undate.converters.calendars.gregorian import GregorianDateConverter | ||
from undate.converters.calendars.hijri import HijriDateConverter | ||
from undate.converters.calendars.hebrew import HebrewDateConverter | ||
|
||
__all__ = ["HijriDateConverter", "GregorianDateConverter", "HebrewDateConverter"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
from calendar import monthrange | ||
|
||
from undate.converters.base import BaseCalendarConverter | ||
|
||
|
||
class GregorianDateConverter(BaseCalendarConverter): | ||
""" | ||
Calendar converter class for Gregorian calendar. | ||
""" | ||
|
||
#: converter name: Gregorian | ||
name: str = "Gregorian" | ||
#: calendar | ||
calendar_name: str = "Gregorian" | ||
|
||
#: known non-leap year | ||
NON_LEAP_YEAR: int = 2022 | ||
|
||
def min_month(self) -> int: | ||
"""First month for the Gregorian calendar.""" | ||
return 1 | ||
|
||
def max_month(self, year: int) -> int: | ||
"""maximum numeric month for the specified year in the Gregorian calendar""" | ||
return 12 | ||
|
||
def max_day(self, year: int, month: int) -> int: | ||
"""maximum numeric day for the specified year and month in this calendar""" | ||
# if month is known, use that to calculate | ||
if month: | ||
# if year is known, use it; otherwise use a known non-leap year | ||
# (only matters for February) | ||
year = year or self.NON_LEAP_YEAR | ||
|
||
# Use monthrange from python builtin calendar module. | ||
# returns first day of the month and number of days in the month | ||
# for the specified year and month. | ||
_, max_day = monthrange(year, month) | ||
else: | ||
# if month is unknown (whether or not year is known), return maximum possible | ||
max_day = 31 | ||
|
||
return max_day | ||
|
||
Comment on lines +27 to +44
💡 Codebase verification
Missing test coverage for Gregorian calendar implementation
The codebase lacks dedicated test files for the Gregorian calendar implementation. While other calendars (Hebrew, Hijri) have comprehensive test suites, there are no tests specifically covering the Gregorian calendar's
🔗 Analysis chain
Add test coverage for max_day edge cases
The fallback case where both year and month are unknown (line 41) is not covered by tests. Let's verify the test coverage:
Would you like me to help generate test cases for this scenario?
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for tests of max_day method
rg -A 5 "def test.*max_day" tests/
Length of output: 36
Script:
#!/bin/bash
# Let's first check if the tests directory exists and find test files
fd "test.*\.py$" tests/
# Then let's search for any test files related to gregorian calendar
fd ".*gregorian.*\.py$" tests/
# Finally, let's look for any test class or function containing 'max_day' or 'maxday' case-insensitive
rg -i "max.?day" tests/
Length of output: 1040
Script:
#!/bin/bash
# Let's check if there are any test files specifically for gregorian calendar
fd "test_gregorian.*\.py$" tests/
# Let's also check the test_base.py as it might contain base calendar tests
cat tests/test_converters/test_base.py
# Let's also look for any calendar test base classes that might contain shared tests
rg -l "class.*TestCase" tests/test_converters/
Length of output: 2391
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 41-41: src/undate/converters/calendars/gregorian.py#L41 |
||
def to_gregorian(self, year, month, day) -> tuple[int, int, int]: | ||
"""Convert to Gregorian date. This returns the specified year, month, | ||
and day unchanged, but is provided for consistency since all calendar | ||
converters need to support conversion to Gregorian calendar for | ||
a common point of comparison. | ||
""" | ||
return (year, month, day) |
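The review comment above points out that the fallback branch of `max_day` (month unknown) is untested. Below is a minimal standalone sketch of that logic, with the edge cases a test might assert. `gregorian_max_day` is a hypothetical free function that mirrors the method for illustration; it is not part of undate.

```python
from calendar import monthrange
from typing import Optional

# Known non-leap year, mirroring NON_LEAP_YEAR in the converter above.
NON_LEAP_YEAR = 2022


def gregorian_max_day(year: Optional[int], month: Optional[int]) -> int:
    """Hypothetical standalone version of GregorianDateConverter.max_day."""
    if month:
        # Fall back to a non-leap year when the year is unknown
        # (this only changes the answer for February).
        year = year or NON_LEAP_YEAR
        # monthrange returns (weekday of first day, number of days in month).
        _, max_day = monthrange(year, month)
        return max_day
    # Month unknown: return the largest day any month can have.
    return 31


# Edge cases worth asserting in a test suite:
assert gregorian_max_day(2024, 2) == 29  # leap-year February
assert gregorian_max_day(None, 2) == 28  # unknown year, non-leap fallback
assert gregorian_max_day(2022, None) == 31  # month unknown
assert gregorian_max_day(None, None) == 31  # year and month unknown
```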
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
from undate.converters.calendars.hebrew.converter import HebrewDateConverter | ||
|
||
__all__ = ["HebrewDateConverter"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
from typing import Union | ||
|
||
from convertdate import hebrew # type: ignore | ||
from lark.exceptions import UnexpectedCharacters | ||
|
||
from undate.converters.base import BaseCalendarConverter | ||
from undate.converters.calendars.hebrew.parser import hebrew_parser | ||
from undate.converters.calendars.hebrew.transformer import HebrewDateTransformer | ||
from undate.undate import Undate, UndateInterval | ||
|
||
|
||
class HebrewDateConverter(BaseCalendarConverter): | ||
""" | ||
Converter for Hebrew Anno Mundi calendar. | ||
|
||
Support for parsing Anno Mundi dates and converting to Undate and UndateInterval | ||
objects in the Gregorian calendar. | ||
""" | ||
|
||
#: converter name: Hebrew | ||
name: str = "Hebrew" | ||
calendar_name: str = "Anno Mundi" | ||
|
||
def __init__(self): | ||
self.transformer = HebrewDateTransformer() | ||
|
||
def min_month(self) -> int: | ||
"""Smallest numeric month for this calendar.""" | ||
return 1 | ||
|
||
def max_month(self, year: int) -> int: | ||
"""Maximum numeric month for this calendar. In Hebrew calendar, this is 12 or 13 | ||
depending on whether it is a leap year.""" | ||
return hebrew.year_months(year) | ||
|
||
def first_month(self) -> int: | ||
"""First month in this calendar. The Hebrew civil year starts in Tishri.""" | ||
return hebrew.TISHRI | ||
|
||
def last_month(self, year: int) -> int: | ||
"""Last month in this calendar. Hebrew civil year starts in Tishri, | ||
Elul is the month before Tishri.""" | ||
return hebrew.ELUL | ||
|
||
def max_day(self, year: int, month: int) -> int: | ||
"""maximum numeric day for the specified year and month in this calendar""" | ||
# NOTE: unreleased v2.4.1 of convertdate standardizes month_days to month_length | ||
return hebrew.month_days(year, month) | ||
|
||
def to_gregorian(self, year: int, month: int, day: int) -> tuple[int, int, int]: | ||
"""Convert a Hebrew date, specified by year, month, and day, | ||
to the Gregorian equivalent date. Returns a tuple of year, month, day. | ||
""" | ||
return hebrew.to_gregorian(year, month, day) | ||
|
||
def parse(self, value: str) -> Union[Undate, UndateInterval]: | ||
""" | ||
Parse a Hebrew date string and return an :class:`~undate.undate.Undate` or | ||
:class:`~undate.undate.UndateInterval`. | ||
The Hebrew date string is preserved in the undate label. | ||
""" | ||
if not value: | ||
raise ValueError("Parsing empty string is not supported") | ||
|
||
# parse the input string, then transform to undate object | ||
try: | ||
# parse the string with our Hebrew date parser | ||
parsetree = hebrew_parser.parse(value) | ||
# transform the parse tree into an undate or undate interval | ||
undate_obj = self.transformer.transform(parsetree) | ||
# set the original date as a label, with the calendar name | ||
undate_obj.label = f"{value} {self.calendar_name}" | ||
return undate_obj | ||
except UnexpectedCharacters as err: | ||
raise ValueError(f"Could not parse '{value}' as a Hebrew date") from err | ||
|
||
# do we need to support conversion the other direction? | ||
# i.e., generate a Hebrew date from an arbitrary undate or undate interval? |
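`max_month` returns 12 or 13 because Hebrew leap years add a second Adar. The sketch below shows the 19-year cycle rule behind that, written without `convertdate` so it runs standalone. `is_hebrew_leap_year` and `hebrew_year_months` are illustrative names, and the code assumes the standard arithmetic rule rather than quoting the library's implementation.

```python
def is_hebrew_leap_year(year: int) -> bool:
    """Leap years are years 3, 6, 8, 11, 14, 17 and 19 of each 19-year cycle."""
    return (7 * year + 1) % 19 < 7


def hebrew_year_months(year: int) -> int:
    """Months in the given Anno Mundi year: 13 in leap years, otherwise 12."""
    return 13 if is_hebrew_leap_year(year) else 12


print(hebrew_year_months(5784))  # leap year, with Adar I and Adar II
print(hebrew_year_months(5785))  # common year
```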
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
%import common.WS | ||
%ignore WS | ||
|
||
// only support day month year format for now | ||
// parser requires numeric day and year to be distinguished based on order | ||
hebrew_date: day month year | month year | year | ||
|
||
// TODO: handle date ranges? | ||
|
||
// TODO: add support for qualifiers? | ||
// PGP dates use qualifiers like "first decade of" (for beginning of month) | ||
// "first third of", seasons (can look for more examples) | ||
|
||
// Hebrew calendar starts with year 1 in 3761 BCE | ||
year: /\d+/ | ||
|
||
// months | ||
month: month_1 | ||
| month_2 | ||
| month_3 | ||
| month_4 | ||
| month_5 | ||
| month_6 | ||
| month_7 | ||
| month_8 | ||
| month_9 | ||
| month_10 | ||
| month_11 | ||
| month_12 | ||
| month_13 | ||
// months have 29 or 30 days; we do not expect leading zeroes | ||
day: /[1-9]/ | /[12][0-9]/ | /30/ | ||
Comment on lines +31 to +32
🛠️ Refactor suggestion
Enhance day pattern validation.
The current day pattern allows some invalid dates (e.g., "29" for months with only 28 days). Consider implementing month-specific day validation in the transformer:
-day: /[1-9]/ | /[12][0-9]/ | /30/
+day: /[1-9]/ | /[12][0-9]/ | /30/ // Validation in transformer
|
||
|
||
// months, in order; from convertdate list | ||
// with variants from Princeton Geniza Project | ||
// support matching with and without accents | ||
month_1: "Nisan" | ||
// Iyar or Iyyar | ||
month_2: /Iyy?ar/ | ||
month_3: "Sivan" | ||
month_4: "Tammuz" | ||
month_5: "Av" | ||
month_6: "Elul" | ||
// Tishrei or Tishri | ||
month_7: /Tishre?i/ | ||
month_8: "Heshvan" | ||
month_9: "Kislev" | ||
// Tevet or Teveth | ||
month_10: /[ṬT]eveth?/ | ||
month_11: "Shevat" | ||
// Adar I or Adar | ||
month_12: /Adar( I)?/ | ||
// Adar II or Adar Bet | ||
month_13: /Adar (II|Bet)/ | ||
|
||
|
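To see which spellings the month and day rules accept, the patterns can be tried outside the parser with Python's `re` module. This is only an approximation of the grammar for illustration: `MONTH_PATTERNS`, `DAY_PATTERN`, `month_number` and `is_valid_day` are hypothetical names, and plain string terminals are treated as regular expressions.

```python
import re
from typing import Optional

# Month patterns copied from the grammar above, keyed by month number.
MONTH_PATTERNS = {
    1: r"Nisan",
    2: r"Iyy?ar",
    3: r"Sivan",
    4: r"Tammuz",
    5: r"Av",
    6: r"Elul",
    7: r"Tishre?i",
    8: r"Heshvan",
    9: r"Kislev",
    10: r"[ṬT]eveth?",
    11: r"Shevat",
    12: r"Adar( I)?",
    13: r"Adar (II|Bet)",
}
# The grammar's three day alternatives combined: 1-30, no leading zero.
DAY_PATTERN = r"[1-9]|[12][0-9]|30"


def month_number(name: str) -> Optional[int]:
    """Return the number of the month whose pattern matches the whole name."""
    for number, pattern in MONTH_PATTERNS.items():
        if re.fullmatch(pattern, name):
            return number
    return None


def is_valid_day(text: str) -> bool:
    """True when text is a day the grammar's day rule would accept."""
    return re.fullmatch(DAY_PATTERN, text) is not None


print(month_number("Iyyar"), month_number("Adar"), month_number("Adar II"))
print(is_valid_day("30"), is_valid_day("31"), is_valid_day("07"))
```

Because each pattern must match the whole name, "Adar I" resolves to month 12 and "Adar II" to month 13, with no overlap between the two rules.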
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
import pathlib | ||
|
||
from lark import Lark | ||
|
||
grammar_path = pathlib.Path(__file__).parent / "hebrew.lark" | ||
|
||
with open(grammar_path) as grammar: | ||
# NOTE: LALR parser is faster but can't be used due to ambiguity between years and days | ||
hebrew_parser = Lark(grammar.read(), start="hebrew_date", strict=True) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
from lark import Transformer, Tree | ||
|
||
from undate.undate import Undate, Calendar | ||
|
||
|
||
class HebrewUndate(Undate): | ||
"""Undate convenience subclass; sets default calendar to Hebrew.""" | ||
|
||
calendar = Calendar.HEBREW | ||
|
||
|
||
class HebrewDateTransformer(Transformer): | ||
"""Transform a Hebrew date parse tree and return an Undate or | ||
UndateInterval.""" | ||
|
||
def hebrew_date(self, items): | ||
parts = {} | ||
for child in items: | ||
if child.data in ["year", "month", "day"]: | ||
# in each case we expect one integer value; | ||
# anonymous tokens convert to their value and cast as int | ||
value = int(child.children[0]) | ||
parts[str(child.data)] = value | ||
|
||
# initialize and return an undate with Hebrew year, month, day and | ||
# Hebrew calendar | ||
return HebrewUndate(**parts) | ||
|
||
# year translation is not needed since we want a tree with name year | ||
# this is equivalent to a no-op | ||
# def year(self, items): | ||
# return Tree(data="year", children=[items[0]]) | ||
|
||
def month(self, items): | ||
# month has a nested tree for the rule and the value | ||
# the name of the rule (month_1, month_2, etc) gives us the | ||
# number of the month needed for converting the date | ||
tree = items[0] | ||
month_n = tree.data.split("_")[-1] | ||
return Tree(data="month", children=[month_n]) |
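The transformer recovers the month number from the grammar rule name (`month_1` through `month_13`) and then casts each part to an integer. The sketch below traces that flow without lark installed. The `Tree` class is a minimal stand-in for `lark.Tree`, and `month_from_rule` and `collect_parts` are hypothetical helpers that mirror the `month` and `hebrew_date` methods above.

```python
from typing import NamedTuple


class Tree(NamedTuple):
    """Minimal stand-in for lark.Tree: a rule name plus child values."""

    data: str
    children: list


def month_from_rule(tree: Tree) -> Tree:
    """Turn a month_N subtree into a month tree holding the number N."""
    month_n = tree.data.split("_")[-1]
    return Tree(data="month", children=[month_n])


def collect_parts(items: list) -> dict:
    """Gather integer year, month and day values, as hebrew_date does."""
    parts = {}
    for child in items:
        if child.data in ("year", "month", "day"):
            parts[child.data] = int(child.children[0])
    return parts


# "26 Tammuz 4816" after the month rule has been transformed:
items = [
    Tree("day", ["26"]),
    month_from_rule(Tree("month_4", ["Tammuz"])),
    Tree("year", ["4816"]),
]
print(collect_parts(items))  # {'day': 26, 'month': 4, 'year': 4816}
```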
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
from undate.converters.calendars.hijri.converter import HijriDateConverter | ||
|
||
__all__ = ["HijriDateConverter"] |
💡 Codebase verification
Testing against Python 3.13 alpha may cause unreliable CI results
The workflow is currently testing against Python 3.13, which is in alpha stage. The project has minimal version constraints (only for strenum < 3.11) but testing against pre-release Python versions can lead to:
Consider:
Example modification:
🔗 Analysis chain
Consider the stability implications of Python 3.13.
Python 3.13 is currently in alpha/beta stage. While testing against future Python versions is good practice, including it in the test matrix this early might lead to false failures due to pre-release issues.
Let's check the package compatibility:
Consider adding Python 3.13 to an optional test matrix or waiting until it reaches release candidate status.
Also applies to: 24-24
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
Length of output: 131
Script:
Length of output: 1373
I guess this is a GenAI artifact since its training cutoff date is prior to October 2024? Python 3.13 is not in alpha
@ColeDCrawford, you're absolutely right—Python 3.13 is not in alpha as of December 2024. Including it in the CI test matrix is appropriate. Thank you for pointing out the oversight.
✏️ Learnings added