date2num conversion wrong for (some) python datetimes #354

Closed
kmuehlbauer opened this issue Jan 8, 2025 · 8 comments

@kmuehlbauer

To report a non-security related issue, please provide:

  • the version of the software with which you are encountering an issue
    cftime v1.6.4

  • environmental information (i.e. Operating System, compiler info, java version, python version, etc.)

python: 3.12.4 | packaged by conda-forge | (main, Jun 17 2024, 10:23:07) [GCC 12.3.0]
python-bits: 64
OS: Linux
OS-release: 5.14.21-150500.55.83-default
machine: x86_64

  • a description of the issue with the steps needed to reproduce it

I'm trying to wrap my head around how cftime converts datetimes of different provenance to their numerical representation (via cftime.date2num). First, a working example with a Python datetime and cftime datetimes in the standard and proleptic Gregorian calendars. The output is as expected: two proleptic_gregorian values and one standard-calendar value. The roundtrip also works when the appropriate calendar is used.

import datetime  # needed for the astype(datetime.datetime) conversion below
import numpy as np
import cftime
# create dates
# 1 - python datetimes (proleptic_gregorian)
# 2 - cftime datetime standard-calendar
# 3 - cftime datetime proleptic_gregorian-calendar
dates1 = np.array(["0002"], dtype="datetime64[s]").astype("M8[us]").astype(datetime.datetime)
calendar1 = "standard"
calendar2 = "proleptic_gregorian"
dates2 = np.array([cftime.datetime(t0.year, t0.month, t0.day, calendar=calendar1) for t0 in dates1])
dates3 = np.array([cftime.datetime(t0.year, t0.month, t0.day, calendar=calendar2) for t0 in dates1])
print(dates1)
print(dates2)
print(dates3)
num1 = cftime.date2num(dates1, "seconds since 2000-01-01")
num2 = cftime.date2num(dates2, "seconds since 2000-01-01")
num3 = cftime.date2num(dates3, "seconds since 2000-01-01")
print(num1)
print(num2)
print(num3)
date1 = cftime.num2date(num1, "seconds since 2000-01-01", calendar2)
date2 = cftime.num2date(num2, "seconds since 2000-01-01", calendar1)
date3 = cftime.num2date(num3, "seconds since 2000-01-01", calendar2)
print(date1)
print(date2)
print(date3)

Output

[datetime.datetime(2, 1, 1, 0, 0)]
[cftime.datetime(2, 1, 1, 0, 0, 0, 0, calendar='standard', has_year_zero=False)]
[cftime.datetime(2, 1, 1, 0, 0, 0, 0, calendar='proleptic_gregorian', has_year_zero=True)]
[-63050745600]
[-63050918400]
[-63050745600]
[cftime.DatetimeProlepticGregorian(2, 1, 1, 0, 0, 0, 0, has_year_zero=True)]
[cftime.DatetimeGregorian(2, 1, 1, 0, 0, 0, 0, has_year_zero=False)]
[cftime.DatetimeProlepticGregorian(2, 1, 1, 0, 0, 0, 0, has_year_zero=True)]

Now, when the output calendar is passed explicitly, this no longer works: the roundtrip shifts the Python datetimes by two days.

import datetime  # needed for the astype(datetime.datetime) conversion below
import numpy as np
import cftime
# create dates
# 1 - python datetimes (proleptic_gregorian)
# 2 - cftime datetime standard-calendar
# 3 - cftime datetime proleptic_gregorian-calendar
dates1 = np.array(["0002"], dtype="datetime64[s]").astype("M8[us]").astype(datetime.datetime)
calendar1 = "standard"
calendar2 = "proleptic_gregorian"
dates2 = np.array([cftime.datetime(t0.year, t0.month, t0.day, calendar=calendar1) for t0 in dates1])
dates3 = np.array([cftime.datetime(t0.year, t0.month, t0.day, calendar=calendar2) for t0 in dates1])
print(dates1)
print(dates2)
print(dates3)
num1 = cftime.date2num(dates1, "seconds since 2000-01-01", calendar1)
num2 = cftime.date2num(dates2, "seconds since 2000-01-01", calendar1)
num3 = cftime.date2num(dates3, "seconds since 2000-01-01", calendar1)
print(num1)
print(num2)
print(num3)
date1 = cftime.num2date(num1, "seconds since 2000-01-01", calendar1)
date2 = cftime.num2date(num2, "seconds since 2000-01-01", calendar1)
date3 = cftime.num2date(num3, "seconds since 2000-01-01", calendar1)
print(date1)
print(date2)
print(date3)

Output

[datetime.datetime(2, 1, 1, 0, 0)]
[cftime.datetime(2, 1, 1, 0, 0, 0, 0, calendar='standard', has_year_zero=False)]
[cftime.datetime(2, 1, 1, 0, 0, 0, 0, calendar='proleptic_gregorian', has_year_zero=True)]
[-63050745600]
[-63050918400]
[-63050918400]
[cftime.DatetimeGregorian(2, 1, 3, 0, 0, 0, 0, has_year_zero=False)]
[cftime.DatetimeGregorian(2, 1, 1, 0, 0, 0, 0, has_year_zero=False)]
[cftime.DatetimeGregorian(2, 1, 1, 0, 0, 0, 0, has_year_zero=False)]

I'd expect cftime to convert the Python datetimes correctly to the requested output calendar. I did not find anything related in the documentation, so any pointers are welcome.

@jswhit
Collaborator

jswhit commented Jan 9, 2025

I believe this is the correct answer: the 'standard' calendar is Julian before the switch to Gregorian in 1582, and the two calendars are two days apart in year 2.
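
The two-day offset at year 2 can be checked independently of cftime. The following is a minimal sketch that uses the textbook Julian Day Number formulas for the Julian and Gregorian calendars; the helper names and the "julian"/"gregorian" labels are hypothetical and are not part of cftime. It reproduces the two values printed in the issue for "seconds since 2000-01-01".

```python
SECONDS_PER_DAY = 86400


def julian_day_number(year, month, day, calendar):
    """Return the Julian Day Number of a civil date.

    `calendar` is "julian" or "gregorian" (illustrative labels, not cftime names).
    """
    a = (14 - month) // 12           # 1 for January/February, else 0
    y = year + 4800 - a              # shifted year so the count starts in March
    m = month + 12 * a - 3           # March = 0, ..., February = 11
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4
    if calendar == "gregorian":
        # Gregorian rule: drop century leap days except every 400 years
        return jdn - y // 100 + y // 400 - 32045
    if calendar == "julian":
        return jdn - 32083
    raise ValueError(f"unknown calendar: {calendar!r}")


def seconds_since_2000(year, month, day, calendar):
    """Whole-day offset from 2000-01-01 (a Gregorian date) in seconds."""
    epoch = julian_day_number(2000, 1, 1, "gregorian")
    return (julian_day_number(year, month, day, calendar) - epoch) * SECONDS_PER_DAY


proleptic = seconds_since_2000(2, 1, 1, "gregorian")
standard = seconds_since_2000(2, 1, 1, "julian")
print(proleptic)                                  # -63050745600
print(standard)                                   # -63050918400
print((proleptic - standard) // SECONDS_PER_DAY)  # 2
```

The sketch only handles midnight dates and treats every pre-1582 'standard' date as Julian, which is all the example in this issue needs.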

@kmuehlbauer
Author

kmuehlbauer commented Jan 9, 2025

How do you explain the difference in the second example between the Python datetime.datetime (dates1) and the cftime.datetime (update: dates3), which both use the proleptic_gregorian calendar?

@jswhit
Collaborator

jswhit commented Jan 9, 2025

You will get the answer you expect if you change the calendar kwarg to 'calendar2' in

date1 = cftime.num2date(num1, "seconds since 2000-01-01", calendar1)

@jswhit
Collaborator

jswhit commented Jan 9, 2025

Ignore the previous comment: the calendar kwarg should override the calendar associated with the datetime instances.

@kmuehlbauer
Author

kmuehlbauer commented Jan 9, 2025

I think that when I pass a Python datetime.datetime, date2num just assumes "proleptic_gregorian" instead of using the requested "standard" calendar.

For cftime.datetime(..., calendar="proleptic_gregorian"), by contrast, this works as intended.

Update: a simpler example to make the above clearer:

import datetime
import cftime

dt1a = cftime.datetime(2, 1, 1, calendar="proleptic_gregorian")
num1 = cftime.date2num(dt1a, "seconds since 2000-01-01", "standard")
dt1b = cftime.num2date(num1, "seconds since 2000-01-01", "standard")
dt2a = datetime.datetime(2, 1, 1)
num2 = cftime.date2num(dt2a, "seconds since 2000-01-01", "standard")
dt2b = cftime.num2date(num2, "seconds since 2000-01-01", "standard")
print(dt1a, num1, dt1b)
print(dt2a, num2, dt2b)

Output

0002-01-01 00:00:00 -63050918400 0002-01-01 00:00:00
0002-01-01 00:00:00 -63050745600 0002-01-03 00:00:00
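
The 0002-01-03 result is what one gets when a count computed in the proleptic Gregorian calendar is decoded in the Julian calendar. The sketch below illustrates this with the standard library only. It assumes whole-day offsets and dates before the 1582 switch, and uses a commonly published integer algorithm for converting a Julian Day Number to a Julian calendar date. The function names and the `ORDINAL_TO_JDN` constant are introduced here for illustration and are not cftime API.

```python
import datetime

SECONDS_PER_DAY = 86400
# date.toordinal() counts proleptic Gregorian days with 0001-01-01 == 1;
# adding this constant gives the Julian Day Number of the same day.
ORDINAL_TO_JDN = 1721425


def jdn_to_julian_date(jdn):
    """Convert a Julian Day Number to (year, month, day) in the Julian calendar."""
    f = jdn + 1401
    e = 4 * f + 3
    g = (e % 1461) // 4
    h = 5 * g + 2
    day = (h % 153) // 5 + 1
    month = (h // 153 + 2) % 12 + 1
    year = e // 1461 - 4716 + (14 - month) // 12
    return year, month, day


def decode_as_julian(seconds, epoch=datetime.date(2000, 1, 1)):
    """Decode 'seconds since epoch' as a Julian calendar date (whole days only)."""
    days, remainder = divmod(seconds, SECONDS_PER_DAY)
    if remainder:
        raise ValueError("this sketch only handles whole-day offsets")
    return jdn_to_julian_date(epoch.toordinal() + ORDINAL_TO_JDN + days)


# Count produced for datetime.datetime(2, 1, 1), i.e. encoded as proleptic Gregorian
print(decode_as_julian(-63050745600))  # (2, 1, 3)
# Count produced for the cftime.datetime input, i.e. encoded in the 'standard' calendar
print(decode_as_julian(-63050918400))  # (2, 1, 1)
```

Both counts decode consistently, so the mismatch comes from how the Python datetime was encoded, not from the decoding step.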

@jswhit
Collaborator

jswhit commented Jan 9, 2025

Yes, I think you are correct. I think this is a bug, since it works as expected if you give it an array of cftime.datetime instances with the proleptic_gregorian calendar.

jswhit added a commit that referenced this issue Jan 9, 2025
jswhit added a commit that referenced this issue Jan 9, 2025
@jswhit
Collaborator

jswhit commented Jan 9, 2025

This should now be fixed in master with the merging of PR #355

@jswhit jswhit closed this as completed Jan 9, 2025
@kmuehlbauer
Author

Thanks @jswhit for the quick fix!
