Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing and unparsing leads to 1h time shift? #17

Open
lindig opened this issue Feb 16, 2020 · 24 comments
Open

Parsing and unparsing leads to 1h time shift? #17

lindig opened this issue Feb 16, 2020 · 24 comments

Comments

@lindig
Copy link

lindig commented Feb 16, 2020

I assume I'm doing something wrong here but I am baffled by this behaviour:

utop # ISO8601.Permissive.datetime_tz "2020-02-14T06:56:42Z";;
- : float * float option = (1581667002., Some 0.)

utop # ISO8601.Permissive.datetime_tz "2020-02-14T06:56:42Z" |> fst |> ISO8601.Permissive.string_of_datetime;;
- : string = "2020-02-14T07:56:42"

The time printed is shifted by one hour from the time that was parsed. How would I prevent that?

@lindig
Copy link
Author

lindig commented Feb 16, 2020

The timestamp 1581667002 refers to Friday, 14 February 2020 07:56:42 GMT+00:00. So it appears to me that the unparsing (printing) is correct but that the parsing of 2020-02-14T06:56:42Z = 1581667002 is not correct.

@c-cube
Copy link
Member

c-cube commented Feb 17, 2020

Is that on master? I think on master there's some changes to make it correct in #5 which has been rebased, but it caused a (small) regression in toml. Could you try on #5?

@lindig
Copy link
Author

lindig commented Feb 17, 2020

This was not on master but on 0.2.6. I'll try on master.

@lindig
Copy link
Author

lindig commented Feb 17, 2020

I believe this happens on master as well. Running the tests on master (on macOS) I also see a test failure:

:ISO8601.ml $ make test
Done: 48/50 (jobs: 1).........................................................F...
==============================================================================
Failure: ISO8601:2:[UTILS]:1:[UTILS mkdate]:4:583804800.

expected: 583804800. but got: 583808400.
------------------------------------------------------------------------------
Ran: 61 tests in: 0.04 seconds.
FAILED: Cases: 61 Tried: 61 Errors: 0 Failures: 1 Skip:  0 Todo: 0 Timeouts: 0.
ISO8601_TEST alias tests/runtest (exit 1)

I tried to use the branch from #5 but it is quite out of date (using ocamlbuild). So far did not try it.

@c-cube
Copy link
Member

c-cube commented Feb 17, 2020

Ah my bad, what about wip-merge-pr-5 (which is the rebase of #5)?

@lindig
Copy link
Author

lindig commented Feb 17, 2020

On branch wip-merge-pr-5 the issue no longer exists. Running make test I see some test failures.

utop # ISO8601.Permissive.datetime_tz "2020-02-14T06:56:42Z" |> fst |> ISO8601.Permissive.string_of_datetime;;
- : string = "2020-02-14T06:56:42"

@c-cube
Copy link
Member

c-cube commented Feb 17, 2020

Interesting, I don't see a test error on my end. I'd like to merge this at some point but am not confident enough in its correctness. Frankly, help would be very appreciated.

@undu
Copy link

undu commented Feb 19, 2020

Tests only work for me on branch wip-merge-pr-5 when adding --profile=release to dune runtest

@lindig
Copy link
Author

lindig commented Feb 20, 2020

This looks suspicious (on master):

src/ISO8601_lexer.mll:    let offset = fst (Unix.mktime (Unix.gmtime 0. )) 

This computes an offset from the date of the epoch and applies it. For me this is

utop # 
fst (Unix.mktime (Unix.gmtime 0.));;
- : float = -3600.

I don't understand what it tries to do.

@c-cube
Copy link
Member

c-cube commented Feb 20, 2020

well, me neither. I think it's different on #5 ?

@undu
Copy link

undu commented Feb 20, 2020

What's the timezone of your system set to?

val mktime : tm -> float * tm

Convert a date and time, specified by the tm argument, into a time in seconds, as returned by Unix.time. The tm_isdst, tm_wday and tm_yday fields of tm are ignored. Also return a normalized copy of the given tm record, with the tm_wday, tm_yday, and tm_isdst fields recomputed from the other fields, and the other fields normalized (so that, e.g., 40 October is changed into 9 November). The tm argument is interpreted in the local time zone.

@lindig
Copy link
Author

lindig commented Feb 20, 2020

: tmp $ timedatectl 
               Local time: Thu 2020-02-20 17:02:47 GMT
           Universal time: Thu 2020-02-20 17:02:47 UTC
                 RTC time: Thu 2020-02-20 17:02:47
                Time zone: Europe/London (GMT, +0000)
System clock synchronized: yes
              NTP service: inactive
          RTC in local TZ: no

@undu
Copy link

undu commented Feb 20, 2020

the Stdlib does a system call, don't see anything egregious at first sight.

@sagotch
Copy link
Collaborator

sagotch commented Feb 21, 2020

mktime is interpreted in local timezone, so this is meant to compute the offset between local timezone and UTC, and suppress this offset from the given date in order to get the UTC time.

On my computer, I am at GMT+1, do I have to add 3600 seconds to get the correct UTC time from mktime.

$ timedatectl
               Local time: Fri 2020-02-21 09:32:22 CET
           Universal time: Fri 2020-02-21 08:32:22 UTC
                 RTC time: Fri 2020-02-21 08:32:22
                Time zone: Europe/Paris (CET, +0100)
System clock synchronized: yes
              NTP service: active
          RTC in local TZ: no
utop # fst (Unix.mktime (Unix.gmtime 0.));;
- : float = -3600.

EDIT: In #5, I see that it has been removed. It would be nice to test the library in different time zones. It might give good results in UK but false ones in other places.
https://github.com/ocaml-community/ISO8601.ml/pull/5/files#diff-4c4ab45718b4810726f53bea3f21b0a9R6

@lindig
Copy link
Author

lindig commented Feb 21, 2020

As you will have noticed, I'm in GMT/UTC and still get the same -3600 offset.

@sagotch
Copy link
Collaborator

sagotch commented Feb 21, 2020

As you will have noticed, I'm in GMT/UTC and still get the same -3600 offset.

Yes indeed, but according to what documentation says, you should not. Unless I misunderstood the documentation.

@lindig
Copy link
Author

lindig commented Feb 21, 2020

I had asked @undu and he sees it on his system (also in GMT/UTC) as well. And given that this is coming from a system call, I am reluctant to believe that this is a bug in Unix.gmtime.

@kit-ty-kate
Copy link

kit-ty-kate commented Jul 26, 2020

A similar issue seems to appear during the tests of toml.5.0.0 (which uses this library) on opam-repository's CI.

- expected: title = "TOML Example"
[...]
- dob = 1979-05-27T08:32:00+00:00
[...]
- 
- but got: title = "TOML Example"
[...]
- dob = 1979-05-27T07:32:00+00:00
[...]

From what I understand of the discussion it seems to be a known issue so I'm just going to disable the tests for this package.

kit-ty-kate added a commit to kit-ty-kate/opam-repository that referenced this issue Jul 26, 2020
@darrenldl
Copy link

darrenldl commented Apr 8, 2021

Heard about this issue from Discord and made some comments over there, thought I'd copy the gist of my observation over here.

My speculation is that line 6 let offset = fst (Unix.mktime (Unix.gmtime 0. )) in in ISO8601_lexer.mll fails to take into account whether the local time zone was observing DST at the Unix epoch time, thus resulting in off-by-one-hour "base" offset calculation. This would seem to match @lindig's experience with Europe/London time zone, which observes DST for the entirety of 1970.

A straightforward fix would be to tune said offset calculation according to whether DST was being observed at the time of Unix epoch. But this would still assume the base offset remains unchanged over the years, which might hold for contemporary periods, but I'll need to verify.

EDIT: I haven't read any of the PR's in detail, so I'm not sure if there's already a fix somewhere.

@darrenldl
Copy link

darrenldl commented Apr 8, 2021

My above observation as it turns out was somewhat getting close, but not entirely correct.

I made some code to test if any of the time zones observe changed "base" offset, and a lot of them failed, which includes Europe/London:

let l =
  let open Timere in
  let unix_epoch = 0L in
  (* Time_zone.available_time_zones *)
  [ "Europe/London" ]
  |> List.map (fun name ->
      let tz = Time_zone.make_exn name in
      (name,
       Time_zone.Raw.to_transitions tz
       |> List.filter (fun ((start, end_exc), _entry) ->
            unix_epoch < end_exc
          )
       |> List.map (fun ((start, end_exc), Time_zone.{ is_dst; offset }) ->
            (* Printf.printf "start: %Ld, end_exc: %Ld, is_dst: %b, offset: %d\n" start end_exc is_dst offset; *)
            offset - if is_dst then 3600 else 0
          )
       |> List.sort_uniq compare
      )
     )
  |> List.filter (fun (name, offsets) -> List.length offsets > 1)

gives

[("Europe/London", [0; 3600])]

indicating two unique "base" offsets.

In other words, a "base" offset does not seem to be necessarily well defined even if we only start at Unix epoch (1970 Jan 1 00:00:00 UTC). Examining the incongruence further, we can indeed observe that Europe/London "breaks" the link between isdst, base offset, and actual offset during the period around 1970 (not always the case that actual offset = base offset + if isdst then 3600 else 0):

via zdump command (lines of interest are marked with <== on the right hand side)

$ zdump -V -c 1968,1974 Europe/London
Europe/London  Sun Feb 18 01:59:59 1968 UT = Sun Feb 18 01:59:59 1968 GMT isdst=0 gmtoff=0
Europe/London  Sun Feb 18 02:00:00 1968 UT = Sun Feb 18 03:00:00 1968 BST isdst=1 gmtoff=3600
Europe/London  Sat Oct 26 22:59:59 1968 UT = Sat Oct 26 23:59:59 1968 BST isdst=1 gmtoff=3600
Europe/London  Sat Oct 26 23:00:00 1968 UT = Sun Oct 27 00:00:00 1968 BST isdst=0 gmtoff=3600 <==
Europe/London  Sun Oct 31 01:59:59 1971 UT = Sun Oct 31 02:59:59 1971 BST isdst=0 gmtoff=3600 <==
Europe/London  Sun Oct 31 02:00:00 1971 UT = Sun Oct 31 02:00:00 1971 GMT isdst=0 gmtoff=0
Europe/London  Sun Mar 19 01:59:59 1972 UT = Sun Mar 19 01:59:59 1972 GMT isdst=0 gmtoff=0
Europe/London  Sun Mar 19 02:00:00 1972 UT = Sun Mar 19 03:00:00 1972 BST isdst=1 gmtoff=3600
Europe/London  Sun Oct 29 01:59:59 1972 UT = Sun Oct 29 02:59:59 1972 BST isdst=1 gmtoff=3600
Europe/London  Sun Oct 29 02:00:00 1972 UT = Sun Oct 29 02:00:00 1972 GMT isdst=0 gmtoff=0
Europe/London  Sun Mar 18 01:59:59 1973 UT = Sun Mar 18 01:59:59 1973 GMT isdst=0 gmtoff=0
Europe/London  Sun Mar 18 02:00:00 1973 UT = Sun Mar 18 03:00:00 1973 BST isdst=1 gmtoff=3600
Europe/London  Sun Oct 28 01:59:59 1973 UT = Sun Oct 28 02:59:59 1973 BST isdst=1 gmtoff=3600
Europe/London  Sun Oct 28 02:00:00 1973 UT = Sun Oct 28 02:00:00 1973 GMT isdst=0 gmtoff=0

via TZ=Europe/London utop

utop # Unix.mktime (Unix.gmtime 0.);;
- : float * Unix.tm =
(-3600.,
 {Unix.tm_sec = 0; tm_min = 0; tm_hour = 0; tm_mday = 1; tm_mon = 0;
  tm_year = 70; tm_wday = 4; tm_yday = 0; tm_isdst = false})

So at least for Europe/London specifically, checking for isdst in base offset calculation would not have fixed the issue.

Namely we have a case where sampling at Unix epoch does not yield a useful base offset.

I'll add that my initial observation was based on information from https://www.timeanddate.com/time/change/uk/london where it reads "DST observed all year" for 1969 and 1970, but this doesn't seem to line up properly with the isdst flag (unless I'm misinterpeting the outputs attached above).

@darrenldl
Copy link

A small update:

I've patched ISO8601 to use Timere on my fork at try-timere branch https://github.com/darrenldl/ISO8601.ml/tree/try-timere ,
and it passes all 57 tests.

Timere does bring in a lot more complexity and dependencies, so I am not entirely sure if pushing toward using Timere under the hood is necessarily a good option for all users, but it doesn't seem to be a bad tradeoff if having difficult to resolve bugs is the only other option.

@lindig
Copy link
Author

lindig commented Dec 27, 2021

The actual Unix mktime call takes 3 possible values for the tm_isdst field: 0, 1, or -1. Looking at the OCaml bindings, the bool value from Unix.mktime is ignored (but at some time in the past wasn't):

This causes the Unix implementation to guess DST, which is almost always possible except for the hour when DST comes in or out of effect.

I believe we should not guess any offset but just accept the result of Unix.mktime as the correct timestamp for the argument. Note that Unix.mktime returns an updated tm structure in addition to the timestamp.

@lindig
Copy link
Author

lindig commented Dec 27, 2021

I suspect there is some misconception about the Unix epoch at play here: On a Linux system the timestamp for Jan 01 1970 is not zero, as we (and the unit tests) expect:

lindig@pi:~ $ cat /etc/timezone 
Europe/London
lindig@pi:~ $ date -d 'Jan 01 1970 00:00:00' +%s
-3600

This agrees with what I see on macOS as well. The timestamp at the epoch is not zero.

lindig added a commit to lindig/ISO8601.ml that referenced this issue Dec 27, 2021
The current implmentation calculates a an offset based on Daylight
Saving Time being in effect at the Epoch:

  t -. offset +. (if tm.Unix.tm_isdst then 3600. else 0.)

This leads to suprises in the Europe/London timezone where unparsing and
parsing a date leads to an offset - see this issue:

  ocaml-community#17

The following change eliminates this and makes all unit tests pass.
Somewhat surprisingly the knowledge about this offset is encoded both in
the implementation and the unit tests.

Likewise surprising, the offset is not zero in the Europe/London
timezone but it seems to be consistent over various Unix
implementations.

Signed-off-by: Christian Lindig <[email protected]>
@lindig
Copy link
Author

lindig commented Jan 2, 2022

The failure of the unit test can be easily reproduced (on macOS for me, on two systems):

TZ=Europe/London make test

There is no unit test failure for TZ=US/Eastern, TZ=US/Central, or TZ=Europe/Berlin. Is Europe/London failing universally or could this be a macOS problem?

Looking at @darrenldl's comment I agree that looking at the UTC offset at timestamp 0 is not helpful because it is wrong for London and this is coming from the underlying time zone information in the OS and not the OCaml implementation.

Europe/London  Sat Oct 26 23:00:00 1968 UT = Sun Oct 27 00:00:00 1968 BST isdst=0 gmtoff=3600
Europe/London  Sun Oct 31 01:59:59 1971 UT = Sun Oct 31 02:59:59 1971 BST isdst=0 gmtoff=3600
Europe/London  Sun Oct 31 02:00:00 1971 UT = Sun Oct 31 02:00:00 1971 GMT isdst=0 gmtoff=0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants