Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Duration and shift/2 for calendar types #13385

Merged
merged 97 commits into from
Apr 4, 2024

Conversation

tfiedlerdejanze
Copy link
Contributor

@tfiedlerdejanze tfiedlerdejanze commented Mar 2, 2024

https://groups.google.com/g/elixir-lang-core/c/-el4xZbA9sw

Preface

We currently have add/2-3 to manipulate calendar types in the standard library. These functions allow adding a specified amount of time of given unit to a date/time. The standard library currently misses means to apply more complex, or logical durations to calendar types. e.g. adding a month, a week, or one month and 10 days to a date.

Reasons for it

While similar functionality exists in libraries, such as CLDR, Timex, Tox, adding this functionality to the standard library has already been requested and discussed at multiple occasions over the past years. To list a few examples:

Furthermore the shift behaviour in the extremely popular library Timex changed in Elixir >= 1.14.3 which may have complicated the mostly lean and non-breaking language upgrade Elixir has to offer.

Elixir has a great set of modules and functions that deal with date and time, the APIs are consistent and shift/2-3 should fit right in, solving many standard needs of various industries, be it for reporting, appointments, events, scheduling, finance... the list goes on, engineers probably face the need to shift time logically more often than not in their careers.

  • shifting date times is a common use-case (supported at the core of many programming languages)
  • shift/2-3 integrates with the calendar behaviour, enabling libraries to adapt the shift behaviour for non-iso calendars
  • many commonly used libraries implement their own Duration types which could be replaced or derived from a type in the standard library. There is potential for the Duration type to be applied to other data types in the ecosystem
  • having this functionality in the standard library provides stability, making it less likely that a change to the calendar types breaks the shift implementation in other libraries
  • libraries that provide a variant of shift/2-3 often include much more functionality that isn't needed in most projects. This complicates the search to find the right library for the project and contributes to "dependency bloat"

Technical details

Duration
A date or time must be shifted by a duration. There is an ISO8601 standard for durations, which the initial implementation is loosely following. The structure of a Duration lives in its own module with its own set of functions to create and manipulate durations. One example of where it diverts from the ISO standard, is that it implements microseconds. Microseconds in a duration are stored in the same format as in the time calendar types, meaning they integrate well and provide consistency.

The new Duration type comes with a set of arithmetic functions: add/2, subtract/2, multiply/2, negate/1. Up to discussion if we need divide/2 as well.

As of time of the proposal, duration units are intentionally singular. While not uncommon to use the plural form of years, days, etc. in the context of durations, we want to be consistent with the definition of time units in Elixir in general. These are singular as used in NaiveDateTime.add/3 or defined in types like System.time_unit

Shift
The shift behaviour is implemented as a callback on Calendar and supported by all calendar types: Date, DateTime, NaiveDateTime and Time. Date, Time and NaiveDateTime each have their own implementation of a "shift", while DateTime gets converted to a NaiveDateTime before applying the shift, and is then rebuilt to a DateTime in its original timezone. shift/2-3 also has guaranteed output types (which isn't a given in many libraries) and follows the consistent API which is established in the calendar modules.

Benchmarks

Results
Operating System: macOS
CPU Information: Apple M1 Pro
Number of Available Cores: 10
Available memory: 16 GB
Elixir 1.17.0-dev
Erlang 25.1
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 1 s
time: 5 s
memory time: 2 s
reduction time: 2 s
parallel: 1
inputs: none specified
Estimated total run time: 1 min 40 s

Benchmarking Date.shift ...
Benchmarking DateTime.shift ...
Benchmarking DateTime.shift_non_utc ...
Benchmarking FamousLibrary.shift_date ...
Benchmarking FamousLibrary.shift_datetime ...
Benchmarking FamousLibrary.shift_datetime_non_utc ...
Benchmarking FamousLibrary.shift_naive_datetime ...
Benchmarking FamousLibrary.shift_time ...
Benchmarking NaiveDateTime.shift ...
Benchmarking Time.shift ...
Calculating statistics...
Formatting results...

Name                                           ips        average  deviation         median         99th %
Time.shift                                  2.93 M        0.34 μs  ±4539.63%        0.33 μs        0.46 μs
Date.shift                                  2.56 M        0.39 μs  ±6350.41%        0.29 μs        0.46 μs
NaiveDateTime.shift                         1.56 M        0.64 μs  ±3767.03%        0.54 μs        0.88 μs
DateTime.shift                              1.33 M        0.75 μs  ±3524.26%        0.63 μs        0.79 μs
FamousLibrary.shift_time                    1.09 M        0.92 μs  ±3084.96%        0.67 μs        1.38 μs
FamousLibrary.shift_date                    0.60 M        1.67 μs  ±1111.06%        1.29 μs           3 μs
FamousLibrary.shift_datetime                0.33 M        3.00 μs   ±594.36%        2.50 μs        5.04 μs
FamousLibrary.shift_naive_datetime          0.33 M        3.07 μs   ±483.01%        2.63 μs           5 μs
DateTime.shift_non_utc                      0.24 M        4.17 μs   ±389.34%        3.88 μs        5.38 μs
FamousLibrary.shift_datetime_non_utc      0.0715 M       13.98 μs   ±187.69%       12.04 μs       49.46 μs

Comparison:
Time.shift                                  2.93 M
Date.shift                                  2.56 M - 1.14x slower +0.0488 μs
NaiveDateTime.shift                         1.56 M - 1.88x slower +0.30 μs
DateTime.shift                              1.33 M - 2.19x slower +0.41 μs
FamousLibrary.shift_time                    1.09 M - 2.70x slower +0.58 μs
FamousLibrary.shift_date                    0.60 M - 4.88x slower +1.32 μs
FamousLibrary.shift_datetime                0.33 M - 8.78x slower +2.66 μs
FamousLibrary.shift_naive_datetime          0.33 M - 8.97x slower +2.72 μs
DateTime.shift_non_utc                      0.24 M - 12.20x slower +3.83 μs
FamousLibrary.shift_datetime_non_utc      0.0715 M - 40.93x slower +13.64 μs

Memory usage statistics:

Name                                    Memory usage
Time.shift                                   1.16 KB
Date.shift                                   0.80 KB - 0.70x memory usage -0.35156 KB
NaiveDateTime.shift                             2 KB - 1.73x memory usage +0.84 KB
DateTime.shift                               2.19 KB - 1.89x memory usage +1.03 KB
FamousLibrary.shift_time                     1.41 KB - 1.22x memory usage +0.26 KB
FamousLibrary.shift_date                     3.60 KB - 3.11x memory usage +2.45 KB
FamousLibrary.shift_datetime                 5.15 KB - 4.45x memory usage +3.99 KB
FamousLibrary.shift_naive_datetime           5.58 KB - 4.82x memory usage +4.42 KB
DateTime.shift_non_utc                       5.39 KB - 4.66x memory usage +4.23 KB
FamousLibrary.shift_datetime_non_utc         8.09 KB - 6.99x memory usage +6.93 KB

Reduction count statistics:

Name                                 Reduction count
Time.shift                                       103
Date.shift                                        94 - 0.91x reduction count -9
NaiveDateTime.shift                              212 - 2.06x reduction count +109
DateTime.shift                                   224 - 2.17x reduction count +121
FamousLibrary.shift_time                         133 - 1.29x reduction count +30
FamousLibrary.shift_date                         290 - 2.82x reduction count +187
FamousLibrary.shift_datetime                     473 - 4.59x reduction count +370
FamousLibrary.shift_naive_datetime               485 - 4.71x reduction count +382
DateTime.shift_non_utc                           552 - 5.36x reduction count +449
FamousLibrary.shift_datetime_non_utc            1143 - 11.10x reduction count +1040
shift_bench.exs
Mix.install([:famous_library, :tzdata, :benchee])
Application.put_env(:elixir, :time_zone_database, Tzdata.TimeZoneDatabase)

date = ~D[2000-01-01]
time = ~T[00:00:00]

datetime = DateTime.new!(date, time)
datetime_non_utc = DateTime.new!(date, time, "Asia/Shanghai")

naive_datetime = NaiveDateTime.new!(date, time)

datetime_opts = [year: 1, month: 14, hour: 8, minute: 30, second: 40]
date_opts = [year: 1, month: 14, week: 3, day: 2]
time_opts = [hour: 8, minute: 30, second: 40]

to_famous_opts = fn opts ->
  Keyword.new(opts, fn {key, value} -> {:"#{key}s", value} end)
end

Benchee.run(
  %{
    "FamousLibrary.shift_datetime" => fn -> FamousLibrary.shift(datetime, to_famous_opts.(datetime_opts)) end,
    "FamousLibrary.shift_datetime_non_utc" => fn -> FamousLibrary.shift(datetime_non_utc, to_famous_opts.(datetime_opts)) end,
    "FamousLibrary.shift_naive_datetime" => fn -> FamousLibrary.shift(naive_datetime, to_famous_opts.(datetime_opts)) end,
    "FamousLibrary.shift_date" => fn -> FamousLibrary.shift(date, to_famous_opts.(date_opts)) end,
    "FamousLibrary.shift_time" => fn -> FamousLibrary.shift(time, to_famous_opts.(time_opts)) end,
    "DateTime.shift" => fn -> DateTime.shift(datetime, datetime_opts) end,
    "DateTime.shift_non_utc" => fn -> DateTime.shift(datetime_non_utc, datetime_opts) end,
    "NaiveDateTime.shift" => fn -> NaiveDateTime.shift(naive_datetime, datetime_opts) end,
    "Date.shift" => fn -> Date.shift(date, date_opts) end,
    "Time.shift" => fn -> Time.shift(time, time_opts) end
  },
  warmup: 1,
  time: 5,
  memory_time: 2,
  reduction_time: 2
)

Outlook

After adding the Duration type and shift behaviour to the standard library, the following things could be explored and derived from the initial work:

  • Implementing a protocol that allows Duration to be applied to any data type, not just dates and times.
  • Implement calendar callbacks for durations: duration_to_seconds/1 and duration_from_seconds/1.
  • Implement more utility functions for Duration like Duration.between/2
  • Implement a range-like data type that allows us to do recurring constructs on any data type. For example, Duration.interval(~D[2000-01-01], month: 1), when iterated, would emit {:ok, date} | {:error, start, duration, reason} entries
  • Implement a sigil for easy creation of durations: ~P[3 hours and 10 minutes] or respecting the ISO format ~P[T3H10M]
  • Refactoradd/2-3 to reuse the calendar shift functions
  • Support for shift options, e.g. Date.shift(~D[2021-02-28], [month: 1], round_month: true) to result in ~D[2021-03-31]

Reasons against it

While I am convinced that adding shift/2-3 to the standard library would be very beneficial, nothing really speaks against the points mentioned above to be implemented in a library instead. However, something as crucial and central as date/time manipulation should still be part of the standard library, negating the risk of breaking changes, inconsistent behaviour and outdated or too unique ergonomics which aren't widely applicable, unlike what should be part of the standard library.

@josevalim
Copy link
Member

Hi folks, I have been thinking about this and, the more I think about it, the more I wonder if we should instead introduce a proper Duration type. Ecto could benefit from such a type. As well as Explorer.

The Duration module could also provide regular arithmetic operations between durations which are well defined (+, - and *), it could provide a protocol for applying the duration to any data type, as well as the ability to provide random ranges (aka intervals) given a start value and a duration (ranges can either be multiplicative or additive).

The specification of durations can also be according to the iso standard: https://en.wikipedia.org/wiki/ISO_8601#Durations, including a ~P sigil. Although I would skip floating points and weeks from the initial implementation. It probably makes sense to first scratch this itch as a library though.

Thoughts?

@kipcole9
Copy link
Contributor

kipcole9 commented Mar 2, 2024

Hi folks, I have been thinking about this and, the more I think about it, the more I wonder if we should instead introduce a proper Duration type. Ecto could benefit from such a type. As well as Explorer.

Yes please. I implemented Cldr.Duration which might help kick-start things. I also have a full ISO8601 duration parser as part of Tempo in case thats useful to anyone.

@tfiedlerdejanze tfiedlerdejanze changed the title Date.shift/2 for ISO dates Implement Date.shift/2 in Calendar.ISO Mar 2, 2024
@tfiedlerdejanze tfiedlerdejanze changed the title Implement Date.shift/2 in Calendar.ISO Implement Date.shift/2 for Calendar.ISO Mar 2, 2024
@josevalim
Copy link
Member

Hi @tfiedlerdejanze. After more pondering, I think we should go with a duration datatype. The datatype will provide a general abstraction, including ranges, that works with several datatypes. So instead of adding several functions to four modules (Date, Time, NaiveDateTime, DateTime), we can add them to a single one.

In any case, I wanted to thank you for the PR. We have been thinking about adding Date.shift for quite a while and, once we see the code, it really helps move the conversation forward and make us ask the right questions. This would not have happened without your work. ❤️

@josevalim josevalim closed this Mar 3, 2024
@tfiedlerdejanze
Copy link
Contributor Author

tfiedlerdejanze commented Mar 3, 2024

Thank you @josevalim and @kipcole9, appreciate the reviews and thoughts made around that topic! I like the idea of the Duration type, and actually have needed this sort of shift behaviour mostly while working with structures like Postgrex.Interval, which appears to close the loop to that idea :)

I am wondering about what the Duration api would actually look like though. One of the gripes i had with libraries implementing similar shift behaviours, was that they exposed a single interface to receive and return the same date/time struct, which often makes it hard to judge with what kind of date/time struct one is working with.
One of the things I like about the add/2 and add/3 functions is that there is none of this ambiguity. Something like Duration.shift/2 which receives a duration struct and one of date, datetime, naivedatetime could result in the same ambiguity i was looking to escape.
Also, i was wondering how the api would look for something equivalent to Date.shift(date, year: 2, month: -4) given durations tend to be non-negative, would this need to be computed using two durations, one added and one substracted?

Sorry for all the questions, if you were open to discuss the topic elsewhere in more detail, or maybe even share part of the vision, i'd love to contribute where possible!

@josevalim
Copy link
Member

I think your point on API is valid, and maybe that's argument to add a {Date,Time,...}.shift in the future. However, it is worth pointing out that Date is still a bit loose in that it allows downcasts:

iex(1)> Date.add ~N[0001-01-01 00:00:00], 1
~D[0001-01-02]

But the output type is indeed clearer.

Regarding positive/negative, I found this discussion to be interesting: moment/moment#2408. It seems to be a common needed and other libraries made extensions to allow it.

Perhaps a better way to frame it is: Duration will give us a better foundation. But we might still add Date/Time/DateTime/NaiveDateTime.shift for readability and its downcast properties, on top of the Duration foundation. This (and the need for Calendar.shift_date and Calendar.shift_naive_date_time) may be enough reasons to add the feature back to Elixir.

@kipcole9
Copy link
Contributor

kipcole9 commented Mar 3, 2024

Negative time scale components are permitted per ISO-8601-2. Here's a screenshot of some examples from the standard (which I purchased). So it is part of the standard, just part of ISO-8601-2 which itself is confusingly named "Extensions".

image

@tfiedlerdejanze
Copy link
Contributor Author

Negative time scale components are permitted per ISO-8601-2

Thanks for sharing that, glad to read it's generally supported!

But the output type is indeed clearer.

This is actually the most important to me and one of the many things the standard library really shines at.

Daring suggestion, how about moving towards supporting Date.shift/2 while keeping the broader purpose of a Duration type in mind:

  1. Add lib/calendar/duration.ex only defining the type as specified by iso8601
  2. Use the Duration type in Calendar.shift_date/2 (it might still be useful to implement the shifting behaviour in the Calendar)
  3. Change what is in this PR so a Date can also be shifted by time duration attributes.
  4. Implement Calendar.shift_naive_datetime/2

In parallel to the above, the Duration protocol could be refined + defined to then partially replace the shift implementations as they already work with Duration.t()

@josevalim
Copy link
Member

Daring suggestion, how about moving towards supporting Date.shift/2 while keeping the broader purpose of a Duration type in mind:

Unfortunately I am worried this may leave us with an outdated or wrong API. For example, I brought we may need to return {:ok, ...} | {:error, ...} because of DateTime. It is hard to say all other decisions we may miss if we solve just one particular problem instead of looking at the whole. Plus, I would rather prefer to pass a Duration to Calendar.ISO.shift_date, if we ever have such a type, rather than a keyword list.

@tfiedlerdejanze
Copy link
Contributor Author

tfiedlerdejanze commented Mar 3, 2024

Plus, I would rather prefer to pass a Duration to Calendar.ISO.shift_date, if we ever have such a type, rather than a keyword list.

That's my suggestion in 2). It'd be Calendar.ISO.shift_date(Calendar.date(), Calendar.Duration.t()) after we implemented the bare type as

defstruct [year: 0, month: 0, day: 0, hour: 0, minute: 0, second: 0, microsecond: 0]

@type t :: %Calendar.Duration{
        year: integer(),
        month: integer(),
        week: integer(),
        day: integer(),
        hour: integer(),
        minute: integer(),
        second: integer(),
        microsecond: integer()
      }

in lib/calendar/duration.ex. I acknowledge the danger of missing something else though. Would you be opposed if i gave this approach a shot, maybe including NaiveDateTime.shift/2 and DateTime.shift/2? In the worst case it may reveal other things we missed before tackling the issue from another angle.

@josevalim
Copy link
Member

You are definitely welcome to give it a shot but I just want to be clear that it is unlikely we would merge it at this stage. But I agree that we are all likely to learn from it.

For example, just in the proposal above, I can think of:

  1. Should unset fields be zero or nil? For example, if I specify Time.shift(time, years: 0), should it raise?
  2. Are fields singular or plural?
  3. Should we support weeks? cc @kipcole9

@tfiedlerdejanze
Copy link
Contributor Author

tfiedlerdejanze commented Mar 3, 2024

if I specify Time.shift(time, years: 0), should it raise?

Given the plan would be to add the Duration type first i think all fields should be required, as the spec would be: shift(Time.t(), Duration.t()) :: {:ok, Time.t()}.

Are fields singular or plural?

For consistency with the add functions, i'd keep the duration fields singular.

Should we support weeks

imho weeks on Duration could be supported. It should be up to the calendars and/or other implementations building on top of Duration to decide if they support it or not.

@josevalim
Copy link
Member

Ok, let me slightly rephrase the question. Imagine we add a ~P sigil to follow ISO. Should this be valid?

Time.shift(time, ~P"0YT1M")

Or should that be the same as:

Time.shift(time, ~P"T1M")

imho weeks on Duration could be supported. It should be up to the calendars and/or other implementations building on top of Duration to decide if they support it or not.

That would be confusing in my opinion unless the huge majority of types can handle weeks properly, which @kipcole9 would be able to tell us. But I'd think so. :)

@kipcole9
Copy link
Contributor

kipcole9 commented Mar 3, 2024

Time scale unit values of 0

ISO 8601 is strangely silent on the issue of 0 time scale components. But since it says explicitly:

each durational unit represented by a time scale component (such as [monthE]) may be omitted.

Therefore I think that Time.shift(time, ~P"0YT1M") == Time.shift(time, ~P"T1M").

Support :week in a duration

I think :week is reasonable to support. All contemporary calendars define a week to be 7 days. Thats not true historically - the French Revolutionary Calendar has "weeks" of 10 days long - but they aren't called "weeks" so that can be an escape hatch.

I see two approaches:

  1. Add a callback days_in_week/1
  2. Fix a week to mean 7 days

I would prefer (1) since its a small addition and easily understood.

Notes singular v plural

I think for consistency, the singular form is to be preferred since the rest of Calendar is singular. In Cldr.Calendar.Duration I settled on singular field names for that reason. But I ended up with plural forms in Cldr.Calendar.plus/3 because it feld a more natural expression. It was probably not the right idea.

Notes on weeks

While I think it's safe to add a :week field to Calendar.Duration, weeks in general aren't so simple. If we take the proleptic Gregorian calendar and ask "how many weeks in the year" the answer is "it depends". Week 1, by ISO standards, doesn't start on the first day of the year. And since neither 365 nor 366 is divisible by 7, the last "week" could end up being less than 7 days depending on when Week 1 starts. Then you have "week of month" concept which just exacerbates the problem.

I have all of this implemented in ex_cldr_calendars but until (or if) Elixir implements a Calendar.ISOWeek calendar I don't think :week should be part of the implementation beyond Calendar.Duration.

@tfiedlerdejanze
Copy link
Contributor Author

Yes, sorry i misread your initial question actually, unset fields should be treated as zero :) thanks for these elaborate notes @kipcole9 especially on the topic of weeks!

@tfiedlerdejanze
Copy link
Contributor Author

Looks great, really exciting to see this get into core! 💜

Appreciate the review @sabiwara!

Copy link
Member

@josevalim josevalim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tfiedlerdejanze!

This looks great and we got the go ahead to merge it. 🎉

I have dropped more suggestions regarding the docs. I also think we are missing two things in the docs:

  • We should explicitly mention that we round down. We have examples, but we should also mention it, and call it out in the example like we did for leap years by adding a tiny comment on top

  • We should have at least one example with a negative unit :) you can change one of the existing ones in each data type to be negative

Once those are in place and comments have been resolved, we can ship it!

@tfiedlerdejanze
Copy link
Contributor Author

Thanks for the suggestions and reviews everyone, all were very helpful and appreciated! I think I addressed all the latest comments, feel free to take another look.

I also have a couple more PRs in the pipeline to

  • consolidate and dry up some code we're now using for add as well as shift
  • propose an ISO 8601 formatted Duration sigil ~P

that I am looking forward to discuss! :)

@josevalim
Copy link
Member

I also have a couple more PRs in the pipeline to

Just as a heads up. We will definitely accept the consolidation (thank you!), but there is no consensus on sigil ~P at the moment. I think the ISO syntax is not helpful to be made into a sigil and we are divided between a natural language one. We should probably wait for those until we use the durations more in practice, and see if a sigil is really necessary. I just dropped a comment on how we can improve the default inspect implementation.

@josevalim josevalim merged commit 38f62a8 into elixir-lang:main Apr 4, 2024
9 checks passed
@josevalim
Copy link
Member

Thank you @tfiedlerdejanze for your patience and delivering this important feature! 💚 💙 💜 💛 ❤️

@tfiedlerdejanze tfiedlerdejanze changed the title Implement calendar callbacks for shift/2-3 on calendar types Implement Duration and shift/2 for calendar types Apr 4, 2024
gldubc pushed a commit to gldubc/elixir that referenced this pull request Apr 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

8 participants