Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add formatting API #61

Open
wants to merge 2 commits into
base: v0.6
Choose a base branch
from
Open

Add formatting API #61

wants to merge 2 commits into from

Conversation

jiripudil
Copy link
Contributor

@jiripudil jiripudil commented Jun 20, 2022

This is my take on the formatting API (#3). It adds the format() method to Local(Date|DateTime|Time) and ZonedDateTime. The method accepts an implementation of the DateTimeFormatter interface.

The API is designed pretty much like parsing, only backwards. The format() method in each date-time class creates an instance of DateTimeFormatContext that holds individual fields of the value. This context is then sent into the formatter's format() method and a formatted string is returned.

The context contains the individual fields, but also exposes the original value as a whole. This gives custom implementations finer control over the specifics of their formatting.

Built-in formatters

This PR provides two elementary implementations of a DateTimeFormatter:

  • NativeFormatter is the go-to implementation for developers used to the native PHP's formatting options and format string alphabet. It delegates to the native DateTime's format() method, making sure that the format string doesn't refer to any fields that are not available on the formatted value.

    $localDate->format(NativeFormatter::of('d.m.Y'));
    // returns '20.06.2022'
    
    $localTime->format(NativeFormatter::of('d.m.Y'));
    // throws exception, format string refers to fields not available on LocalTime
  • IntlFormatter works on top of ext-intl. It has several factory methods:

    • ofDate(), ofTime(), and ofDateTime() take a locale and one of the predefined ICU formats for date, time, or both, respectively. Again, it prevents you from formatting a LocalTime with a date-based formatter.

      $localDate->format(IntlFormatter::ofDate('en_US', IntlFormatter::LONG));
      // returns 'June 20, 2022'
      
      $localTime->format(IntlFormatter::ofDate('en_US', IntlFormatter::LONG));
      // throws exception, cannot use a date-based formatter on LocalTime
    • ofPattern() gives you the liberty to set your own custom pattern. This uses the ICU SimpleFormat a.k.a. the one Java uses:

      $localDate->format(IntlFormatter::of('en_US', 'MMMM d, y'));
      // returns 'June 20, 2022'
      
      $localTime->format(IntlFormatter::ofDate('en_US', 'MMMM d, y'));
      // throws exception, cannot use a date-based formatter on LocalTime
    • I am particularly fond of ofSkeleton() which utilizes the IntlDatePatternGenerator added in PHP 8.1. This method only requires you to specify what kind of data you want in the pattern, and it generates the most fitting pattern for you:

      $localDate->format(IntlFormatter::ofSkeleton('en_US', 'yMMMMd'));
      // returns 'June 20, 2022'
      
      $localTime->format(IntlFormatter::ofSkeleton('en_US', 'yMMMMd'));
      // throws exception, cannot use a date-based formatter on LocalTime

Parity with parser

I was shortly considering sharing the same classes for formatting and parsing, like Java does. I decided not to, though.

I've skimmed through the several discussions in ECMAScript's Temporal API proposal and the outcome appears to be that parsing non-standard localized date strings is fragile, undeterministic guesswork and should be left to the application code that knows best what formats it works with. And there's already PatternParser for that.

While we technically could write a strict parser that works on top of DateTime::createFromFormat or IntlDateFormatter::parse, it would add additional difficulties such as splitting the result of these method calls to a set of fields in DateTimeParseResult. We can revisit the idea later if it is desired, and introduce it as a new DateTimeParser implementation.


public function hasField(string $name): bool
{
return isset($this->fields[$name]) && $this->fields[$name];

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does && $this->fields[$name]; check for? a non-empty-string?
maybe better to be explicit for readability? && '' !== $this->fields[$name];

}
}

return '';

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return null here instead? empty string feels a bit odd, unless i'm missing something

public function getOptionalField(string $name): string
{
if (isset($this->fields[$name])) {
if ($this->fields[$name]) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here? '' !== $this->fields[$name] ?


public function addField(string $name, string $value): void
{
$this->fields[$name][] = $value;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not prevent adding an empty $value here? is there any reason to allow it when it seems they are filtered out below?


foreach ($this->supportedValueTypes as $supportedValueType) {
if ($value instanceof $supportedValueType) {
return $value->toNativeDateTimeImmutable()->format($this->format);
Copy link
Contributor

@solodkiy solodkiy Jun 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prior php 8.0 format could return false on some error, so I think we should convert this case to Exception

/**
* @return list<string>
*/
private static function getSupportedValueTypes(string $format): array
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May we make it public in some Helper class to add unit tests to this method?
It looks tricky.

Copy link
Contributor

@tigitz tigitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! It's definitely a big step towards having a proper formatting API.

I agree feature partiy with Java API is tricky and could be implemented in a second step anyway if there's enough traction for it. I know @BenMorel has strong opinions about matching the Java API as much as possible so let's see what he thinks about it.

Also from a DX perspective it could become daunting to type the Formatter class every time. I'm wondering if a toNativeFormat and toIntlFormat could be considered.

tests/Formatter/DateTimeFormatContextTest.php Show resolved Hide resolved
$this->assertSame('159', $context->getField(Field\DayOfYear::NAME));
$this->assertSame('23', $context->getField(Field\WeekOfYear::NAME));
$this->assertSame('6', $context->getField(Field\MonthOfYear::NAME));
$this->assertSame('2022', $context->getField(Field\Year::NAME));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't a getFields here be helpful? You could then do a in_array test here instead of testing fields that are present and some that shouldn't be.

Comment on lines +117 to +119
if ($value === '') {
throw new DateTimeFormatException(sprintf('Field %s is not present in the formatting context.', $name));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't seem to be tested.

Also, can't hasField be used here?

}

if (in_array($character, ['G', 'y', 'Y', 'u', 'U', 'r', 'Q', 'q', 'M', 'L', 'w', 'W', 'd', 'D', 'F', 'g', 'E', 'e', 'c'], true)) {
$supportedTypesMap[LocalTime::class] = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm reading the code correctly, if a s Y pattern is passed, this line will consider the time part not supported when reaching the Y character? While it should be because there's the s before in the pattern right?

Shouldn't you initialize the supported types array as false and put a true value instead when parsing a character?

$supportedTypesMap[LocalTime::class] = false;
}

if (in_array($character, ['a', 'h', 'H', 'k', 'K', 'm', 's', 'S', 'A'], true)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick, I would put these arrays of characters in named constant to ease readability

}

/**
* @return list<string>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be narrowed down to a specific list of class strings

$this->timeFormat = $timeFormat;
$this->pattern = $pattern;

if (! extension_loaded('intl')) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a symfony polyfill for intl that works great, I believe it it should test the existence of a specific intl class instead of the extension to allow polyfill usage

/**
* Returns a formatter with a pattern that best matches given skeleton.
*/
public static function ofSkeleton(string $locale, string $skeleton): self
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skeleton word isn't quite helpful to convey the meaning of what’s doing here IMO.

Could it be closer to what's used underneath? Like ofGeneratorPattern or something better?

public static function ofSkeleton(string $locale, string $skeleton): self
{
if (PHP_VERSION_ID < 80100) {
throw new DateTimeFormatException('IntlFormatter::ofSkeleton() is only available in PHP 8.1 and above.');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__FUNCTION__ could be used

$pattern = $generator->getBestPattern($skeleton);

if ($pattern === false) {
throw new DateTimeFormatException('Failed to resolve the best formatting pattern for given locale and skeleton.');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing locale and pattern in message that throw this exception could be useful I guess

@tigitz
Copy link
Contributor

tigitz commented Jul 19, 2022

@jiripudil friendly pring 🙂 , do you plan to keep working on this ?

@jiripudil
Copy link
Contributor Author

Hi, yes, sure, as soon as I'm able :) I was on a few vacations recently and now I need some time to catch up with work stuff first

@bendavies
Copy link

I had a need for this today. looks like a great PR!

@solodkiy
Copy link
Contributor

solodkiy commented Jul 4, 2023

@BenMorel what do you think about this PR? Maybe some one should help @jiripudil to make it mergeble?
I think, custom formatting is one of the things that this lib is really miss.

@jiripudil
Copy link
Contributor Author

Admittedly, this doesn't scratch my itch that much anymore, and I keep putting it off because it feels scary to get back into all of it after so much time. But I guess it's one of those things where you just need to start and then it gets done – which I'd like to do because I've already set this off, and I too believe this library and its users could benefit from a formatting API.

I'd like to hear @BenMorel's thoughts on the direction of it too. There are some great suggestions regarding the API in the review comments, such as

Also from a DX perspective it could become daunting to type the Formatter class every time. I'm wondering if a toNativeFormat and toIntlFormat could be considered.

I think that's a discussion that needs to be had before we can move this forward properly.

@BenMorel
Copy link
Member

BenMorel commented Oct 15, 2023

@jiripudil First of all a big thank you for your PR and sorry for the (huge) delay!

I took the time to review your PR today, this is excellent work, so let's move forward with this API.

I learned from your PR that the format Java Time uses is actually the ICU format, and it's super cool that php-intl already has support for it and that we don't have to re-implement it!

A bit of early feedback:

  • let's target the next version (0.6) that will require PHP 8.1 (I plan to release it before then end of the year)
    • could you please make your PR target the v0.6 branch and rebase it?
    • you can then remove the code that checks for PHP_VERSION_ID < 80100
  • instead of introducing a DateTimeFormatContext with a factory method for each supported date/time class, I'd like to move forward with Interface for any object representing an interval of time #67 first, and have each date/time class report directly the fields it supports. This will bring two improvements:
    • we won't need a DateTimeFormatContext at all
    • we should be able to make formatters more generic, and work with more date/time classes, for example YearMonth: instead of determining the date/time classes supported by each symbol, we should be able to map each symbol to a date/time field that's required for formatting!

I'd like to hear @BenMorel's thoughts on the direction of it too. There are some great suggestions regarding the API in the review comments, such as

Also from a DX perspective it could become daunting to type the Formatter class every time. I'm wondering if a toNativeFormat and toIntlFormat could be considered.

That's my point of view as well. format(DateTimeFormatter) may be left as is (but may just become a proxy for $formatter->format($this) if all objects implement a Temporal interface), but adding convenience methods such as formatNative() and formatIntlPattern() will be appreciable. Proposals for namings welcome.

A couple questions:

  • How does $locale affect IntlFormatter::ofPattern()? Should we keep this parameter, or just hardcode an arbitrary value if the locale doesn't affect the output?
  • How do $dateType and $timeType affect an IntlDateFormatter when $pattern is set to a non-empty string? From my early testing it looks like these parameters are simply ignored in this case, but this doesn't seem to be documented.

(It looks like IntlDateFormatter's constructor accepts too many parameters at the same time, and would have benefited from factory methods similar to those you created on IntlFormatter!)

@jiripudil
Copy link
Contributor Author

Hi, thanks for the feedback!

let's target the next version (0.6) that will require PHP 8.1

Ok, sure, I'll rebase the PR and update the code accordingly.

instead of introducing a DateTimeFormatContext with a factory method for each supported date/time class, I'd like to move forward with #67 first

That sounds great, looking forward to it!

How does $locale affect IntlFormatter::ofPattern()?

It does greatly, because some patterns need to be localized, such as day-of-week (eeee) or month names (LLLL).

How do $dateType and $timeType affect an IntlDateFormatter when $pattern is set to a non-empty string? From my early testing it looks like these parameters are simply ignored in this case, but this doesn't seem to be documented.

Yep, I couldn't find it documented either, but my understanding is the same as yours: they are used to create the internal formatter and assemble the initial pattern, which then gets overridden by a custom pattern if you provide one (see code).

@BenMorel
Copy link
Member

How does $locale affect IntlFormatter::ofPattern()?

It does greatly, because some patterns need to be localized, such as day-of-week (eeee) or month names (LLLL).

Could we detect if the given pattern contains symbols that need to be localized, and make $locale optional if none are detected?

Yep, I couldn't find it documented either, but my understanding is the same as yours: they are used to create the internal formatter and assemble the initial pattern, which then gets overridden by a custom pattern if you provide one (see code).

Thank you for digging into the php-src code! 🙂

@jiripudil
Copy link
Contributor Author

Could we detect if the given pattern contains symbols that need to be localized, and make $locale optional if none are detected?

Well, then there are locales that use different numeral systems (mainly Eastern Arabic numerals), so pretty much every symbol might be affected by the locale...

@jiripudil jiripudil changed the base branch from master to v0.6 October 30, 2023 12:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants