-
-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performances for __toString() methods. #85
Improve performances for __toString() methods. #85
Conversation
32644ce
to
2410d4e
Compare
Hi @gnutix, thanks a lot for the time you've spent trying new optimizations! Your tricks are clever but they hurt readability quite a lot 😕 Before we move forward, one question: are you casting to string many different objects? Could caching |
They are definitely not how anyone would write this kind of code at first, that's true. I thought about adding a comment "This code is optimized for performances" above it. And yet using str_pad is quite intuitive, and even though the conditions
At the scale we're at, we're creating/casting both new objects and the same objects multiple times a fair share, so surely it would be a great benefit too (I don't think both optimizations are exclusive though 😉 ). However, I've already given this idea a lot of thoughts, and it's the implementation that's a problem. The obvious way would be to have a private $date1 = LocalDate::of(2023, 9, 29);
$date2 = LocalDate::of(2023, 9, 29);
assertEquals($date1, $date2); // true
echo (string) $date2; // calling __toString() sets the `iso` property
assertEquals($date1, $date2); // false : the `iso` property is not set on $date1 but is on $date2... I've discovered this because I wanted to do exactly that for our A way to circumvent this downside would be to always set the A second approach would be to use a Of course, if you have another idea how we could implement this kind of cache, I'd be very happy to explore it. Meanwhile, I've opened a PR for something I knew how to fix. 🤷♂️ I hope the readability decrease is still acceptable, considering it's an implementation detail hidden in the methods. If not, I can always try to rewrite it in a style closer to |
I would do that, too.
Definitely not mutually exclusive, but I was wondering which one would affect you the most!
That's how I would have implemented it, too.
This is why I always advocate to not use loose comparison (
I don't think that's an acceptable solution either.
That's an interesting solution, where would you put the Both the
I guess it's acceptable indeed. Let me get back to it over the weekend. |
Agree, though that can get hard to avoid when you have more "integration tests" going on, and you might compare arrays of In a sense, these objects are Value Objects, and we shouldn't have to care about the object's identities, only about their values. If I add these private properties on
Not that it should dictate your decision as a library maintainer, for sure.. but to be taken into consideration ? 😅
We did it on a static property in our case. I'll see what I can do over the weekend too. :) |
diff --git a/src/LocalDate.php b/src/LocalDate.php
index 49da41d..a3165fc 100644
--- a/src/LocalDate.php
+++ b/src/LocalDate.php
@@ -13,6 +13,7 @@ use DateTime;
use DateTimeImmutable;
use DateTimeInterface;
use JsonSerializable;
+use WeakMap;
use function abs;
use function intdiv;
@@ -63,6 +64,14 @@ final class LocalDate implements JsonSerializable
*/
private int $day;
+ /**
+ * @var WeakMap<self, string>
+ * @internal
+ *
+ * This property needs to be public, because it is initialized at the end of this file (outside the class)
+ */
+ public static $isoCache;
+
/**
* Private constructor. Use of() to obtain an instance.
*
@@ -762,7 +771,8 @@ final class LocalDate implements JsonSerializable
*/
public function __toString(): string
{
- return ($this->year < 1000 ? ($this->year < 0 ? '-' : '') . str_pad((string) abs($this->year), 4, '0', STR_PAD_LEFT) : $this->year)
+ return self::$isoCache[$this] ??=
+ ($this->year < 1000 ? ($this->year < 0 ? '-' : '') . str_pad((string) abs($this->year), 4, '0', STR_PAD_LEFT) : $this->year)
. '-'
. ($this->month < 10 ? '0' . $this->month : $this->month)
. '-'
@@ -790,3 +800,5 @@ final class LocalDate implements JsonSerializable
return $this->year * 12 + $this->month - 1;
}
}
+
+LocalDate::$isoCache = new WeakMap(); Then a benchmark PHP script calling
On a more "realistic" scenario (closer to our case at least), calling 5 million times takes (average of 5 runs) :
<?php
use Brick\DateTime\LocalDate;
use Brick\DateTime\Stopwatch;
require __DIR__ . '/../vendor/autoload.php';
$stopwatch = new Stopwatch();
$date = LocalDate::of(2023, 9, 8);
$stopwatch->start();
for ($i = 0; $i < 100_000_000; ++$i) {
(string) $date;
}
$stopwatch->stop();
var_dump((string) $stopwatch->getElapsedTime()); However, I don't think that's a very representative "real world" scenario.. because we're using exactly the same instance of the If I change the benchmark to do diff --git a/src/LocalDate.php b/src/LocalDate.php
index 907469a..e3e9bdf 100644
--- a/src/LocalDate.php
+++ b/src/LocalDate.php
@@ -47,6 +47,7 @@ final class LocalDate implements JsonSerializable
public const DAYS_PER_CYCLE = 146097;
public static $isoCache;
+ public static $instances = [];
/**
* The year.
@@ -92,7 +93,7 @@ final class LocalDate implements JsonSerializable
Field\MonthOfYear::check($month);
Field\DayOfMonth::check($day, $month, $year);
- return new LocalDate($year, $month, $day);
+ return self::$instances[$year . '-' . $month . '-' . $day] ??= new LocalDate($year, $month, $day);
}
/** Then the benchmarks to calling 5 million times
So the gain is not that great, and there's memory issues because there's no garbage collection on the instances array. |
I've also tried adding a static array as cache for the ISO values only : diff --git a/src/LocalDate.php b/src/LocalDate.php
index 49da41d..1e66486 100644
--- a/src/LocalDate.php
+++ b/src/LocalDate.php
@@ -63,6 +63,11 @@ final class LocalDate implements JsonSerializable
*/
private int $day;
+ /**
+ * @var array<string, string>
+ */
+ private static $isoCache = [];
+
/**
* Private constructor. Use of() to obtain an instance.
*
@@ -762,7 +767,8 @@ final class LocalDate implements JsonSerializable
*/
public function __toString(): string
{
- return ($this->year < 1000 ? ($this->year < 0 ? '-' : '') . str_pad((string) abs($this->year), 4, '0', STR_PAD_LEFT) : $this->year)
+ return self::$isoCache[$this->year . '-' . $this->month . '-' . $this->day] ??=
+ ($this->year < 1000 ? ($this->year < 0 ? '-' : '') . str_pad((string) abs($this->year), 4, '0', STR_PAD_LEFT) : $this->year)
. '-'
. ($this->month < 10 ? '0' . $this->month : $this->month)
. '-' 5 million calls to
It seems the differences between concatenating the properties to make the array key and concatenating the properties to make the ISO string is not great enough. So I tried again with 100 million calls, which takes (average of 5 runs) :
The difference starts being noticeable, but it's not amazing either. And this solution might not be optimal regarding memory, as there's no garbage collection of such strings in a static array. |
Conclusion : an hour and a half later, I still don't know how to add a cache to
Done 👍 Let met know if there's anything else I can do. |
2410d4e
to
98889c9
Compare
Wow, you're indeed relying a lot on this. Funnily enough, I once advised a coworker against using
OK, let's forget about this for now. I'm still not convinced about whether or not a local cache of computed values is acceptable (I would not consider breaking loose equality a breaking change, but maybe I can be convinced otherwise), but the lack of evidence of a real-world performance improvement (in addition to the extra memory consumed) makes it a no-go for now. your sentence is suspended 🙂 |
Our test suite is not directly relying on assertEquals(LocalDate $a, LocalDate $b), but indirectly by comparing objects that have private properties that are themselves LocalDate (or other similar) Value objects. Glad the bullet is dodged for now. ;) |
Shouldn't these objects have an |
They probably should. Never took the time to implement a complex method we did not needed, I guess. 🤷♂️ And then there's the matter of the inexistant standard for equal methods. |
497b11e
to
40b1dea
Compare
I pushed 2 commits adding
As usual, I've improved the test suite by adding data providers where there were none, and by using them between |
40b1dea
to
d47b51a
Compare
Could you please move this to a separate PR? This change is self-contained and bloats the current PR. We'll try to get it merged quickly so that you can rebase the current PR. Also, I think I'm not sold on
For these reasons, I think I would call it But we can continue this discussion in the other PR! |
d47b51a
to
98889c9
Compare
Done, cf. #87. PR is ready for merge IMHO. |
49c0099
to
539df9b
Compare
f4d83e8
to
8464379
Compare
8464379
to
bde7625
Compare
@BenMorel Rebased :) Also fixed a variable wrongly named |
@BenMorel LGTM 👍 I tend to preserve acronym case in naming, helps readability IMO. |
Hey @BenMorel ! Any chance we could get this merged and released someday soon ? 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gnutix Sorry for the delay, after these changes I think we'll be ready to merge!
bde7625
to
d404ce4
Compare
@BenMorel I've pushed all the requested changes (nice catches in there, thanks for the review! 🙏), except the code regarding the year's calculation, for which I'm not convinced (see my comment above). |
7550102
to
2726c58
Compare
…d YearWeek's __toString() methods. And slight improvement to Duration's __toString() method.
2726c58
to
8424509
Compare
@BenMorel There you go, I've applied the last change for the year's code, and squashed the commits. |
Hello @BenMorel,
Another performance PR.
__toString()
can be called a lot in our code, especially onLocalDate
andLocalTime
(throughLocalDateTime
and our ownLocalDateInterval
/LocalDateTimeInterval
classes). Like, this much :So I've spend a few hours doing lots of experimentation and profiling to find the fastest way to produce the ISO-8601 representation of these objects. The highlights of which are :
$this->year < 1000
)$this->month < 10 ? '0' . $this->month : $this->month
are about 30% faster thansubstr('0' . $this->month, -2)
(another idea we've tried with @BastienClement to replacesprintf
calls)I also checked other classes'
__toString()
methods, and optimized those ofMonthDay
,YearMonth
andYearWeek
(which have the same needs asLocalDate
).Finally, I've fixed some bugs infixed in #87YearMonth::__toString()
, which was doingsprintf('%02u', $this->year)
(which seems wrong).And added some more tests where I felt it was needed.
If you merge it, it would be great if you could make a minor release directly. 🙏