-
Notifications
You must be signed in to change notification settings - Fork 7.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement acyclic object tracking #17130
base: master
Are you sure you want to change the base?
Conversation
1bd8088
to
72c581b
Compare
Acyclic objects don't need to be added to the GC. We know an object is acyclic if all properties are typed as acyclic, and the object does not have any dynamic properties. This is possible to track at runtime with minimal overhead. Fixes phpGH-17127
72c581b
to
f6e49bb
Compare
I like the idea, but a few things to mention:
|
Yeah, I had the same thought. Could be considered still.
Unfortunately, object handlers are set directly by extensions after calling
They are with this PR, given that all properties are acyclic. enum E {
case C;
}
var_dump(new ReflectionEnum(E::class)->mayBeCyclic());
|
@iluuu1994 Out of curiosity (not related to the PR): I noticed you didn't touch the GC at all, which suggests to me that the functionality to respect this flag is already present. If I were to, say, make an extension which allows setting GC_NOT_COLLECTABLE on certain objects, would that work within currently supported versions of PHP? I'm looking for a way to get around the performance impacts described in #17131. |
@dktapps Yes, you are correct. We may also think about an attribute where users can explicitly opt classes into |
Removing acyclic objects from the root buffer turned out about 20 ns per object in my testing. But the impact of complex long-lived objects landing in the GC root buffer was orders of magnitude larger. I think we'd probably get a lot more mileage out of #17131, although ignoring acyclic objects will definitely help too. |
There may be some more benefits to get out of this flag. E.g. we may skip over acyclic objects in the GC itself, which I currently have not done. So, even if some root object is added to the GC, it may at least skip over the references to acyclic objects it holds. Note that our GC implicitly implements something similar as generations. After every GC run, the buffer is empty. Objects are only added to the buffer when they are released, i.e. they are stored in some new variable and then unset. This effectively does the same thing, the object is only considered for cycle collection when it has been worked with. |
An object that survived collection will still end up in the root buffer over and over again & waste CPU time if it's frequently ref'd & unref'd (basically, objects are always treated as if they're generation 0). I've found this to cause major performance impacts in my own code. |
Do I understand it right that this cyclic-check is pretty trivial and avoids using knowledge of already declared types or such (i.e. if the type specified by the typed property is known to not be cyclic, the object will also not be cyclic)? |
On a more technical note - can we simply assume acyclic for internal classes if their declared properties are all scalar and it does not have a get_gc handler? Rather than explicitly specifying cyclic (which, given the opt-out nature, is prone to be missed). |
Yes, this is pretty much it. Edit: I just re-read your sentence and missed the word "avoids", now I'm lost what you mean. :)
No. This PR adds the
Why would final classes matter? This is operating on specific instances of one class, if this class is inferred to be acyclic, the only way for it not to be is to add dynamic properties, regardless of any potential subclasses.
As mentioned above, this was my first thought. However, since object handlers are set by extensions after calling register_class_xyz, there is no obvious hook for this. We could loop all internal classes after MINIT I suppose. We may also miss some internal classes that actually acyclic even though they set |
If get_gc is customized, assume cyclic classes for now.
Don't rely on the cycle collector for objects released in destructor.
Yep. Without having GC pay attention to this flag, this change mostly only helps ephemeral objects in local variables and tmpvars. I made an extension to allow me to explicitly mark objects as not collectable (to try and workaround both #17131 and #17127), but it was mostly unsuccessful because indirectly referenced object still seem to get their dependencies scanned by the GC whether they are marked as GC_NOT_COLLECTABLE or not. I think the GC really needs to be modified to account for this, otherwise this improvement will be largely ineffective. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this. This is probably the most beneficial optimization we could implement on the current GC.
Zend/zend_object_handlers.c
Outdated
@@ -70,6 +70,8 @@ ZEND_API HashTable *rebuild_object_properties_internal(zend_object *zobj) /* {{{ | |||
zend_class_entry *ce = zobj->ce; | |||
int i; | |||
|
|||
GC_TYPE_INFO(zobj) &= ~(GC_NOT_COLLECTABLE << GC_FLAGS_SHIFT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would break badly if this changed during GC, if the GC used this to decide when to traverse nodes. I suggest to add an assertion here to check that GC is not running.
/* Mark classes with custom get_gc handler as potentially cyclic, even if | ||
* their properties don't indicate so. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also mark classes with custom get_properties
handler as potentially cyclic, as this is called by zend_std_get_gc
.
Otherwise this looks right to me, as objects with std get_gc
and get_properties
can not expose anything other than standard properties to the GC.
Lazy objects need to take care of updating GC_NOT_COLLECTABLE
: Remove GC_NOT_COLLECTABLE
in zend_object_make_lazy()
when the initializer is cyclic or may have a ref to the object itself, and add it again in zend_lazy_object_init()
.
/* FIXME: Potentially infer ZEND_ACC_MAY_BE_CYCLIC during construction of | ||
* closure? static closures not binding by references can't be cyclic. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 this looks worth it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid my comment was incorrect.
class Test {
public \Closure $c;
}
$test = new Test();
$c = static function () use ($test) {}; // acyclic
$test->c = $c; // now it's cyclic
unset($test);
gc_collect_cycles();
unset($c); // leaks
This is only ok if $test
itself is acyclic. As mentioned previously, this check would currently be unsound because the object may become cyclic at a later point in time by adding dynamic properties. However, once we're on PHP 9, it should be ok!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth mentioning: readonly
classes don't permit dynamic properties. So readonly classes could benefit from this optimization immediately without waiting for PHP 9.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, your test case doesn't show any dynamic property usage @iluuu1994, wouldn't that still be an issue in non-readonly cases even in PHP 9.0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth mentioning: readonly classes don't permit dynamic properties. So readonly classes could benefit from this optimization immediately without waiting for PHP 9.0.
I was referring to:
This is only ok if $test itself is acyclic. As mentioned previously, this check would currently be unsound
I.e. it's safe to assume Closure
is acyclic only if it captures values that aren't themselves acyclic. If Test
may point to cyclic values, as Closure
itself, then it won't be marked as acyclic. But today, even if Test
were inferrably acyclic, this is not guaranteed to remain this way due to dynamic properties. Hence, we need to be more careful about relations between objects.
Actually, your test case doesn't show any dynamic property usage @iluuu1994, wouldn't that still be an issue in non-readonly cases even in PHP 9.0?
Yes, but it's not a priority since it's a rare case and unlikely to make a big difference.
@@ -432,6 +432,8 @@ public function getNamespaceName(): string {} | |||
public function getShortName(): string {} | |||
|
|||
public function getAttributes(?string $name = null, int $flags = 0): array {} | |||
|
|||
public function mayBeCyclic(): bool {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for exposing this. This may be beneficial for understanding the GC behaviour as PHP user.
I mean that you're not actually looking at the type of referenced properties. |
Ah ok. Yes, atm we're not looking at class types, only scalar types (excluding arrays) are considered acyclic. This would be possible, but only in PHP 9, because then dynamic properties cannot be added to objects at runtime anymore, which can implicitly make some other object it references a root. It's also going to be ineffective without cross-file analysis. |
I see this statement from time to time, but IIUC it will still be allowed with |
Yes, but if we know about the class at comp-time, we know whether this attribute is present too. |
* Actually skip non-cyclic objects in GC. * Add MAY_BE_CYCLIC to more internal classes * Fix dynamic property creation trigger And also breaking phpGH-10932 for now. :) I will try to fix this later.
Adding strings to the worklist is useless, because they never contribute to cycles. The assembly size on x86_64 does not change. This significantly improves performance in this synthetic benchmark by 33%. function test($a) {} $a = new stdClass(); $a->self = $a; $a->prop1 = str_repeat('a', 10); $a->prop2 = str_repeat('a', 10); $a->prop3 = str_repeat('a', 10); $a->prop4 = str_repeat('a', 10); $a->prop5 = str_repeat('a', 10); $a->prop6 = str_repeat('a', 10); $a->prop7 = str_repeat('a', 10); $a->prop8 = str_repeat('a', 10); $a->prop9 = str_repeat('a', 10); $a->prop10 = str_repeat('a', 10); for ($i = 0; $i < 10_000_000; $i++) { test($a); gc_collect_cycles(); } This requires adding IS_TYPE_COLLECTABLE to IS_REFERENCE_EX to ensure these values continue to be pushed onto the stack. Luckily, IS_TYPE_COLLECTABLE is currently only used in gc_check_possible_root(), where the checked value cannot be a reference. Note that this changes the output of gc_collect_cycles(). Non-cyclic, refcounted values no longer count towards the total reported values collected. Also, there is some obvious overlap with GH-17130. This change should be good nonetheless, especially if we can remove the GC_COLLECTABLE(Z_COUNTED_P(zv)) condition in PHP 9 and rely on Z_COLLECTABLE_P() exclusively, given we can assume an object doesn't become cyclic at runtime anymore. Closes GH-17194
Acyclic objects don't need to be added to the GC. We know an object is acyclic if all properties are typed as acyclic, and the object does not have any dynamic properties. This is possible to track at runtime with minimal overhead.
Fixes GH-17127