Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Objects which can't possibly be GC roots take up slots in GC root buffer #17127

Open
dktapps opened this issue Dec 12, 2024 · 6 comments · May be fixed by #17130
Open

Objects which can't possibly be GC roots take up slots in GC root buffer #17127

dktapps opened this issue Dec 12, 2024 · 6 comments · May be fixed by #17130

Comments

@dktapps
Copy link
Contributor

dktapps commented Dec 12, 2024

Description

This has mystified me for quite some time.

In PocketMine-MP, we process and cache many simple, acyclic objects like Vector3. When these objects are unref'd (but not destroyed, as they're cached), this causes the object to land in the GC root buffer, despite the object having 0 GC-able children. This causes a couple of problems:

  1. GC (initially) runs more frequently because the root buffer fills up with non-cyclic objects
  2. GC runs get less frequent (because the GC threshold increases), but pause for longer, because GC time appears to scale with root buffer size, not the number of cycles.

Since these objects are clearly acyclic, it's just wasting CPU time.

My question in a nutshell

Is there a technical reason why objects with no collectable children are included in the GC root buffer? Any object without array or object properties should be able to be ignored, shouldn't it?

Demo code

Here's a test script to demonstrate what I mean.

The V3 class in this sample clearly cannot contain any reference cycles as it only references float. You can see that the GC time scales according to the size of the root buffer full of these acyclic objects.

final class V3{
    public function __construct(
        public float $x,
        public float $y,
        public float $z
    ){}
}

$array = [];

$gcTime = 0;
while(true){
    $lastCollectorTime = gc_status()["collector_time"];
    $runs = gc_status()["runs"];
    foreach($array as $v3){
        //object goes back into gc root buffer
    }
    for($i = 0; $i < 10001; $i++){
        $array[] = new V3(0, 0, 0);
    }

    $newRuns = gc_status()["runs"];
    if($newRuns > $runs){
        $gcTime = (gc_status()["collector_time"] - $lastCollectorTime) * 1_000_000_000;
        echo "Auto GC has now run $newRuns times, last run took " . number_format($gcTime) . "ns (" . ($gcTime / count($array)) . " ns/object) with " . count($array) . " acyclic objects\n";
    }

    sleep(1);
}

Example output:

Auto GC has now run 1 times, last run took 156,700ns (15.668433156684 ns/object) with 10001 acyclic objects
Auto GC has now run 2 times, last run took 352,400ns (17.618238176182 ns/object) with 20002 acyclic objects
Auto GC has now run 3 times, last run took 507,800ns (16.92497416925 ns/object) with 30003 acyclic objects
Auto GC has now run 4 times, last run took 786,400ns (19.65803419658 ns/object) with 40004 acyclic objects
Auto GC has now run 5 times, last run took 1,265,900ns (25.315468453155 ns/object) with 50005 acyclic objects
Auto GC has now run 6 times, last run took 1,161,700ns (19.359730693597 ns/object) with 60006 acyclic objects
Auto GC has now run 7 times, last run took 1,483,800ns (21.195023354807 ns/object) with 70007 acyclic objects
Auto GC has now run 8 times, last run took 1,481,700ns (18.519398060194 ns/object) with 80008 acyclic objects
Auto GC has now run 9 times, last run took 1,826,400ns (20.291304202913 ns/object) with 90009 acyclic objects
Auto GC has now run 10 times, last run took 2,648,300ns (26.480351964804 ns/object) with 100010 acyclic objects
Auto GC has now run 11 times, last run took 2,453,100ns (22.298679222987 ns/object) with 110011 acyclic objects
Auto GC has now run 12 times, last run took 3,073,000ns (25.605772756058 ns/object) with 120012 acyclic objects
Auto GC has now run 13 times, last run took 3,423,200ns (26.329674724835 ns/object) with 130013 acyclic objects
Auto GC has now run 14 times, last run took 3,467,300ns (24.763952176211 ns/object) with 140014 acyclic objects
Auto GC has now run 15 times, last run took 3,573,900ns (23.823617638236 ns/object) with 150015 acyclic objects
Auto GC has now run 16 times, last run took 2,809,800ns (17.559494050595 ns/object) with 160016 acyclic objects
Auto GC has now run 17 times, last run took 3,289,700ns (19.349241546434 ns/object) with 170017 acyclic objects
Auto GC has now run 18 times, last run took 3,098,500ns (17.212167672122 ns/object) with 180018 acyclic objects

PHP Version

8.3.13

Operating System

Windows

@iluuu1994
Copy link
Member

Related. #9979. As mentioned in that PR, once PHP 9 inhibits the addition of dynamic objects, it becomes possible to not to objects with exclusively non-recursively typed properties to the GC buffer. It might also be possible to achieve something similar by appending such objects to the buffer only once any dynamic properties are added to them.

Another note on the PR above: After some time I realized the approach is unsound. The engine adds objects to the buffer on DELREF under the assumption that one of the nodes in the graph is eventually no longer referenced from the outside world (e.g. stack, static var, etc.). Selectively excluding nodes from the buffer is problematic then when they are the actual root nodes, i.e. the ones originally referenced from the outside, as they are the only ones guaranteed to be DELREFed. Anyway, not too relevant for this particular issue.

@iluuu1994 iluuu1994 self-assigned this Dec 12, 2024
@dktapps
Copy link
Contributor Author

dktapps commented Dec 12, 2024

it becomes possible to not to objects with exclusively non-recursively typed properties to the GC buffer

Not quite sure I understand why typed properties are important. Isn't it possible to just set a flag on the object during write_property if the written zval may have cycles?

@iluuu1994
Copy link
Member

@dktapps Yes, but writing to objects without releasing them is also very common, so it might actually achieve the opposite effect, we don't want to handle this logic in fast paths.

@iluuu1994
Copy link
Member

iluuu1994 commented Dec 12, 2024

Quick test on internal extensions:

.eeeeeeeeeeee..e......e...e.....eeeeeeeeee.e....ee..eeeeeeeeee......eeeeeeeeeeeeeeeeeeeeeeeee...eeeeeeeeeeeeee......................................e.....e.e.eeee....e.e........................e.........

Each . indicates a class that can be inferred to be acyclic, and never has to be added to the GC buffer (as long as it doesn't have dynamic properties), e is potentially cyclic. Edit: Actually, just noticed the check is incomplete, we also need to mark objects with a custom get_gc handler as potentially cyclic.

I'll check Symfony Demo now.

@dktapps
Copy link
Contributor Author

dktapps commented Dec 12, 2024

Also worth mentioning: gc_adjust_threshold() doesn't account for the removal of acyclic objects from the GC root buffer during collections, which causes the threshold to keep increasing if only working with acyclic objects. This code should probably look at the difference in root counts, instead of the number of cycles collected:

if (count < GC_THRESHOLD_TRIGGER || GC_G(num_roots) >= GC_G(gc_threshold)) {

called from here:

gc_adjust_threshold(gc_collect_cycles());

However this will have a performance impact for code where acyclic objects are still flooding the buffer. In PocketMine's case in particular, a 10k GC threshold has a noticeable impact if the threshold adjustment accounts for acyclic objects. 100k works a lot better.

Fixing this would probably still be a good idea just to deal with objects with acyclic dynamic properties.

iluuu1994 added a commit to iluuu1994/php-src that referenced this issue Dec 12, 2024
Acyclic objects don't need to be added to the GC. We know an object is acyclic
if all properties are typed as acyclic, and the object does not have any dynamic
properties. This is possible to track at runtime with minimal overhead.

Fixes phpGH-17127
@iluuu1994 iluuu1994 linked a pull request Dec 12, 2024 that will close this issue
@iluuu1994
Copy link
Member

@dktapps PoC here: #17130 It runs your script without triggering the GC.

iluuu1994 added a commit to iluuu1994/php-src that referenced this issue Dec 12, 2024
Acyclic objects don't need to be added to the GC. We know an object is acyclic
if all properties are typed as acyclic, and the object does not have any dynamic
properties. This is possible to track at runtime with minimal overhead.

Fixes phpGH-17127
iluuu1994 added a commit to iluuu1994/php-src that referenced this issue Dec 12, 2024
Acyclic objects don't need to be added to the GC. We know an object is acyclic
if all properties are typed as acyclic, and the object does not have any dynamic
properties. This is possible to track at runtime with minimal overhead.

Fixes phpGH-17127
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants