Skip to content

IBX-10186 Add limits to count and subtree queries #600

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 17 commits into
base: 4.6
Choose a base branch
from

Conversation

ryanolee
Copy link

@ryanolee ryanolee commented Jun 19, 2025

🎫 Issue IBX-10186

Warning

Please check ezsystems/ezplatform-kernel#413 for detailed reasons for this PR. Though the TLDR is count queries cause disproportionate impact on Ibexa Platform instances with large amounts of content (100,000+ content objects)

Related PRs:

Required by

ibexa/admin-ui#1605

Description:

This is a forward facing patch from issues present in Ibexa 3.3 (The version we are still using) to port the fixes and public API updates into 4.6.

Understandably considering this PR makes changes to Contracts it might need to go into 4.7 (If a release is planned beyond LTS) or v5 given implementing clients will break. I am not sure on the release cadence for contracts so this might be somewhat problematic.

Given Ibexa 4.6 has the same database schema as 3.3 (And makes the same DB Queries when performing counts) It is susceptible to the same issues present in 3.3 when querying large amounts of content.

For more details please read here where performance implications are described more clearly.
ezsystems/ezplatform-kernel#413

For QA:

Test using instructions given in ibexa/admin-ui#1605

Documentation:

Check ezsystems/ezplatform-kernel#413 for relevant details.

Notes

  • This has been manually tested and confirmed to work on Ibexa 3.3 (And forward ported to v4.6.21)
  • Tests Pass and have been updated with new test cases where they already existed

Copy link
Member

@adamwojs adamwojs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for valuable contribution @ryanolee! I've add some minor comments from my side and asked team to review PR.

Copy link
Contributor

@Steveb-p Steveb-p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense from the functional standpoint. I was considering how those COUNT queries impact larger databases, and I'm happy to research this approach. Thanks for showing me this :)

I've given a few remarks that I would have in the following comments.

In general, I would love to have this trait converted into a class, and unit tests added for it (which should be easier once it's not a trait).
For 3.3 version, since it's in "maintenance mode", it will probably not be added there. It's supposed to be security fixes only.

As for the API changes, we might consider taking Symfony's approach and adding additional argument via func_get_arg() + comment. Depends on how likely it is that those services are reimplemented. In majority of cases a decorator exists (afaik) that should be used, but I don't want to break someone's application regardless of our API contract.

We'll discuss it internally probably at the start of next week.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a fan of traits under certain circumstances. I somewhat prefer object composition approach unless there are valid reasons.

In this case, trait is used to prevent code duplication only, and it requires presence of connection property (which is an implicit assumption not visible outside of trait).

Since the method added is using protected visibility, it is immediately available to descendants. While not a big deal, I would personally prefer it to use private visibility (if left as-is, that is).

Additionally, since it is a trait, unit tests are more difficult to provide.

Overall, I'd suggest making it it's own class, and injecting it into relevant gateways via constructor. To facilitate usage of the correct Connection object, it could be passed as part of the method arguments.

WDYT?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit awkward given it is multiple inheritance of a shared bit of functionality. Though do agree it should probably be refactored into it's own thing. Will take a look at refactoring into something that can be unit tested and refactor. The connection was misued here as it can be pulled from the passed query builder so there is no implicit requirement on there being a connection present as seen with a few lines below where $queryBuilder->getConnection()->getDatabasePlatform()->getCountExpression('*') is used. But based on everything else that is a mute point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, right, now I've noticed it.

By the by, note that Platform::getCountExpression() is deprecated and removed in DBAL v3. We are now expected to use "COUNT(*)" directly, without asking platform for their variant. I assume that is because COUNT() is ANSI SQL, and there were never any platform differences to begin with.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that makes sense I am currently aiming to keep things as closeley aligned with the way ibexa currently generates it's queries internally. It is likely Platform::getCountExpression is better placed to be updated globally in a separate ticket considering that is what is already in use now and this is not a wide sweeping change targeting specifically that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is updated in 5.x. I'd only ask to not use it here, as it will make merge up simpler :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh that makes sense! Sorry saw all of the other references to it and got a bit confused. Updated as outlined here https://github.com/doctrine/dbal/blob/4.2.x/UPGRADE.md#deprecated-redundant-abstractplatform-methods

@ryanolee
Copy link
Author

ryanolee commented Jun 21, 2025

Makes sense from the functional standpoint. I was considering how those COUNT queries impact larger databases, and I'm happy to research this approach. Thanks for showing me this :)

I've given a few remarks that I would have in the following comments.

In general, I would love to have this trait converted into a class, and unit tests added for it (which should be easier once it's not a trait). For 3.3 version, since it's in "maintenance mode", it will probably not be added there. It's supposed to be security fixes only.

As for the API changes, we might consider taking Symfony's approach and adding additional argument via func_get_arg() + comment. Depends on how likely it is that those services are reimplemented. In majority of cases a decorator exists (afaik) that should be used, but I don't want to break someone's application regardless of our API contract.

We'll discuss it internally probably at the start of next week.

Thanks for the review 🤩

Just as a follow up for this the func_get_arg is somewhat problematic due to https://github.com/FriendsOfPHP/proxy-manager-lts dropping the extra arg in lazy loaded servies

See the following:
https://github.com/Ocramius/ProxyManager/blob/2.15.x/docs/lazy-loading-value-holder.md#known-limitations

In order to add BC break compatibility this PR might need pushing to main and the BC break be introduced to v5 where the major version change might allow for contracts I have confirmed the behavior where the value proxy drops the extra limit parameter.
image

The only other option I can think of is in 4.6 to have the admin UI ensure it uses instances of the Location and Content services that are either not lazy loaded or have back channels used specifically by the admin UI/ For instance directly call the underlying persistence handler for queries until v5 where the fix can be pulled up to the overall API . In either case this is really less than ideal 😓

It does work when the "lazy" option is removed from the inner service. But not making it lazy will likely have much more far widespread performance implications so likely needs to be weighed if the internal admin optimisations make sense with it. For now have removed the "lazy" from the location service to get this to work
image

Let me know what you think appropriate next steps are @adamwojs @Steveb-p as the Symfony BC break strategy will probably not work due reliably to the number of lazy loaded services in Ibexa core 🤔

The suggested changes as given above

  • Trait to -> DI Class
  • Symfony style contract changes have been made available
  • Unit tests have been added to new class
  • Tests pass and CS Runs

See: ibexa/admin-ui#1605 for the relevant places the queries have been changed

If you would like me to port this further to Ibexa version 5 please do considering the impact of these queries as shown in prior PR's can become significant.

@ryanolee ryanolee force-pushed the feature/IBX-10186 branch from 522d5b1 to 804f1a7 Compare June 21, 2025 01:24
@ryanolee ryanolee marked this pull request as ready for review June 21, 2025 01:35
@ryanolee
Copy link
Author

ryanolee commented Jun 21, 2025

Quality Gate Failed Quality Gate failed

Failed conditions 3.6% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

1 liners required for BC breaks. Would make no sense to extract into it's own function and based on the func_* function requirements nigh impossible. Seems to be complaining about a single function call also?

@ryanolee
Copy link
Author

All comments should be addressed 🙌

@adamwojs @Steveb-p
This should be ready for a re-review pending further discussion of BC break strategies for updating Contracts and the problems that come from it (especially surrounding problems with value proxies)

In line with this ibexa/admin-ui#1605 leverages these new changes to make the admin UI much faster for users with larger databases (Many sub items in commonly traveled paths)

Thanks for having a look an let me know if there is anything else you need from me with relation to this!

@Steveb-p
Copy link
Contributor

Steveb-p commented Jun 22, 2025

All comments should be addressed 🙌

@adamwojs @Steveb-p This should be ready for a re-review pending further discussion of BC break strategies for updating Contracts and the problems that come from it (especially surrounding problems with value proxies)

In line with this ibexa/admin-ui#1605 leverages these new changes to make the admin UI much faster for users with larger databases (Many sub items in commonly traveled paths)

Thanks for having a look an let me know if there is anything else you need from me with relation to this!

I was thinking how to deal with proxies being generated without the right amount of arguments, and there is technically an option of asking the proxy generator to generate a proxy against a specific interface.

And while the original interface does not contain $limit argument, we could create a new, forward facing one, which would extend the original and add the missing argument (without actually introducing said interface to implementations, nor intending to use it in 5.x - just for the purposes of lazy).

Weird and going against common sense maybe, but it would solve the functional issue of proxy generation.

Regardless of approach, I'm pretty sure we will also introduce a rector rule to allow this new concept of $limit on count to be introduced in all implementations of a specific interface. ArgumentAdder already exists in Rector, so all that will be needed is to configure the rule properly.

@ryanolee
Copy link
Author

Ah thanks for the feedback @Steveb-p that actually seems to work quite nicely.
I have stolen naming from the python's concept of [__future__](https://docs.python.org/3/library/__future__.html) given this seems to fill a similar role. Was looking for a way of controlling the underlying code-gen but could not not find one at a cursory glance. That all seems to be working now without any api bc breaks and lazy loaded value holders working as intended. Prefixed them with Future as to avoid polluting IDE intelisense with compatibility layers given no end user should really ever be aiming to use them. Or using then with caution and complete lack of guarantee if they do.
image

Code gen is also working as expected

 public function getSubtreeSize(\Ibexa\Contracts\Core\Repository\Values\Content\Location $location, ?int $limit = null) : int
    {
        $this->initializerdb498 && ($this->initializerdb498->__invoke($valueHolder69884, $this, 'getSubtreeSize', array('location' => $location, 'limit' => $limit), $this->initializerdb498) || 1) && $this->valueHolder69884 = $valueHolder69884;

        if ($this->valueHolder69884 === $returnValue = $this->valueHolder69884->getSubtreeSize($location, $limit)) {
            return $this;
        }

        return $returnValue;
    }

All working as expected. Thanks!

@ryanolee
Copy link
Author

ryanolee commented Jun 22, 2025

Ok should all be fixed up. Tests pass on PHP 7.4 and 8.1
PHP Stan really does not like the bc break stuff

Note: Using configuration file [PATH]/core/phpstan.neon.dist.
 1/1 [▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓] 100%

 ------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
  Line    LocationServiceTest.php                                                                                                                                                                                                     
 ------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
          Ignored error pattern #^Method Ibexa\\Tests\\Integration\\Core\\Repository\\LocationServiceTest\:\:testGetSubtreeSize\(\) should return Ibexa\\Contracts\\Core\\Repository\\Values\\Content\\Location but returns           
          Ibexa\\Contracts\\Core\\Repository\\Values\\Content\\Location\|null\.$# (return.type) in path /home/ryanolee/projects/personal/ibexa-new/core/tests/integration/Core/Repository/LocationServiceTest.php was not matched in  
          reported errors.                                                                                                                                                                                                            
          Ignored error pattern #^Parameter \#1 \$location of method Ibexa\\Contracts\\Core\\Repository\\LocationService\:\:getSubtreeSize\(\) expects Ibexa\\Contracts\\Core\\Repository\\Values\\Content\\Location,                 
          Ibexa\\Contracts\\Core\\Repository\\Values\\Content\\Location\|null given\.$# (argument.type) in path /home/ryanolee/projects/personal/ibexa-new/core/tests/integration/Core/Repository/LocationServiceTest.php was not     
          matched in reported errors.                                                                                                                                                                                                 
  :1757   Ignored error pattern #^Cannot access property \$id on Ibexa\\Contracts\\Core\\Repository\\Values\\Content\\Location\|null\.$# (property.nonObject) in path                                                                 
         [PATH]/core/tests/integration/Core/Repository/LocationServiceTest.php is expected to occur 11 times, but occurred only 9 times.                                                         
 ------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 


                                                                                                                        
 [ERROR] Found 3 errors                                                                                                 
                                                                                                                        
                                                                                      

Whittled down the errors to these. I don't have any IDE tooling for controlling ignore errors in the phpstan.dist.neon. Looks like you might need a php stan pro licence to properly configure it?

In either case spammed a bunch of @php-stan-ignore *'s in the relevant places and fixed the rest. If someone has better tooling to update the phpstan.dist.neon it would be much appreciated if they could make the relevant changes 🙏 .

Copy link

Quality Gate Failed Quality Gate failed

Failed conditions
9.4% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

@ryanolee
Copy link
Author

Fixed remaining issues beyond the aforementioned @Steveb-p @adamwojs "php stan ignore" Issue mentioned above. Let me know if there is anything else you need from me for either PR 🙌

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants