Improve performance by disabling return/resolved values validation #1493

Warxcell · 2023-12-07T22:38:58Z

What do you think about this one? The use-case is to gain a little bit of performance, by disabling resolver's returned value validation - and just trusting it.

No idea how much will gain - just throwing it in the wild and see what people think. :)

spawnia · 2023-12-11T09:26:57Z

My instinct says that the additional check for whichever setting controls this behaviour might eat up some of the benefits afforded by it. Making this as fast as possible would require having a code path that does no checks at all, but also does not checks if that behaviour is enabled. This would require duplicating/rewriting substantial parts of the executor, as well as possibly swapping the default field resolver and other parts that have sanity/safety checks.

Feel free to explore this in a pull request, along with benchmarks to test the effectiveness. You can also use a custom executor in your own project, see

graphql-php/src/Executor/Executor.php

Line 78 in 3ae4185

    
           public static function setImplementationFactory(callable $implementationFactory): void

. I don't think this is going to be a worthwhile endeavours and am thus closing this issue, but I am also open to being convinced otherwise with working code and the benchmarks to prove it.

mfn · 2023-12-13T10:36:23Z

FTR I would be interested in this.

Once payloads returned exceed a significant size (in the multi-mega-byte range), I've seen that when profiling (used xdebug & PhpStorm xdebug snapshot analyzer) the micro-overhead for a single field becomes measurable in the big picture to the point it starts to rival the slowest part in those requests, usually the database.

spawnia · 2023-12-13T11:03:21Z

FTR?

mfn · 2023-12-13T11:06:36Z

For The Record

Warxcell · 2023-12-15T08:07:19Z

btw https://github.com/zalando-incubator/graphql-jit have thing called

disableLeafSerialization {boolean, default: false} - disables leaf node serializers. The serializers validate the content of the field at runtime so this option should only be set to true if there are strong assurances that the values are valid.

I will try to make test in few days to see what's the gain.

@mfn can you point me where most execution time comes from?

mfn · 2023-12-15T10:15:25Z

I can't share the xdebug cachegrind output but I can leave a screenshot 😅 Hope it's useful:

Of course Laravel / Eloquent also appears a LOT there.

But if you look at own time vs. amount of calls, I think even small optimizations can pay off (ofc. also in non-GraphQL parts)

The json payload here generated (uncompressed) was about 3.8MB

Although it would be quite involved, I would be able to re-run code on similar complex results and compare them, if we've something to test.

Warxcell · 2023-12-15T18:33:57Z

@spawnia another thing that could improve performance is skipping

            $validationErrors = DocumentValidator::validate($schema, $documentNode, $validationRules);

            if ($validationErrors !== []) {
                return $promiseAdapter->createFulfilled(
                    new ExecutionResult(null, $validationErrors)
                );
            }

in case if document is cached. If document is cached - it was already validated somewhere in the past.

Warxcell · 2023-12-15T19:38:09Z

before

after:

basically 100-200ms improvement on one specific query on our side.

 $cacheItem = $this->queryCache->getItem(md5($op->query));

            if ($cacheItem->isHit()) {
                $documentNode = AST::fromArray($cacheItem->get());
            } else {
                $documentNode = Parser::parse($op->query);

                $queryComplexity = DocumentValidator::getRule(QueryComplexity::class);
                assert(
                    $queryComplexity instanceof QueryComplexity,
                    'should not register a different rule for QueryComplexity'
                );

                $queryComplexity->setRawVariableValues($op->variables);


                $validationErrors = DocumentValidator::validate($this->schema, $documentNode);

                if ($validationErrors !== []) {
                    return $promiseAdapter->createFulfilled(
                        new ExecutionResult(null, $validationErrors)
                    );
                } else {
                    $cacheItem->set(AST::toArray($documentNode));
                    $this->queryCache->save($cacheItem);
                }
            }

But i'm not sure how to incorporate this into here. I basically copied StandardServer, Helper and Grapql::promiseToExecute into my bundle in order to achieve that.

https://github.com/Warxcell/graphql-bundle/blob/main/src/Controller/GraphQL.php#L231-L257

Any advice how to proceed with this one?

Warxcell · 2023-12-15T21:07:07Z

I can't share the xdebug cachegrind output but I can leave a screenshot 😅 Hope it's useful:

Of course Laravel / Eloquent also appears a LOT there.

But if you look at own time vs. amount of calls, I think even small optimizations can pay off (ofc. also in non-GraphQL parts)

The json payload here generated (uncompressed) was about 3.8MB

Although it would be quite involved, I would be able to re-run code on similar complex results and compare them, if we've something to test.

I've tested with blackfire, and from 300ms total response time, we have 80ms lost in is_iterable checks (173 times called).

@spawnia what about using assert(is_iterable()) (it won't be executed on production, hence the optimization) ?

SebastienTainon · 2024-03-19T17:15:09Z

The performance of an API is a key point. Any small improvement in the response time would be very interesting! (and these benchmarks look promising)

Warxcell · 2024-03-19T18:56:43Z

The performance of an API is a key point. Any small improvement in the response time would be very interesting! (and these benchmarks look promising)

you can achieve same results by copy-paste-modify StandardServer.php

You can see what I've done here: https://github.com/Warxcell/graphql-bundle/blob/main/src/Controller/GraphQL.php ( think https://github.com/Warxcell/graphql-bundle/blob/main/src/Controller/GraphQL.php#L254-L280 is the most important part - it skips parsing && validation of query if its cached)

The question is - Can we incorporate that directly into this package?

spawnia · 2024-03-20T08:17:59Z

@spawnia what about using assert(is_iterable()) (it won't be executed on production, hence the optimization) ?

I don't want to change the default behaviour of this library in production settings. I very much appreciate how it prevents bad data from ever reaching clients, since we can be fearless and not use defensive programming there.

Warxcell · 2024-06-27T06:57:36Z

@spawnia what about using assert(is_iterable()) (it won't be executed on production, hence the optimization) ?

I don't want to change the default behaviour of this library in production settings. I very much appreciate how it prevents bad data from ever reaching clients, since we can be fearless and not use defensive programming there.

The thing is, one may use phpstan/psalm for resolvers - and thus does not need to validate data at runtime.

bountyhub-bot · 2024-09-10T20:26:02Z

🚀 Bounty Alert!

💰 A bounty of $15.00 has been created by omarsoufiane on BountyHub.

🔗 Claim this bounty by submitting a pull request that solves the issue!

Good luck, and happy coding! 💻

chojnicki · 2024-10-02T08:28:52Z

@Warxcell did you examined this issue/idea any further?

I was debugging slow API responses with xdebug and came to same conclusions - a lot of time it taken by data validation. I do not need this feature that much, especially that it takes a lot of time to execute. So I was looking for some option to just disable it in lighthouse or graphql-php, but was suprised surprised that there is not such thing. I do know that such validation is Graphql advantage, but it should be optional in my opinion. In many cases my data typing is validating by PHP itself by build in php8 typing. If some of my functions will return for example "string" instead of "int" - php will crash, so I do not need graphql-php check it for me again.

Also ideally it could be enabled on dev/test environment, and disabled on production. If my query is working and validating fine on local and while testing, i do not see the point of checking it again on production.

I don't want to change the default behaviour of this library in production settings. I very much appreciate how it prevents bad data from ever reaching clients, since we can be fearless and not use defensive programming there.

So again, it's very valid point, default should be as it is right now, but it could be configurable, like in that JS implementation.

@spawnia what do you think?

Warxcell · 2024-10-02T09:23:07Z

I did following patch

diff -uraBN graphql-php-og/src/Executor/ReferenceExecutor.php graphql-php/src/Executor/ReferenceExecutor.php
--- graphql-php-og/src/Executor/ReferenceExecutor.php	2024-06-17 00:30:41.773638085 +0300
+++ graphql-php/src/Executor/ReferenceExecutor.php	2024-06-17 00:31:02.825636896 +0300
@@ -849,9 +849,6 @@
                 $result,
                 $contextValue
             );
-            if ($completed === null) {
-                throw new InvariantViolation("Cannot return null for non-nullable field \"{$info->parentType}.{$info->fieldName}\".");
-            }
 
             return $completed;
         }
@@ -862,12 +859,6 @@
 
         // If field type is List, complete each item in the list with the inner type
         if ($returnType instanceof ListOfType) {
-            if (! \is_iterable($result)) {
-                $resultType = \gettype($result);
-
-                throw new InvariantViolation("Expected field {$info->parentType}.{$info->fieldName} to return iterable, but got: {$resultType}.");
-            }
-
             return $this->completeListValue($returnType, $fieldNodes, $info, $path, $result, $contextValue);
         }

➕ what I've mention above. (Query parsed, validated - then if valid - cached, so next time if query is found in cache - its not validated - because only validated queries are cached). (More info here: #1493 (comment))

➕ there is one more optimization, pending to be merged. #1587 This one caches argument resolution. So it will be very beneficial if you have list which may have some fields with many arguments.

Like mention above: for specific query I got 100-200ms improvement with these changes.

chojnicki · 2024-10-02T15:56:54Z

I did following patch

Tried that already, commenting also other asserts, with 0ms difference in my case.
Actually, replacing if ($completed === null) { additional if for checking config will not make it any faster, since comparing nulls is extra simple and fast. Like @spawnia said

additional check for whichever setting controls this behaviour might eat up some of the benefits afforded by it.

completeValue still takes a lot of time. So I guess I was wrong, validation is not a reason (in my case). My idea was that it loops over everything in result BECAUSE it needs to validate things, but it has to do it anyway because that's how graphql-php works. So even after disabling result validation, if I'm not wrong, method calls (visible here #1493 (comment)), will not change. So more complicated and nested json result means slower response (opposite to REST). In xdebug i can see that it stays most of the time in graphql-php trying to just return data. That's disappointing.

➕ what I've mention above. (Query parsed, validated - then if valid - cached, so next time if query is found in cache - its not validated - because only validated queries are cached).

Good idea but it is not necessary in my case. Lighthouse (that is using graphql-php) already does that out of the box. I checked source and tested it - query AST is indeed cached as array, exactly like in your example. So parsing should be fast.

In your case @Warxcell if I understand correctly, only this query caching did the real difference, but it's outside of this issue scope, because you changed query part of request, on return part, like described in first post.

I'm not sure what is going on in PR you provided (but I will for sure test it after release), but I wonder if is there anything that could be done to speed up data return.

I'm out of ideas for now.

Warxcell · 2024-10-03T05:14:42Z

What about is_iterable checks? They were taking 40ms on my side.

Also, can you share some graphs with calling time, etc?

chojnicki · 2024-10-03T16:36:32Z

Also, can you share some graphs with calling time, etc?

I wanted to but today I cannot reproduce situation where completeValue and methods around were marked as most time consuming. Right now it points to lighthouse methods and I should not post it here then since it's not that repository. Will try to investigate it further.

About is_iterable:

So even that it was called a lot of times, according to profile it took close to nothing. I tried to make different queries, but results were similar. It's interesting that it took 40ms in your case. I cannot see it in graph - do you know to many times it was called?

But looking in vendor for more is_iterable I found symfony/string with very interesting comment.

https://github.com/symfony/string/blob/7.1/AbstractString.php#L547C102-L547C103

so there might be something to it.

Maybe I would get different results with blackfire that you used. But being forced to pay "monthly" subscription up front for whole year is just ripping off that I will not support such bad business practice with my money.

spawnia closed this as not planned Won't fix, can't repro, duplicate, stale Dec 11, 2023

spawnia added the enhancement label Dec 11, 2023

spawnia reopened this Dec 15, 2023

spawnia changed the title ~~Feature request: disable return/resolved values validation~~ Improve performance by disabling return/resolved values validation Dec 15, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve performance by disabling return/resolved values validation #1493

Improve performance by disabling return/resolved values validation #1493

Warxcell commented Dec 7, 2023 •

edited

Loading

spawnia commented Dec 11, 2023

mfn commented Dec 13, 2023

spawnia commented Dec 13, 2023

mfn commented Dec 13, 2023

Warxcell commented Dec 15, 2023 •

edited

Loading

mfn commented Dec 15, 2023

Warxcell commented Dec 15, 2023

Warxcell commented Dec 15, 2023 •

edited

Loading

Warxcell commented Dec 15, 2023 •

edited

Loading

SebastienTainon commented Mar 19, 2024 •

edited

Loading

Warxcell commented Mar 19, 2024 •

edited

Loading

spawnia commented Mar 20, 2024

Warxcell commented Jun 27, 2024

bountyhub-bot commented Sep 10, 2024

chojnicki commented Oct 2, 2024 •

edited

Loading

Warxcell commented Oct 2, 2024 •

edited

Loading

chojnicki commented Oct 2, 2024 •

edited

Loading

Warxcell commented Oct 3, 2024

chojnicki commented Oct 3, 2024 •

edited

Loading

Improve performance by disabling return/resolved values validation #1493

Improve performance by disabling return/resolved values validation #1493

Comments

Warxcell commented Dec 7, 2023 • edited Loading

spawnia commented Dec 11, 2023

mfn commented Dec 13, 2023

spawnia commented Dec 13, 2023

mfn commented Dec 13, 2023

Warxcell commented Dec 15, 2023 • edited Loading

mfn commented Dec 15, 2023

Warxcell commented Dec 15, 2023

Warxcell commented Dec 15, 2023 • edited Loading

Warxcell commented Dec 15, 2023 • edited Loading

SebastienTainon commented Mar 19, 2024 • edited Loading

Warxcell commented Mar 19, 2024 • edited Loading

spawnia commented Mar 20, 2024

Warxcell commented Jun 27, 2024

bountyhub-bot commented Sep 10, 2024

🚀 Bounty Alert!

chojnicki commented Oct 2, 2024 • edited Loading

Warxcell commented Oct 2, 2024 • edited Loading

chojnicki commented Oct 2, 2024 • edited Loading

Warxcell commented Oct 3, 2024

chojnicki commented Oct 3, 2024 • edited Loading

Warxcell commented Dec 7, 2023 •

edited

Loading

Warxcell commented Dec 15, 2023 •

edited

Loading

Warxcell commented Dec 15, 2023 •

edited

Loading

Warxcell commented Dec 15, 2023 •

edited

Loading

SebastienTainon commented Mar 19, 2024 •

edited

Loading

Warxcell commented Mar 19, 2024 •

edited

Loading

chojnicki commented Oct 2, 2024 •

edited

Loading

Warxcell commented Oct 2, 2024 •

edited

Loading

chojnicki commented Oct 2, 2024 •

edited

Loading

chojnicki commented Oct 3, 2024 •

edited

Loading