Optimization - Skip processing non-existing fields in FieldsWillMerge #5125
Context
It was observed that if a query includes thousands of non-existing fields, a lot of processing time is spent in FieldsWillMerge#find_conflicts_within.
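The cost is quadratic in the number of duplicated selections: every pair of fields sharing a response key is compared. The sketch below is a simplified stdlib-only model of that pairwise check (the method name count_conflict_checks is illustrative, not a graphql-ruby internal), just to show how the comparison count grows.

```ruby
# Simplified model of the pairwise conflict check performed by the
# FieldsWillMerge rule: every pair of selections with the same
# response key is compared, i.e. n*(n-1)/2 comparisons for n
# duplicated fields. Illustrative only, not the library's code.
def count_conflict_checks(fields)
  checks = 0
  fields.combination(2).each { checks += 1 }
  checks
end

# A query repeating the same field 2000 times triggers
# 2000 * 1999 / 2 = 1_999_000 pairwise checks.
count_conflict_checks(Array.new(2000, "a")) # => 1999000
```

Even though each individual #find_conflict call is cheap, running it ~2 million times accounts for the multi-second validation time reported below.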
What
The iteration loop performs n(n-1)/2 comparisons, where n is the number of duplicated fields being queried, and the underlying call to #find_conflict uses considerable time. For example, a query repeating the same non-existing field 2000 times, i.e. query { a a ... 2000 times }, takes ~6000 ms to process; skipping these fields reduces the response time to ~500 ms.
Optimization
As far as I can tell, skipping non-existing fields in the FieldsWillMerge rule doesn't alter the business logic. The only caveat of this optimization is that if a service using this library was relying on Schema#validate_timeout to safeguard against a potentially malicious query, it may no longer reach the configured timeout, and instead return multiple undefinedField errors, which can be mitigated by Schema#validate_max_errors.
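The idea of the optimization can be sketched as follows: filter out selections whose field is not defined on the parent type before running the quadratic conflict check. This is a hedged, simplified model, not the actual patch; the method and parameter names (fields_to_check, defined_fields, selections) are illustrative, and in the real rule the undefined fields are still reported separately by the undefinedField validation.

```ruby
# Hedged sketch of the optimization: drop selections that don't
# resolve to a field on the parent type, so the expensive pairwise
# conflict check only runs over fields that actually exist.
# Names here are illustrative, not graphql-ruby internals.
def fields_to_check(defined_fields, selections)
  selections.select { |name| defined_fields.include?(name) }
end

defined = %w[name id]
# 2000 copies of a non-existing field plus two real duplicates:
selections = Array.new(2000, "a") + %w[name name]

fields_to_check(defined, selections) # => ["name", "name"]
```

With the non-existing fields skipped, the conflict check compares one pair instead of roughly two million, which matches the ~6000 ms to ~500 ms improvement described above.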