Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Select extra node in pagination relations for free hasNextPage query #4908

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

mtollie
Copy link

@mtollie mtollie commented Apr 11, 2024

The goal of this PR is to implement modifying the query generated by nodes() to add an extra row to the query limit and returned rows. This row is thrown away, but its existence is used to infer the value of @has_next_page.

Hopefully this isn't a crazy suggestion but more of an answer to active_record_relation_connection_spec.rb

This currently runs one query to load the nodes, then another one to count just beyond the nodes.
A better implementation would load first + 1 nodes and use that to set has_next_page.

The assumption is that fetching 1 extra row is cheaper and more efficient (e.g. fewer round trips, warmer cache, more efficient index than that chosen for relation_count()) than a SELECT COUNT(*) ... LIMIT 1 ... would be. Another assumption is that hasNextPage will actually be wanted in the future by the query. For our use case in a full-text search context this is definitely true but open to suggestions!

Also somewhat unsure about the placement and desired generalizability of this function. It works fine put in the base relation_connection class but might have undesired side effects.

@rmosolgo
Copy link
Owner

Hey, thanks for getting this started! I think this would be a great improvement.

One catch right now is that SQL queries are dependent on the order of selections in the GraphQL document. If hasNextPage comes before nodes, it will still run two queries. I push a new spec demonstrating that behavior.

It'd be nice to make this behavior independent of the client's ordering of fields, but I'm not sure the best way to do it. One way to do it would be to always load the nodes -- even when only hasNextPage is selected. This isn't perfectly optimized, but it's easier to do. (GraphQL-Pro's stable relation connection uses this approach.)

Another approach might be to somehow initialize a lookahead (https://graphql-ruby.org/queries/lookahead.html) and use it to investigate what fields are selected. I think you'd have to do that in here somehow:

def after_resolve(value:, object:, arguments:, context:, memo:)
original_arguments = memo
# rename some inputs to avoid conflicts inside the block
maybe_lazy = value
value = nil
context.query.after_lazy(maybe_lazy) do |resolved_value|
value = resolved_value
if value.is_a? GraphQL::ExecutionError
# This isn't even going to work because context doesn't have ast_node anymore
context.add_error(value)
nil
elsif value.nil?
nil
elsif value.is_a?(GraphQL::Pagination::Connection)
# update the connection with some things that may not have been provided
value.context ||= context
value.parent ||= object.object
value.first_value ||= original_arguments[:first]
value.after_value ||= original_arguments[:after]
value.last_value ||= original_arguments[:last]
value.before_value ||= original_arguments[:before]
value.arguments ||= original_arguments # rubocop:disable Development/ContextIsPassedCop -- unrelated .arguments method
value.field ||= field
if field.has_max_page_size? && !value.has_max_page_size_override?
value.max_page_size = field.max_page_size
end
if field.has_default_page_size? && !value.has_default_page_size_override?
value.default_page_size = field.default_page_size
end
if context.schema.new_connections? && (custom_t = context.schema.connections.edge_class_for_field(@field))
value.edge_class = custom_t
end
value
else
context.namespace(:connections)[:all_wrappers] ||= context.schema.connections.all_wrappers
context.schema.connections.wrap(field, object.object, value, original_arguments, context)
end
end

But I'm not sure the best way to get the ast_nodes: ... needed for a lookahead. Here's where GraphQL-Ruby usually does it, just for example's sake:

when :lookahead
if !field_ast_nodes
field_ast_nodes = [ast_node]
end
extra_args[:lookahead] = Execution::Lookahead.new(
query: query,
ast_nodes: field_ast_nodes,
field: field_defn,
)

Let me know what you think -- maybe we can find a way to work up a lookahead if we need one...

@mtollie
Copy link
Author

mtollie commented Apr 23, 2024

Apologies for the delayed response and thanks for the new test! Great catch.

I totally agree that the most correct behavior is to make the optimized behavior independent of client ordering. But I'm not a huge fan of always loading the nodes. I tried that approach initially before this PR and thought that it was a mess keeping the details straight between the ActiveRecord relation connection, the base relation connection, and the base connection. And it'd still be a performance regression for the no-nodes edge case which I assumed was common enough given the test suite coverage.

The lookahead approach sounds really interesting and the direction that'd allow both optimizations to co-exist. A bit out of my wheelhouse, but I'll keep looking in that direction!

@mtollie
Copy link
Author

mtollie commented Apr 26, 2024

@rmosolgo What do you think about always evaluating nodes first if it's present? That avoids the ordering/lookahead problem completely.

At a higher level it would be something like evaluating results by descending complexity which makes some intuitive sense to me. I made a terribly inefficient and partial but test passing proof-of-concept of what I mean here: mtollie#3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants