Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(graphql): cache attribute hashes #723

Merged

Conversation

rmosolgo
Copy link
Contributor

@rmosolgo rmosolgo commented Nov 21, 2023

This is to reduce memory consumption by this tracer when running GraphQL queries. In my benchmark (below), this resulted in a 97% reduction in this library's footprint and a 10% reduction overall (~7% of objects).

- Total allocated: 1809805 bytes (15888 objects)
+ Total allocated: 1625269 bytes (14794 objects)

  Total retained:  0 bytes (0 objects)

  allocated memory by gem
  -----------------------------------
   1003824  opentelemetry-sdk-1.3.1
    381029  graphql-2.1.6
    194512  opentelemetry-api-1.2.3
-   189656  opentelemetry-instrumentation-graphql-0.26.7
     35952  other
+     4880  graphql/lib
      4288  racc-1.7.3
       368  logger
       136  monitor
        40  set
otel_bench.rb

require "bundler/inline"

gemfile(false) do
  source "https://rubygems.org"
  gem "graphql", "2.1.6"
  gem "opentelemetry-sdk" # , path: "~/code/opentelemetry-ruby/sdk"
  gem "opentelemetry-api" # , path: "~/code/opentelemetry-ruby/api"
  gem "opentelemetry-instrumentation-graphql", path: "~/code/opentelemetry-ruby-contrib/instrumentation/graphql"
  gem "benchmark-ips"
  gem "memory_profiler"
  gem "stackprof"
end

class MySchema < GraphQL::Schema
  DATA = {
    "5" => {
      name: "Beatrix Potter",
    },
    "6" => {
      name: "J.R.R. Tolkien"
    },
    "10" => {
      title: "The Tale of Squirrel Nutkin",
      author_id: "5",
      recommended_book_ids: ["11", "12"],
    },
    "11" => {
      title: "The Tale of Ginger and Pickles",
      author_id: "5",
      recommended_book_ids: ["10", "12"],
    },
    "12" => {
      title: "The Tale of Mr. Tod",
      author_id: "5",
      recommended_book_ids: ["11", "10"],
    },
    "21" => {
      title: "The Hobbit",
      author_id: "6",
      recommended_book_ids: ["22", "23", "24", "25", "26", "27"],
    },
    "22" => {
      title: "The Fellowship of the Ring",
      author_id: "6",
      recommended_book_ids: ["21", "23", "24", "25", "26", "27"],
    },
    "23" => {
      title: "The Two Towers",
      author_id: "6",
      recommended_book_ids: ["22", "21", "24", "25", "26", "27"],
    },
    "24" => {
      title: "The Return of the King",
      author_id: "6",
      recommended_book_ids: ["22", "23", "21", "25", "26", "27"],
    },
    "25" => {
      title: "The Silmarillion",
      author_id: "6",
      recommended_book_ids: ["22", "23", "24", "21", "26", "27"],
    },
    "26" => {
      title: "The Adventures of Tom Bombadil",
      author_id: "6",
      recommended_book_ids: ["21", "22", "23", "24", "25", "27"]
    },
    "27" => {
      title: "Leaf by Niggle",
      author_id: "6",
      recommended_book_ids: ["21", "22", "23", "24", "25", "26"]
    }
  }

  module Node
    include GraphQL::Schema::Interface
    field :id, ID
  end

  class Author < GraphQL::Schema::Object
    def self.authorized?(obj, ctx)
      -> { true }
    end
    implements Node
    field :name, String
    field :books, ["MySchema::Book"]

    def books
      author_id = DATA.find { |(id, auth)| auth == object }.first
      DATA.each_value.select { |d| d[:author_id] == author_id }
    end
  end

  class Book < GraphQL::Schema::Object
    implements Node
    field :title, String
    field :author, Author
    field :recommended_books, [Book]

    def author
      DATA[object[:author_id]]
    end

    def recommended_books
      object[:recommended_book_ids].map { |id| -> { DATA[id] } }
    end
  end

  class Query < GraphQL::Schema::Object
    field :node, Node do
      argument :id, ID
    end

    def node(id:)
      DATA[id]
    end
  end

  query(Query)

  orphan_types(Book, Author)

  def self.resolve_type(type, obj, ctx)
    if obj.key?(:name)
      Author
    elsif obj.key?(:title)
      Book
    else
      raise "Unknown object: #{obj.inspect}"
    end
  end

  lazy_resolve(Proc, :call)
end

OpenTelemetry::SDK.configure do |c|
  c.use 'OpenTelemetry::Instrumentation::GraphQL', {
    schemas: [MySchema],
    # enable_platform_field maps to the execute_field and execute_field_lazy keys
    enable_platform_field: true,
    # enable_platform_authorized maps to the authorized and authorized_lazy keys
    enable_platform_authorized: true,
    # enable_platform_resolve_type maps to the resolve_type and resolve_type_lazy keys
    enable_platform_resolve_type: true,
  }
end


# puts MySchema.to_definition

small_query_str = <<-GRAPHQL
query {
  node(id: "10") {
    __typename
    ... on Book {
      title
      author {
        __typename
        name
      }
      recommendedBooks {
        __typename
        title
      }
    }
  }
}
GRAPHQL

large_query_str = <<-GRAPHQL
{
  node(id: "6") {
    ... on Author {
      name
      books {
        title
        recommendedBooks {
          title
          author { name }
          recommendedBooks { title }
        }
      }
    }
  }
}
GRAPHQL

# pp MySchema.execute(small_query_str).to_h
# pp MySchema.execute(large_query_str).to_h

Benchmark.ips do |x|
  x.report("small query") { MySchema.execute(small_query_str) }
  x.report("large query") { MySchema.execute(large_query_str) }
end

query_str = large_query_str

puts "\n\n\n"

result = StackProf.run(mode: :wall, interval: 1) do
  MySchema.execute(query_str)
end
report = StackProf::Report.new(result)
File.open("tmp/otel_query.dump", "wb+") { |f| report.print_dump(f) }
report.print_text

report = MemoryProfiler.report do
  MySchema.execute(query_str)
end
puts "\n\n\n"
report.pretty_print

I also added .compare_by_identity to these caches because they always receive the same objects -- GraphQL definition objects -- as keys.

@@ -73,26 +126,18 @@ def execute_query_lazy(query:, multiplex:, &block)

def execute_field(field:, query:, ast_node:, arguments:, object:, &block)
platform_key = _otel_execute_field_key(field: field)
return super unless platform_key
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't explain why, but replacing this super call (and the one below in execute_field_lazy) made up a big part of the memory improvement. In the "before" profile, these lines appear in the result:

       345  /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/graphql-2.1.6/lib/graphql/schema/object.rb:82
       345  /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/opentelemetry-instrumentation-graphql-0.26.7/lib/opentelemetry/instrumentation/graphql/tracers/graphql_trace.rb:104
       344  /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/graphql-2.1.6/lib/graphql/execution/interpreter/runtime.rb:600
       344  /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/opentelemetry-instrumentation-graphql-0.26.7/lib/opentelemetry/instrumentation/graphql/tracers/graphql_trace.rb:76
       343  otel_bench.rb:104

But adding these arguments caused them to not appear anymore.

Looking at the breakdown by class, my only guess is that, somewhere under the hood, Ruby was making a Hash as part of this code:

  allocated objects by class
  -----------------------------------
-     6230  Hash
+     5126  Hash
      2395  Array
-     1188  Proc
+     1194  Proc
-      871  String
+      875  String
       785  Thread::Mutex
       784  OpenTelemetry::Context
       784  OpenTelemetry::SDK::Trace::Samplers::Result
       784  OpenTelemetry::SDK::Trace::Span
       784  OpenTelemetry::Trace::SpanContext
       423  GraphQL::Execution::Lazy
       345  GraphQL::Execution::Interpreter::Runtime::GraphQLResultHash
       301  MySchema::Book

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sharing this. I wonder if there is a performance Rubocop linter we can leverage here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is the rubocop-performance gem, but after doing a quick test with the current instrumentation, these issues weren't pointed out.

@arielvalentin
Copy link
Collaborator

@rmosolgo TIL about Hash#compare_by_identity.

Is this allocation problem specific to OTel or have you also identified this problem with the existing Tracers that included in the graphql gem?

I would like to confirm with you that these types will not change at runtime, i.e. there isn't any logic that would result in dynamically generated types or Module instances that would result in the hashes growing out of control. Is that correct?

Also, do you have any concerns about thread safe access to these cache Hashes in JRuby?

Is there a way to eagerly seed the hashes at startup as opposed to using lazy initialization?

@rmosolgo
Copy link
Contributor Author

existing Tracers

Broadly speaking, it's definitely a general problem. The reason I re-wrote GraphQL-Ruby's tracing to use methods with keyword arguments was to avoid allocating hashes for the old data argument. Those hashes were created regardless of whether or not they were actually used and, since tracing calls grow with the size of the execution result set, the overhead really grows for large responses.

As for the built-in tracers, I definitely try to keep allocations in these paths to a minimum, although I know the datadog one is not so great 🙈 .

hashes growing out of control

It is technically possible to create GraphQL types on-the-fly (https://graphql-ruby.org/schema/dynamic_types.html), but in any case, these hashes only live for the duration of a single query. The Trace object they belong to is created at the start of the query and isn't used after that. (This object lifetime was another reason for rewriting the trace system!) In this way, it's similar to GraphQL-Ruby's own per-query caches of types, arguments, fields, etc, in GraphQL::Schema::Warden.

thread safe access

I don't have specific insight into this, but GraphQL-Ruby uses tons of caches just like this:

And of course, the existing platform_*_key hashes work this way, too. Since those work properly, I'm guessing these will too!

eagerly seed the hashes

It might be possible to do this via context.query.types, but I don't think it's desirable, because most queries use less than the full set of types. Creating them as-needed and then discarding them at the end of the query seems like a sweet spot between creating instances for every call (which creates a lot of duplicate objects) and making a "master cache" at the schema level (which would be fewer allocations, but we'd have to think through issues like thread safety or permissions between different queries).

compare_by_identity

Yes, it's new to me too, but I've started using it all over GraphQL-Ruby for this kind of cache! It ends up saving a good bit of time on hot paths, when you know the keys will be exactly the same object (eg rmosolgo/graphql-ruby#4431 (comment), rmosolgo/graphql-ruby#4443)

@arielvalentin
Copy link
Collaborator

I appreciate the context and explanations. I'll take a look in later today if I have enough time.

I'll be OOO until Monday. This review will be at the top of my list if I am unable to get to it today.

Copy link
Collaborator

@arielvalentin arielvalentin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I see. I did not realize these are following the existing caching patterns that exist for field, authorize and resolve_type.

The difference being that it uses compare_by_identiy and adds support for caching lazy types.

Is there a similar optimization that can be done for the older tracer implementation?

@arielvalentin arielvalentin changed the title GraphQL: cache attribute hashes perf(graphql): cache attribute hashes Nov 22, 2023
@rmosolgo
Copy link
Contributor Author

similar optimization

There is, I pushed it in 92ad82f

I also noticed that, since authorized and resolve_type create the same kind of attributes hash, they could share a cache. (Same thing for authorized_lazy and resolve_type_lazy.) I updated both implementations in e65ef47.

Copy link
Collaborator

@arielvalentin arielvalentin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rmosolgo it looks like the tests for the legacy tracer are failing now.

I did not mean for my comments to have you optimize the legacy tracer as part of this PR and I am amenable to deferring optimizations of the legacy tracer as part of a separate one.

Since the test suite is broken I will change the review status to "Request Changes" until that is fixed.

@rmosolgo
Copy link
Contributor Author

Derp -- I'll revert those changes. It looks like that tracer expects a slightly different behavior when tracing fields that are defined in an interface module.

@arielvalentin arielvalentin merged commit a7f6111 into open-telemetry:main Nov 27, 2023
46 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants