Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement persistent query cache for single queries #6881

Merged
merged 29 commits into from
Feb 23, 2024
Merged

Conversation

fantix
Copy link
Member

@fantix fantix commented Feb 16, 2024

  • Compilation results are serialized per QueryUnit in consideration of the future per-statement script cache
  • Using pickle to serialize QueryUnit, prefixed with a one-byte version number
  • Both compilation request and result are stored in edgedb._query_cache table serialized, restored when firstly introspecting the database/branch
  • Using PG notification for other tenants on the same backend to catch up with new cache entries
  • Caching as stored functions is disabled by default; turn on by EDGEDB_DEBUG_FUNC_CACHE=1
  • Make sure the DB/memory cache registry is well in-sync
  • Use creation_time for a less-brain-split distributed LRU
  • Handle in-transaction cache

Function TODOs:

  • Deal with the case of more than 100 parameters
  • Functions that would need to return record[]
  • Migration of non-function cache data to func cache
  • [ ]

@fantix fantix changed the title Implement persistent query cache Implement persistent query cache for single queries Feb 16, 2024
@fantix

This comment was marked as resolved.

@msullivan

This comment was marked as resolved.

@msullivan

This comment was marked as resolved.

@msullivan

This comment was marked as resolved.

@fantix

This comment was marked as resolved.

@msullivan
Copy link
Member

Should we just cache the text and not bother with functions?

@msullivan
Copy link
Member

Another option might be to actually generate composite types for everything returned by queries. That seems kind of miserable, though.

@elprans

This comment was marked as resolved.

@elprans
Copy link
Member

elprans commented Feb 18, 2024

Hm, though the insert query is failing with function return row and query-specified return row do not match, which seems fixable.

@fantix

This comment was marked as resolved.

@elprans

This comment was marked as resolved.

@msullivan
Copy link
Member

Let's make sure to document this

@fantix fantix force-pushed the persistent-cache-5 branch 4 times, most recently from 62d1cf1 to 4d46f95 Compare February 19, 2024 03:13
@msullivan
Copy link
Member

It seems like we don't populate the cache in transactions, which means that the test suite is not going to be exercising the cache very much.

@fantix
Copy link
Member Author

fantix commented Feb 20, 2024

It seems like we don't populate the cache in transactions, which means that the test suite is not going to be exercising the cache very much.

Yeah, that seemed to be the case for some time already, I do have a TODO in the pre-RFC to add the in-transaction cache back.

@fantix fantix requested review from 1st1, msullivan and elprans February 22, 2024 02:30
@fantix fantix marked this pull request as ready for review February 22, 2024 02:30
@msullivan msullivan force-pushed the persistent-cache-5 branch 2 times, most recently from 1153a39 to 8aa86aa Compare February 22, 2024 03:00
@msullivan
Copy link
Member

msullivan commented Feb 22, 2024

Some other things we might want:

  • A way to disable recompilation
  • A way for the user to forcibly clear the cache. Probably an administer command.

Thing two is mostly something that I need, but arguably these two things can be the same, and the way to "disable recompilation" is just to clear the cache first.

sql_res = pg_compiler.compile_ir_to_sql_tree(
ir,
expected_cardinality_one=ctx.expected_cardinality_one,
output_format=_convert_format(ctx.output_format),
backend_runtime_params=ctx.backend_runtime_params,
expand_inhviews=options.expand_inhviews,
detach_params=bool(use_persistent_cache and debug.flags.func_cache),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't need the bool, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right if we can let mypy think Flag is a bool:

edb/server/compiler/compiler.py:1854:23: error: Argument "detach_params" to "compile_ir_to_sql_tree" has incompatible type "Flag | bool"; expected "bool"  [arg-type]

if keys:
self.tenant.create_task(
self.tenant.evict_query_cache(self.name, keys),
interruptable=False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this right? (Genuinely I am not sure.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, script cache! Now fixed and I'll add a test to cover this branch.

fantix and others added 26 commits February 23, 2024 18:02
Is this reliable, or do we need to properly retry??
* Apply review suggestions
* More comments and code style
* Fix eviction on scripts
* Fix missing signal in clear_cache_keys()
* Recompiled queries are persited in batches
* Fix bug that recompile didn't work
COMMIT may also contain user_schema, so we know DDL was in the tx
@fantix fantix force-pushed the persistent-cache-5 branch from 5a31914 to 1769098 Compare February 23, 2024 23:04
@fantix fantix merged commit 3912a68 into master Feb 23, 2024
24 checks passed
@fantix fantix deleted the persistent-cache-5 branch February 23, 2024 23:45
@fantix fantix mentioned this pull request Mar 13, 2024
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants