Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add conditional index on morango store to increase deserialization performance #228

Closed
bjester opened this issue Aug 30, 2024 · 0 comments · Fixed by #232
Closed

Add conditional index on morango store to increase deserialization performance #228

bjester opened this issue Aug 30, 2024 · 0 comments · Fixed by #232

Comments

@bjester
Copy link
Member

bjester commented Aug 30, 2024

Overview

From research into KDP deserialization performance:

I think this graph of the KDP prod database IO is interesting. Currently, a sync is deserializing. The units in the graph are seconds of IO Wait per second. So I interpret that anything over 1 s/s essentially means the IO Wait is compounding. It's almost entirely reads, and so it seems this is likely caused by sequential reads of the morango store (without index). The query that I see filters on several columns, and so a full compound index would be gigantic, but a targeted index on the same columns with the constraint on dirty_bit = True should theoretically be plenty small with frequent deserializations. And it should speed it up a lot if the planner decides it's able to use the index

Screenshot from 2024-08-30 14-38-39

Since we dropped support for anything older than Django 3, we can now add a conditional index through Django

Description and outcomes

  • Add a conditional index on profile, model_name, partition, and dirty_bit with the condition that dirty_bit = True
  • Ensure this works with SQLite with the test suite we added to test older versions of SQLite
SELECT "morango_store"."profile", "morango_store"."serialized", "morango_store"."deleted", "morango_store"."hard_deleted", "morango_store"."last_saved_instance", "morango_store"."last_saved_counter", "morango_store"."partition", "morango_store"."source_id", "morango_store"."model_name", "morango_store"."conflicting_serialized_data", "morango_store"."_self_ref_fk", "morango_store"."id", "morango_store"."dirty_bit", "morango_store"."deserialization_error", "morango_store"."last_transfer_session_id" FROM "morango_store" WHERE ("morango_store"."profile" = 'facilitydata' AND "morango_store"."model_name" = 'attemptlog' AND "morango_store"."partition"::text LIKE '<REDACTED>%' AND "morango_store"."dirty_bit" = true)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants