Skip to content

Add conditional index on morango store to increase deserialization performance #228

Closed
@bjester

Description

@bjester

Overview

From research into KDP deserialization performance:

I think this graph of the KDP prod database IO is interesting. Currently, a sync is deserializing. The units in the graph are seconds of IO Wait per second. So I interpret that anything over 1 s/s essentially means the IO Wait is compounding. It's almost entirely reads, and so it seems this is likely caused by sequential reads of the morango store (without index). The query that I see filters on several columns, and so a full compound index would be gigantic, but a targeted index on the same columns with the constraint on dirty_bit = True should theoretically be plenty small with frequent deserializations. And it should speed it up a lot if the planner decides it's able to use the index

Screenshot from 2024-08-30 14-38-39

Since we dropped support for anything older than Django 3, we can now add a conditional index through Django

Description and outcomes

  • Add a conditional index on profile, model_name, partition, and dirty_bit with the condition that dirty_bit = True
  • Ensure this works with SQLite with the test suite we added to test older versions of SQLite
SELECT "morango_store"."profile", "morango_store"."serialized", "morango_store"."deleted", "morango_store"."hard_deleted", "morango_store"."last_saved_instance", "morango_store"."last_saved_counter", "morango_store"."partition", "morango_store"."source_id", "morango_store"."model_name", "morango_store"."conflicting_serialized_data", "morango_store"."_self_ref_fk", "morango_store"."id", "morango_store"."dirty_bit", "morango_store"."deserialization_error", "morango_store"."last_transfer_session_id" FROM "morango_store" WHERE ("morango_store"."profile" = 'facilitydata' AND "morango_store"."model_name" = 'attemptlog' AND "morango_store"."partition"::text LIKE '<REDACTED>%' AND "morango_store"."dirty_bit" = true)

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions