Skip to content

Optimize IFNULL with indexes #1552

Open
@rentalhost

Description

@rentalhost

Hello,

I've been studying SQLite using the Turso platform. There, every read or write operation is counted. So, if I run a query that doesn't match an existing index, a table scan is performed, consuming as many read credits as there are records in the table. This can be quite costly.

Let me share a real example from my experience:

CREATE TABLE "example_entries" (
	"id"		INTEGER NOT NULL UNIQUE,
	"name"		TEXT,
	"updatedAt"	INTEGER,
	"createdAt"	INTEGER NOT NULL,

	PRIMARY KEY("id" AUTOINCREMENT)
);

Whenever a record is created, only the createdAt field is populated. For simplicity, let's assume updatedAt remains NULL during creation.

If I run a query based on either updatedAt or createdAt, it results in a table scan because neither column is indexed. To address this, we can create indexes:

CREATE INDEX "example_entries.updatedAt" ON "example_entries" ("updatedAt" ASC);
CREATE INDEX "example_entries.createdAt" ON "example_entries" ("createdAt" ASC);

Great! Now, let's say I want to return the id, name, and the update date, or the creation date if the update date is NULL. This can be done with:

SELECT `id`, `name`, IFNULL(`updatedAt`, `createdAt`) AS `updatedAt` FROM `example_entries`;

This works well, though we still have the table scan issue, but without a WHERE clause, there's not much that can be done.

Now, let's limit our query to find records updated or created after a certain timestamp:

SELECT `id`, `name`, IFNULL(`updatedAt`, `createdAt`) AS `updatedAt` FROM `example_entries` WHERE IFNULL(`updatedAt`, `createdAt`) >= @timestamp;

Although this filter works, it still results in a table scan. From what I understand, SQLite can't optimize IFNULL with two columns to use indexes efficiently. To solve this, I need to "expand" the IFNULL like this:

SELECT `id`, `name`, IFNULL(`updatedAt`, `createdAt`) AS `updatedAt` FROM `example_entries` WHERE `updatedAt` >= @timestamp OR (`updatedAt` IS NULL AND `createdAt` >= @timestamp);

This adds cognitive complexity for something that might be optimized internally by SQLite.

I have reported this issue to the SQLite team for their evaluation as well. You can find the post here: SQLite Forum Post.

I would love to hear the libsql team's perspective on this matter and if there's a possibility for such an optimization in libsql.

Thank you for your attention.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions