Skip to content

Commit

Permalink
SQLite: Optimization: Read one row ahead.
Browse files Browse the repository at this point in the history
By always iterating the underlying query one row ahead of what has been returned, we can discover when the query is done and return it to the statement cache more proactively.

Without this optimization, statements very commonly don't get returned to the cache until the cursor is GC'd -- especially statements that return no results at all.

In theory this would inflate the "rows read" metric for an application that commonly creates cursors and doesn't iterate them to completion. But, that should be uncommon, and "buffering ahead" is hardly an unreasonable thing for the platform to be doing.

I verified this optimization works by checking how often sql-test.js is unable to reuse a cache entry because it's still "in use" (`isShared()` returns true). Before this, there were dozens of cases, but after, there are only two, and they are intentional cases of overlapping queries.
  • Loading branch information
kentonv committed Oct 29, 2024
1 parent 1da3c7e commit 6f5fd8c
Show file tree
Hide file tree
Showing 3 changed files with 54 additions and 34 deletions.
5 changes: 4 additions & 1 deletion src/workerd/api/sql-test.js
Original file line number Diff line number Diff line change
Expand Up @@ -1070,9 +1070,12 @@ async function testIoStats(storage) {
while (true) {
const result = resultsIterator.next();
if (result.done) {
assert.equal(10, cursor.rowsRead);
break;
}
assert.equal(++rowsSeen, cursor.rowsRead);
// + 1 because the cursor is always one result ahead of what has been returned -- but there
// are only 10 rows total.
assert.equal(Math.min(++rowsSeen + 1, 10), cursor.rowsRead);
}
}

Expand Down
64 changes: 36 additions & 28 deletions src/workerd/api/sql.c++
Original file line number Diff line number Diff line change
Expand Up @@ -226,20 +226,13 @@ jsg::JsValue SqlStorage::Cursor::one(jsg::Lock& js) {
auto result = JSG_REQUIRE_NONNULL(rowIteratorNext(js, self), Error,
"Expected exactly one result from SQL query, but got no results.");

// Manually grab the query, check that there are no more results, and clean it up.
auto& query = KJ_ASSERT_NONNULL(state)->query;
query.nextRow();
bool hadOneResult = query.isDone();

// Save off row counts before the query goes away, just like iteratorImpl() would do when done.
rowsRead = query.getRowsRead();
rowsWritten = query.getRowsWritten();

// End the query even if it wasn't done.
state = kj::none;

JSG_REQUIRE(
hadOneResult, Error, "Expected exactly one result from SQL query, but got multiple results.");
KJ_IF_SOME(s, state) {
// It appears that the query had more results, otherwise we would have set `state` to `none`
// inside `iteratorImpl()`.
endQuery(*s);
JSG_FAIL_REQUIRE(
Error, "Expected exactly one result from SQL query, but got multiple results.");
}

return result;
}
Expand All @@ -252,8 +245,8 @@ jsg::Ref<SqlStorage::Cursor::RowIterator> SqlStorage::Cursor::rows(jsg::Lock& js
}

kj::Maybe<jsg::JsObject> SqlStorage::Cursor::rowIteratorNext(jsg::Lock& js, jsg::Ref<Cursor>& obj) {
auto names = obj->cachedColumnNames.get();
KJ_IF_SOME(values, iteratorImpl(js, obj)) {
auto names = obj->cachedColumnNames.get();
jsg::JsObject result = js.obj();
KJ_ASSERT(names.size() == values.size());
for (auto i: kj::zeroTo(names.size())) {
Expand Down Expand Up @@ -303,22 +296,10 @@ kj::Maybe<v8::LocalVector<v8::Value>> SqlStorage::Cursor::iteratorImpl(
}
});

if (state.isFirst) {
// Little hack: We don't want to call query.nextRow() at the end of this method because it
// may invalidate the backing buffers of StringPtrs that we haven't returned to JS yet.
state.isFirst = false;
} else {
state.query.nextRow();
}

auto& query = state.query;

if (query.isDone()) {
// Save off row counts before the query goes away.
obj->rowsRead = query.getRowsRead();
obj->rowsWritten = query.getRowsWritten();
// Clean up the query proactively.
obj->state = kj::none;
obj->endQuery(state);
return kj::none;
}

Expand Down Expand Up @@ -349,9 +330,36 @@ kj::Maybe<v8::LocalVector<v8::Value>> SqlStorage::Cursor::iteratorImpl(
}
results.push_back(wrapSqlValue(js, kj::mv(value)));
}

// Proactively iterate to the next row and, if it turns out the query is done, discard it. This
// is an optimization to make sure that the statement can be returned to the statement cache once
// the application has iterated over all results, even if the application fails to call next()
// one last time to get `{done: true}`. A common case where this could happen is if the app is
// expecting zero or one results, so it calls `exec(...).next()`. In the case that one result
// was returned, the application may not bother calling `next()` again. If we hadn't proactively
// iterated ahead by one, then the statement would not be returned to the cache until it was
// GC'd, which might prevent the cache from being effective in the meantime.
//
// Unfortunately, this does not help with the case where the application stops iterating with
// results still available from the cursor. There's not much we can do about that case since
// there's no way to know if the app might come back and try to use the cursor again later.
query.nextRow();
if (query.isDone()) {
obj->endQuery(state);
}

return kj::mv(results);
}

void SqlStorage::Cursor::endQuery(State& stateRef) {
// Save off row counts before the query goes away.
rowsRead = stateRef.query.getRowsRead();
rowsWritten = stateRef.query.getRowsWritten();

// Clean up the query proactively.
state = kj::none;
}

kj::Array<const SqliteDatabase::Query::ValuePtr> SqlStorage::Cursor::mapBindings(
kj::ArrayPtr<BindingValue> values) {
return KJ_MAP(value, values) -> SqliteDatabase::Query::ValuePtr {
Expand Down
19 changes: 14 additions & 5 deletions src/workerd/api/sql.h
Original file line number Diff line number Diff line change
Expand Up @@ -200,9 +200,15 @@ class SqlStorage::Cursor final: public jsg::Object {
public:
template <typename... Params>
Cursor(Params&&... params)
: state(IoContext::current().addObject(kj::heap<State>(kj::fwd<Params>(params)...))),
ownCachedColumnNames(kj::none), // silence bogus Clang warning on next line
cachedColumnNames(ownCachedColumnNames.emplace()) {}
: ownCachedColumnNames(kj::none), // silence bogus Clang warning on next line
cachedColumnNames(ownCachedColumnNames.emplace()) {
auto stateObj = kj::heap<State>(kj::fwd<Params>(params)...);
if (stateObj->query.isDone()) {
endQuery(*stateObj);
} else {
state = IoContext::current().addObject(kj::mv(stateObj));
}
}
~Cursor() noexcept(false);

double getRowsRead();
Expand Down Expand Up @@ -254,8 +260,6 @@ class SqlStorage::Cursor final: public jsg::Object {

SqliteDatabase::Query query;

bool isFirst = true;

State(SqliteDatabase& db,
SqliteDatabase::Regulator& regulator,
kj::StringPtr sqlCode,
Expand Down Expand Up @@ -286,6 +290,11 @@ class SqlStorage::Cursor final: public jsg::Object {
kj::Maybe<CachedColumnNames> ownCachedColumnNames;
CachedColumnNames& cachedColumnNames;

// Invoke when `query.isDone()`, or when we want to prematurely cancel the query. This records
// row counters and then sets `state` to `none` to drop the query and return the prepared
// statement to the statement cache.
void endQuery(State& stateRef);

static kj::Array<const SqliteDatabase::Query::ValuePtr> mapBindings(
kj::ArrayPtr<BindingValue> values);

Expand Down

0 comments on commit 6f5fd8c

Please sign in to comment.