Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 BUG: D1 does not VACUUM causing performance issues #1618

Open
matthewjosephtaylor opened this issue Feb 2, 2024 · 4 comments
Open

🐛 BUG: D1 does not VACUUM causing performance issues #1618

matthewjosephtaylor opened this issue Feb 2, 2024 · 4 comments
Labels
bug Something isn't working D1 D1 Database

Comments

@matthewjosephtaylor
Copy link

Which Cloudflare product(s) does this pertain to?

D1

What version(s) of the tool(s) are you using?

3.25.0

What version of Node are you using?

20.9.0

What operating system and version are you using?

Mac Darwin Kernel Version 23.0.0

Describe the Bug

Observed behavior

D1 gets slower and slower as I use as I do a lot of inserts and deletes. the sqlite DB file inside .wrangler seems to grow endlessly

Expected behavior

sqlite auto vacuum should probably be enabled.

Steps to reproduce

Insert a lot of data into D1 db
delete a lot of rows from D1 db
Note that the size of the sqlite file on disk does not reduce.

Work-around:

sqlite3 .wrangler/state/v3/d1/<dbfile>.sqlite
# once opened
VACUUM;
.quit

Please provide a link to a minimal reproduction

No response

Please provide any relevant error logs

No response

@matthewjosephtaylor matthewjosephtaylor added the bug Something isn't working label Feb 2, 2024
@petebacondarwin petebacondarwin added the D1 D1 Database label Feb 5, 2024
@mrbbot
Copy link
Contributor

mrbbot commented Feb 5, 2024

Hey! 👋 Thanks for raising this. I'm going to transfer this to the workerd repo, since we'll need some help from the runtime team to address this issue. Either we...

  • Enable auto_vacuum in local mode only, noting that this only applies to newly created tables, so wouldn't work for existing .sqlite databases.
  • Expose VACUUM, either via SQL or some experimental API, then call this when starting up/disposing Miniflare instances. Ideally workerd would expose this so we can avoid depending on an SQLite client in miniflare. This would involve adding a Node native dependency which would slow down our installs, and require users to have C++ build tools installed.

@mrbbot mrbbot transferred this issue from cloudflare/workers-sdk Feb 5, 2024
@mrbbot
Copy link
Contributor

mrbbot commented Feb 5, 2024

For the runtime team, would you be able to weigh in on which solution you'd prefer? Or if there's something else we're not considering. 🙂

@a-robinson
Copy link
Member

We should be enabling auto-vacuum in local mode. cc @kentonv just in case he disagrees since he handled this for non-local D1.

@kentonv
Copy link
Member

kentonv commented Feb 5, 2024

We enable auto-vacuum in production, we should probably just do the same in local workerd.

That said, while auto-vacuum will allow the database to shrink when rows are deleted, I am not sure about the "gets slower and slower" part. What is making things slower, exactly? I wouldn't expect excessive freelist size to affect performance. Is fragmentation causing the slowness? If so auto-vacuum could actually make it worse.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working D1 D1 Database
Projects
Status: Backlog
Development

No branches or pull requests

5 participants