[batch] Consider retention periods for batches #14626

Open
daniel-goldstein opened this issue Jul 18, 2024 · 0 comments
Labels
batch, Epic, needs-triage

What happened?

Hail Batch never forgets a batch: all batches, jobs, and attempts are persisted in the Batch database forever. This is rarely a performance problem, because the indexes ensure that old rows are rarely read. But database storage only ever grows, which is something we have to reckon with, and it makes migrations very time-consuming. Many improvements could reduce wasted space in the database (like #14623), but ultimately we will need to decide how long batches should be persisted.

We should quantify the utility of historic batches, determine a good cutoff or alternative process for expiring batches, and decide whether to provide some sort of export that lets users keep information about their batches. I imagine the most relevant information would be cost and logs.
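To make the discussion concrete, here is a minimal sketch of what a cutoff-plus-export policy could look like. Everything in it is hypothetical: `BatchRecord`, its fields, `split_by_retention`, and `export_summary` are illustrative names, not the actual Batch schema or API, and the retention period is just a parameter. Logs are left out; only cost is exported.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class BatchRecord:
    # Hypothetical fields for illustration; not the real Batch database schema.
    batch_id: int
    user: str
    time_completed: datetime
    cost_usd: float


def split_by_retention(batches, now, retention):
    """Return (kept, expired). A batch is expired if it completed before now - retention."""
    cutoff = now - retention
    kept = [b for b in batches if b.time_completed >= cutoff]
    expired = [b for b in batches if b.time_completed < cutoff]
    return kept, expired


def export_summary(batches):
    """Serialize batches to JSON so a user can keep cost information after expiry."""
    rows = [
        {**asdict(b), "time_completed": b.time_completed.isoformat()}
        for b in batches
    ]
    return json.dumps(rows, indent=2)


if __name__ == "__main__":
    now = datetime(2024, 7, 18, tzinfo=timezone.utc)
    batches = [
        BatchRecord(1, "alice", datetime(2023, 1, 1, tzinfo=timezone.utc), 12.5),
        BatchRecord(2, "bob", datetime(2024, 7, 1, tzinfo=timezone.utc), 3.0),
    ]
    # Assumed one-year retention period, chosen only for the example.
    kept, expired = split_by_retention(batches, now, timedelta(days=365))
    print(export_summary(expired))
```

The point of the sketch is the ordering: export what users may want to keep first, then expire the rows. What the right cutoff is remains the open question.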

Version

0.2.132

Relevant log output

No response

daniel-goldstein added the batch, Epic, and needs-triage labels on Jul 18, 2024