Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: support minimum idle time before terminating server #125

Open
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

tjstansell
Copy link

This attempts to add support for DRONE_POOL_MIN_IDLE, which will define the minimum idle time before a server is terminated. It defaults to 0, but if you set it to 30m then it will only terminate a server after it has sat idle for at least 30 minutes (no jobs have run). It utilizes the Updated server property, which I believe gets updated as the server changes state from running to pending status. As it completes the builds, this should represent when the last build finished. As such, if jobs continue to fire regularly, then it stays online, but if jobs stop, it would then terminate after this duration.

I'm currently trying to get a test installation online to test these changes, but figured I'd see if folks had feedback regardless.

@tjstansell
Copy link
Author

Hm... the above changes cause slack notifications to be sent since they trigger on Updates.

@bradrydzewski
Copy link
Member

bradrydzewski commented Jan 12, 2023

@tjstansell conceptually I like the idea, however, the Updated date does not reflect the time the last build was executed on the machine, it just reflects the last time the database entry was updated. We do not currently store the last build date in the autoscaler database. I am not sure how to achieve this at the moment -- would need to think on this more.

@bradrydzewski
Copy link
Member

bradrydzewski commented Jan 12, 2023

ok, reading the patch, I think I have a better understanding. It looks like a forced save is used to update the Updated date for the server based on the list of busy servers, which is a clever approach. I would probably recommend we create a dedicated field to store this data (e.g. server.LastBusy). The main reason is that I cannot guarantee the Updated date would not be updated in the future, for other reasons, which could break your logic.

engine/planner.go Outdated Show resolved Hide resolved
@tjstansell
Copy link
Author

Thanks for all the feedback so far. I updated it to use a new LastBusy field to track this. Hopefully I did it right. :) Let me know if we need anything else.

@tjstansell
Copy link
Author

Actually, I think I now see how the db migration bits work ... working on updating it to do a proper migration.

@tjstansell
Copy link
Author

I deployed this to our internal instance and it seems to be working! Let me know if there's anything you'd like to see changed, though... Thanks!

@tjstansell
Copy link
Author

Any additional thoughts on this? Thanks.

bsda added a commit to Synthace/drone-autoscaler that referenced this pull request Feb 16, 2023
the autoscaler will shutdown any server that is not busy but has reached the min_age parameter (30m currently)

This feature adds a new flag that basically resets the age to 0 every time a new build is executed.

This will allow us to decrease the min-age but allow the servers to be used for a longer period of time during busy hours, thus avoiding starting up too many servers and decreasing the build startup times

This is a feature added by a contributor to the original autoscaler code. harness is very slow to accept contributions so the PR has sat there for about 1 month. drone#125
@tjstansell
Copy link
Author

@bradrydzewski curious if there's any way to get some progress on this so it doesn't just get lost. fwiw, we've been running on this code on our production instances for months now without any issue.

@tjstansell
Copy link
Author

@bradrydzewski sorry to keep pestering, but I've been trying to get some progress on this PR now for over 3 months with zero responses. Is there any chance someone can take a look at this before it's so stale it's no longer valid?

@alikhil
Copy link

alikhil commented Aug 18, 2023

It would be nice to have this feature merged

@tjstansell
Copy link
Author

FWIW, we've been running this patched version for the last year and a half and have not noticed any issues... I did just rebase my patch against the current master branch to pick up the other features that have been added, so this is now current. I'd love it if someone could evaluate this and hopefully merge it this time...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants