Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement graceful termination #39

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

tsipinakis
Copy link
Member

Please describe the change you are making

Implement support for graceful termination. Currently when upgrading or restarting ContainerSSH all connections are killed instantly kicking out all users. This patch adds an option to delay the termination until all users are disconnected.

This is very useful in kubernetes, with this the pod will stay in 'Terminating' state until all users are disconnected but at the same time will be taken out of the service so no new connections will be routed to it.

Are you the owner of the code you are sending in, or do you have permission of the owner?

Sent with permission of the owner

The code will be published under the MIT-0 license. Have you read and understood this license?

Yes

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @tsipinakis this actually seems to be a bug and should already be working. We'll have to investigate why it doesn't but this does not seem to be the right approach. The shutdown handlers should take care of closing client connections gracefully.

// GracefulTerminationDeadline is the amount of time ContainerSSH will
// wait after receiving a SIGTERM for all the clients to disconnect.
// After this duraction all connections are forcefully terminated.
GracefulTerminationDeadline time.Duration `json:"gracefulTerminationDeadline" yaml:"gracefulTerminationDeadline" default:"0"`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ContainerSSH will not always run on SSH only, we plan a web client too. This deadline should not be tied to SSH.

go s.shutdownHandlers.Shutdown(lifecycle.ShutdownContext())
go s.disconnectClients(lifecycle, allClientsExited)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this does not work, this is actually a bug. The shutdown handlers should get a chance to run before the clients are disconnected forcefully.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to work the way it's intended currently: disconnectClients() waits until the context timeout (hardcoded 30 seconds currently in main) and then force-disconnects all clients. All this PR does is make this delay configurable.

I delay the shutdown handler until all clients are disconnected because running them means shutting down things like the metrics server which may be needed if the delay is too long, we want to know what happened during that time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future it may be beneficial to add a message when a shutdown is triggered prompting all users to re-connect.

@@ -204,12 +204,15 @@ func startPool(pool Service, lifecycle service.Lifecycle) error {
rotateSignals := make(chan os.Signal, 1)
signal.Notify(exitSignals, exitSignalList...)
signal.Notify(rotateSignals, rotateSignalList...)

deadline := cfg.SSH.GracefulTerminationDeadline + 5 * time.Second
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 seconds plus is a very arbitrary number, we should come up with something better.

@ghost ghost force-pushed the gracefultermination branch from ba31cb1 to 7c11f35 Compare March 13, 2022 18:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant