-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: Improve shutdown logic, add more context done checks #685
base: main
Are you sure you want to change the base?
Conversation
nickpetrovic
commented
Nov 4, 2024
- Improve shutdown logic by running server (gRPC/HTTP) shutdowns and global context cancellation in parallel
- Merge global and local contexts to avoid modifying downstream code that doesn't need to know about the global context
- In the runc_server.go RunCStreamLogs func, make sure we can cancel the loop if the gateway is being shutdown
- Cancel RequestBuffer heartBeat func if gateway is shutting down
From my testing, these changes allows the gateway to shutdown. Its worth testing this in staging with a remote gpu node to observe how the gateways (multiple) will respond when they are all |
"sync" | ||
) | ||
|
||
func MergeContexts(ctxs ...context.Context) (context.Context, context.CancelFunc) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we sure we always want child contexts to be cancelled immediately if the parents is cancelled? Just want to make sure we don't cancel a DB transaction or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't use this for every time theres two contexts. This is mostly a helper for when merging two contexts is okay. For example, we're merging contexts before we call the LogStream.Stream function. In particular, we can break out of the for loop in handleStream() with just a single/wrapped context vs adding a global context to LogStream and then adding a second case statement in handleStream.
We're also using the merged context in goroutines where we don't need to add a second case statement for a global context - at least I don't think we do. 1. because they're goroutines and errors aren't handled/returned there and 2. I didn't see a reason to have different logic for when the global context was cancelled.
Hopefully that makes sense. LMK if there's something that you're thinking of specifically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that makes sense to me.