🥅 Kill servers on engine death #63
Conversation
Signed-off-by: Joe Runde <[email protected]>
Codecov Report
Attention: Patch coverage is

@@            Coverage Diff             @@
##             main      #63      +/-   ##
==========================================
- Coverage   61.19%   60.52%   -0.67%
==========================================
  Files          20       20
  Lines        1193     1183      -10
  Branches      211      208       -3
==========================================
- Hits          730      716      -14
- Misses        387      391       +4
  Partials       76       76

View full report in Codecov by Sentry.
Waiting on vllm-project/vllm#6594 to make sure that we have a common way of stopping the server.
vllm-project/vllm#6594 now merged, finally.
vllm-project/vllm#6594 is now merged. I'll see if I can take another pass at this to not use os.kill.
Thanks @joerunde!
Description
See vllm-project/vllm#6594
This PR modifies the grpc server's exception handler to check if the async llm engine is still running.
If it's not then the handler sends a SIGTERM to kill the serving process, which should trigger the graceful shutdown handlers.
This approach is a bit heavy-handed, but await server.stop(0) seems very unhappy if called within the context of a request handler. Maybe we could use some grpc context termination callbacks for this, but I didn't think it was worth the time to investigate that too deeply. I think the only real drawback here is that unit testing this feature would be pretty difficult with an os.kill in the mix.
This PR currently does not add this handling to the http server, but once vllm-project/vllm#6594 and vllm-project/vllm#6740 are merged, we should be able to remove our copy-pasted http server so both will handle engine death properly.
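A minimal sketch of the approach described above, assuming an engine object with an errored flag in the spirit of vLLM's AsyncLLMEngine. The names (engine_is_dead, handle_engine_death, the kill parameter) are illustrative, not the exact adapter code; the kill hook exists only so the SIGTERM path can be exercised without actually killing a process.

```python
import os
import signal


def engine_is_dead(engine) -> bool:
    """Return True when the async engine can no longer serve requests.

    Assumes the engine exposes `errored` and `is_running` attributes,
    mirroring vLLM's AsyncLLMEngine; these names are assumptions here.
    """
    return getattr(engine, "errored", False) and not getattr(engine, "is_running", False)


def handle_engine_death(engine, kill=os.kill) -> bool:
    """Called from the gRPC exception handler: if the engine has died,
    SIGTERM our own process so the graceful-shutdown handlers run.

    Returns True if a kill was issued, False otherwise. `kill` is
    injectable purely to make this testable despite os.kill being in
    the mix.
    """
    if engine_is_dead(engine):
        kill(os.getpid(), signal.SIGTERM)
        return True
    return False
```

In the real server this check would run inside the handler's except block, after logging the original request error, so a dead engine takes the whole process down instead of leaving a zombie server that fails every request.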
How Has This Been Tested?
Tested by installing vllm==0.5.3.post1 and the adapter from this branch onto a dev pod, injecting a runtime failure into the llm engine, and running a grpc request.
Merge criteria: