Fast PA teardown #674

tgerdesnv · 2024-05-24T15:09:56Z

Adds the ability for PA to exit more quickly.

Old behavior was that PA would always wait for all requests to finish before exiting. For the case of async request-rate combined with a slow model, this could add up to many minutes of waiting.

New behavior is that for non-shared-memory cases, PA will exit immediately and drop remaining requests on the floor.

For the case of PA sweeping through multiple values (request-rate 10:20:10 for example), it WILL still wait for all requests from the first experiment to finish before going to the next step.

One messy consequence is that the LoadManager now needs to remember whether it is using shared memory. It was already using that information, so this is not the end of the world.
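The teardown rule described above can be sketched roughly as follows. This is a minimal illustration, not the actual PA code: `LoadManagerSketch`, `OnRequestSent`, `OnRequestDone`, `FinishExperiment`, and `Teardown` are hypothetical names, and the in-flight counter stands in for whatever bookkeeping the real LoadManager does.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>

// Hypothetical stand-in for PA's load manager (all names are illustrative).
class LoadManagerSketch {
 public:
  explicit LoadManagerSketch(bool using_shared_memory)
      : using_shared_memory_(using_shared_memory) {}

  void OnRequestSent() { ++in_flight_; }

  void OnRequestDone() {
    assert(in_flight_.load() > 0);
    --in_flight_;
  }

  // Between steps of a sweep: always wait, so that no request from one
  // experiment carries over into the next.
  void FinishExperiment() { WaitForInFlight(); }

  // At final exit: wait only in the shared-memory case. Otherwise return
  // immediately. The return value is the number of abandoned requests.
  std::size_t Teardown() {
    if (using_shared_memory_) {
      WaitForInFlight();
    }
    return in_flight_.load();
  }

 private:
  void WaitForInFlight() {
    while (in_flight_.load() > 0) {
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
  }

  const bool using_shared_memory_;  // remembered from construction
  std::atomic<std::size_t> in_flight_{0};
};
```

In this sketch, `Teardown()` on a non-shared-memory manager with two outstanding requests returns 2 right away, while a shared-memory manager blocks until the count reaches zero.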

An important caveat: running PA back to back with no pause may produce lower results on the second run, because the server may still be draining the abandoned requests from the first run.

Here is a before/after. Note that both of these include a change (that won't be part of this story) to make sure that request-rate can actually issue the requested rate.

Before:

After:

dyastremsky · 2024-05-31T17:09:40Z

src/c++/perf_analyzer/client_backend/mock_client_backend.h

+  {
+    // Make sure no requests carry over to the next test
+    while (stats_->num_active_infer_calls) {
+      std::this_thread::sleep_for(std::chrono::milliseconds(1));


Do you want a configurable timeout? Otherwise, it's going to be hard to know why the test times out for someone seeing it fail for that reason.
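One way the timeout suggestion could look, as a sketch only: the helper below is hypothetical (`WaitForActiveCalls` is not part of the PR), and it takes the active-call counter as a `std::atomic` parameter rather than reading `stats_->num_active_infer_calls` directly. It polls the same way as the loop in the diff but gives up at a deadline, so a test can report why it stopped waiting.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>

// Hypothetical helper: waits until `num_active` drops to zero or `timeout`
// elapses. Returns true if all calls finished, false if the wait timed out.
inline bool WaitForActiveCalls(
    const std::atomic<std::size_t>& num_active,
    std::chrono::milliseconds timeout,
    std::chrono::milliseconds poll_interval = std::chrono::milliseconds(1)) {
  assert(poll_interval.count() > 0);
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (num_active.load() > 0) {
    if (std::chrono::steady_clock::now() >= deadline) {
      return false;  // caller can log or fail with an explicit message
    }
    std::this_thread::sleep_for(poll_interval);
  }
  return true;
}
```

A caller could then fail with a specific message when the function returns false, instead of leaving the test to hit a generic harness timeout.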

matthewkotila

Submitting part-way through to minimize emails to you but still give you something to respond to, because I couldn't finish the review in one sitting.

matthewkotila · 2024-06-04T23:34:33Z

src/c++/perf_analyzer/client_backend/mock_client_backend.h

The first run is 1m54s:

The second run is 24s.

Is the additional time for the first run mainly after the last pass? Or is it distributed among the passes?

Yes, it was 100% after the final results were printed. The old behavior is that it just sat there.

matthewkotila · 2024-06-04T23:36:02Z

src/c++/perf_analyzer/client_backend/mock_client_backend.h

In the future, can you suggest an order of files to review along with some line-specific PR comments explaining at a high level what the changes are for?

matthewkotila · 2024-06-04T23:38:26Z

src/c++/perf_analyzer/client_backend/mock_client_backend.h

One high-level question that crossed my mind: do we want to just drop requests, as this PR implements?

Maybe we can instead add a print statement explaining that profiling is complete, but that PA is waiting for unfinished requests and that the user can press ^C to exit early if they don't mind that the server is potentially still processing those requests.

Longer term, what we need is a proper way to cancel requests. Short term, this is holding up some GenAI-Perf scripting, where they run PA with different request-rates in a bash script. ^C isn't going to help them.

tgerdesnv force-pushed the tgerdes-faster-teardown branch 2 times, most recently from 1c55334 to 3604a7d Compare May 29, 2024 16:43

tgerdesnv requested a review from matthewkotila May 31, 2024 15:11

tgerdesnv marked this pull request as ready for review May 31, 2024 15:11

dyastremsky reviewed May 31, 2024

matthewkotila reviewed Jun 5, 2024

tgerdesnv added 3 commits June 5, 2024 08:26

don't flush all requests at end of PA

e15b255

Only fast exit for non-shm cases

2ca6e64

comments

bee15ad

tgerdesnv force-pushed the tgerdes-faster-teardown branch from 7f3bbfa to bee15ad Compare June 5, 2024 13:27
