Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix state transitions for re-running requests #251

Merged
merged 6 commits into from
Sep 6, 2023
Merged

Conversation

rmccorm4
Copy link
Contributor

@rmccorm4 rmccorm4 commented Aug 31, 2023

Fixes L0_simple_lib--base, and allows requests to be re-used for multiple inferences.

Adds a sanity check that pending count isn't incremented multiple times without getting decremented in some edge case where a request is PENDING, fails to get to the PENDING state, and ends up in PENDING state again (re-used).

General logic:

  1. When a request is created, it is in INITIALIZED state
  2. When a request is run, it is set to PENDING state
  3. If a request makes it to the backend, it should be in EXECUTING state
    • At this point, the backend takes ownership of the request, and is responsible to call RequestRelease() on it, moving request into RELEASED state
  4. If any error occurs before getting to backend, it should hit InferenceRequest::RespondIfError and be set to RELEASED state.
  5. If somehow a request didn't make it to backend (EXECUTING state), but also didn't get RELEASED, then it should still be in PENDING state. If the request is to be re-used for another inference, it should call PrepareForInference() which should set state back to INITIALIZED at the start of the cycle.

If a request were to be transitioned from RELEASED to PENDING/EXECUTING directly, this should indicate a logical error as the request should only execute via InferAsync entrypoint, which calls PrepareForInference ([re]sets to INITIALIZED state) then InferenceRequest::Run (sets to PENDING state).

@rmccorm4 rmccorm4 requested review from nnshah1 and kthui August 31, 2023 22:03
kthui
kthui previously approved these changes Sep 1, 2023
src/infer_request.cc Outdated Show resolved Hide resolved
src/infer_request.cc Outdated Show resolved Hide resolved
src/infer_request.cc Outdated Show resolved Hide resolved
@rmccorm4 rmccorm4 changed the title Allow transitioning from RELEASED to STARTED state for re-using requests Fix state transitions for re-running requests Sep 6, 2023
@rmccorm4 rmccorm4 merged commit 1dcf5bb into main Sep 6, 2023
1 check passed
@rmccorm4 rmccorm4 deleted the rmccormick-state branch September 6, 2023 15:33
Tabrizian added a commit that referenced this pull request Sep 7, 2023
* Fix state transitions for re-running requests (#251)

* Add backend/server APIs

* Implement the cancellation APIs

* Only store the state in response factory

* Add unit testing for request cancellation

* Add test

* Add cancellation status

* Add testing for cancelling a request after release

* Handle request re-use

* Enable request reuse test

* Add staged changes

* Add temporary fix for the request state bug

---------

Co-authored-by: Ryan McCormick <[email protected]>
Tabrizian added a commit that referenced this pull request Sep 13, 2023
* Fix state transitions for re-running requests (#251)

* Add backend/server APIs

* Implement the cancellation APIs

* Only store the state in response factory

* Add unit testing for request cancellation

* Add test

* Add cancellation status

* Add testing for cancelling a request after release

* Handle request re-use

* Enable request reuse test

* Add staged changes

* Add temporary fix for the request state bug

---------

Co-authored-by: Ryan McCormick <[email protected]>
Tabrizian added a commit that referenced this pull request Sep 13, 2023
* Fix state transitions for re-running requests (#251)

* Add backend/server APIs

* Implement the cancellation APIs

* Only store the state in response factory

* Add unit testing for request cancellation

* Add test

* Add cancellation status

* Add testing for cancelling a request after release

* Handle request re-use

* Enable request reuse test

* Add staged changes

* Add temporary fix for the request state bug

---------

Co-authored-by: Ryan McCormick <[email protected]>
Tabrizian added a commit that referenced this pull request Sep 13, 2023
* Add inference request cancellation APIs (#249)

* Fix state transitions for re-running requests (#251)

* Add backend/server APIs

* Implement the cancellation APIs

* Only store the state in response factory

* Add unit testing for request cancellation

* Add test

* Add cancellation status

* Add testing for cancelling a request after release

* Handle request re-use

* Enable request reuse test

* Add staged changes

* Add temporary fix for the request state bug

---------

Co-authored-by: Ryan McCormick <[email protected]>

* Fix request re-use when cancelling a request

* Review edit

* Fix warmup request

* Fix null requests

* Review edit

---------

Co-authored-by: Ryan McCormick <[email protected]>
tanmayv25 added a commit that referenced this pull request Oct 4, 2023
* Add inference request cancellation APIs (#249)

* Fix state transitions for re-running requests (#251)

* Add backend/server APIs

* Implement the cancellation APIs

* Only store the state in response factory

* Add unit testing for request cancellation

* Add test

* Add cancellation status

* Add testing for cancelling a request after release

* Handle request re-use

* Enable request reuse test

* Add staged changes

* Add temporary fix for the request state bug

---------

Co-authored-by: Ryan McCormick <[email protected]>

* Fix request re-use when cancelling a request (#253)

* Add inference request cancellation APIs (#249)

* Fix state transitions for re-running requests (#251)

* Add backend/server APIs

* Implement the cancellation APIs

* Only store the state in response factory

* Add unit testing for request cancellation

* Add test

* Add cancellation status

* Add testing for cancelling a request after release

* Handle request re-use

* Enable request reuse test

* Add staged changes

* Add temporary fix for the request state bug

---------

Co-authored-by: Ryan McCormick <[email protected]>

* Fix request re-use when cancelling a request

* Review edit

* Fix warmup request

* Fix null requests

* Review edit

---------

Co-authored-by: Ryan McCormick <[email protected]>

* Dynamic batch scheduler request cancellation (#257)

* Add adapter for IsCancelled()

* Move cancel functions outside TRITON_ENABLE_STATS

* Add request cancellation to dynamic batch scheduler

* Refactor cancelled requests to use rejected requests routine

* Remove shared pointer wrapper for rejected and cancelled requests

* Ensemble scheduler request cancellation (#263)

* Add adapter for IsCancelled()

* Move cancel functions outside TRITON_ENABLE_STATS

* Add request cancellation to ensemble scheduler

* Sequence batch scheduler request cancellation (#260)

* Add request cancellation to sequence batch scheduler

* Add request cancellation to in-flight sequences

* Refactor on how a request is cancelled

* Fix issue when request is timeout dummy

* Always mark request cancelled before cancelling

* Mark immutable static status as const

---------

Co-authored-by: Iman Tabrizian <[email protected]>
Co-authored-by: Ryan McCormick <[email protected]>
Co-authored-by: Jacky <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants