Calculate response throughput metric #356

Merged

Conversation

matthewkotila
Contributor

@matthewkotila matthewkotila commented Jul 10, 2023

  • Skips null response latency recording
  • Calculates responses sent per second in InferenceProfiler
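The two changes above can be sketched as follows. This is only an illustration under stated assumptions, not the code merged in this PR: `RequestRecord`, `RecordResponse`, and `ResponseThroughput` are hypothetical names, and the measurement window is assumed to be given as a pair of nanosecond timestamps.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of one request (not the actual perf_analyzer type).
struct RequestRecord {
  // Arrival times, in nanoseconds, of each non-null response.
  std::vector<uint64_t> response_ns;
};

// Hypothetical helper: store a response timestamp unless the response is null.
inline void RecordResponse(
    RequestRecord& record, uint64_t timestamp_ns, bool is_null_response)
{
  if (is_null_response) {
    return;  // Skip null responses so they do not count toward latency.
  }
  record.response_ns.push_back(timestamp_ns);
}

// Hypothetical helper: responses per second inside [start_ns, end_ns].
inline double ResponseThroughput(
    const std::vector<RequestRecord>& records, uint64_t start_ns,
    uint64_t end_ns)
{
  if (end_ns <= start_ns) {
    return 0.0;  // Empty or invalid window: avoid dividing by zero.
  }
  std::size_t count = 0;
  for (const auto& record : records) {
    for (uint64_t t : record.response_ns) {
      if (t >= start_ns && t <= end_ns) {
        ++count;
      }
    }
  }
  const double seconds = static_cast<double>(end_ns - start_ns) / 1e9;
  return static_cast<double>(count) / seconds;
}
```

With a decoupled model, one request can yield several responses, so a count of this kind can differ from the request throughput.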

@matthewkotila
Contributor Author

Will add testing later

Contributor

@debermudez debermudez left a comment

Does is_null_response_ need locking, or are we okay because it is part of the backend, of which each inference has its own copy?

src/c++/library/grpc_client.cc (resolved)
src/c++/library/http_client.cc (resolved)
src/c++/perf_analyzer/inference_profiler.cc (outdated, resolved)
@matthewkotila matthewkotila force-pushed the decoupled-model-support branch 3 times, most recently from b33df47 to 86daba9 Compare July 19, 2023 01:33
@matthewkotila matthewkotila force-pushed the matthewkotila-response-throughput branch from e7565d7 to 26fcc82 Compare July 19, 2023 01:46
@matthewkotila matthewkotila marked this pull request as ready for review July 19, 2023 01:46
@matthewkotila
Contributor Author

@debermudez: Does is_null_response_ need locking, or are we okay because it is part of the backend, of which each inference has its own copy?

As far as I am aware, that value is never accessed from multiple threads.

@matthewkotila matthewkotila force-pushed the matthewkotila-response-throughput branch from 4a9171a to 3642c0a Compare July 20, 2023 16:56
@matthewkotila
Contributor Author

Unit tests passed locally on the last commit. There is no need to run the CI, since the code change was confined to the unit tests.

@matthewkotila matthewkotila merged commit 6847b53 into decoupled-model-support Jul 21, 2023
@matthewkotila matthewkotila deleted the matthewkotila-response-throughput branch July 21, 2023 18:51
@matthewkotila matthewkotila removed the request for review from debermudez July 21, 2023 18:51
matthewkotila added a commit that referenced this pull request Jul 25, 2023
* Calculate response throughput metric

* Address feedback

* Cleanup
matthewkotila added a commit that referenced this pull request Jul 27, 2023
* Calculate response throughput metric

* Address feedback

* Cleanup
matthewkotila added a commit that referenced this pull request Jul 28, 2023
* Calculate response throughput metric

* Address feedback

* Cleanup
matthewkotila added a commit that referenced this pull request Jul 28, 2023
* Calculate response throughput metric

* Address feedback

* Cleanup
matthewkotila added a commit that referenced this pull request Aug 8, 2023
* Calculate response throughput metric

* Address feedback

* Cleanup
matthewkotila added a commit that referenced this pull request Aug 11, 2023
* Calculate response throughput metric

* Address feedback

* Cleanup
matthewkotila added a commit that referenced this pull request Aug 15, 2023
* Calculate response throughput metric

* Address feedback

* Cleanup
matthewkotila added a commit that referenced this pull request Aug 15, 2023
* Update time stamp vector to hold a list of response times (#347)

* WIP change to timestamp vector

* Update testing to use new vector for end times

* Fix lambda and tag todo with a ticket number

* Fix iterator validation

* Update PA async callback to only run with final response (#351)

* Update PA async callback to only run with final response

* Address feedback

* Address feedback

* Address feedback

* Fix bug

* Calculate response throughput metric (#356)

* Calculate response throughput metric

* Address feedback

* Cleanup

* add json schema for output file generation (#364)

* Add json schema for output file and examples

* Add newlines at end of files

* Update json type to correct keyword

* Update schema to fix hierarchy and address feedback

* Add json file with known error for testing

* Move example files to the docs directory

* Remove some example json files

* Update schema to use integer and remove uniqueness from timestamps

* Add CLI option for profile export file (#369)

* Add CLI option for profile export file

* Address feedback

* Store sequence ID in timestamps tuple object (#367)

* Store sequence ID in timestamps tuple object

* Fix bug and address feedback

* Address feedback

* Address feedback

* Output response throughput metric to stdout and csv (#373)

* Add missing --profile-export-file help description and cli docs description (#374)

* create new json file output (#375)

* Initial commit to gather all experiment data

* Update response_timestamps variable name

* Revert erase_indices deletion

* Add reporter class to create json output

* Add comments to public methods

* Add file and stdout output

* Copy valid request records into RawDataCollector

* Plumb file path to reporter

* Connect collector to reporter

* Add json data to top level object

* Add experiments value to document

* Update json ints to uint64 and fix entry addition

* Print seq id only when non zero

* Moving schema to test repo

* Only write file if specified

* Address feedback

* Address feedback

---------

Co-authored-by: Matthew Kotila <[email protected]>
Co-authored-by: tgerdes <[email protected]>

* Fix bug where null responses were accidentally excluded from profile export (#379)

* Fix bug where null responses were accidentally excluded from profile export

* Fix bug

* Address feedback

* Add unit tests for ProfileDataCollector and ProfileDataExporter classes (#380)

* Add initial mock class and test file for ProfileDataCollector

* Mock FindExperiment method

* Add test for FindExperiment and fix a bug in the function

* Add tests for AddData and AddWindow in ProfileDataCollector

* Add a skeleton unit test code for ProfileDataExporter

* Mock ConvertToJson

* Add subcase for ConvertToJson method

* Add subcase for OutputToFile method

* Add subcase for AddExperiment method

* Split tests into multiple test cases

* Set request rate as double in json

* Address feedback

---------

Co-authored-by: Elias Bermudez <[email protected]>
Co-authored-by: tgerdes <[email protected]>
Co-authored-by: Hyunjae Woo <[email protected]>