Failing with Generic Error message: Failed to obtain stable measurement. #777
@Kanupriyagoyal Can you please provide complete reproduction steps, including the Triton SDK container version and your server setup instructions, so we can reproduce the issue you are observing?
Using perf_analyzer built from source.
Server: r24.07, with multiple models loaded (different data types, batched and non-batched) to check which data types perf_analyzer supports.
I0814 14:12:56.254085 68896 server.cc:674]
I also need a check on the following:
If I want to pass specific data to perf_analyzer using --input-data, could you give an example of how to create the JSON? I want a specific JSON to be used by perf_analyzer without round-robin.
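For reference, a minimal sketch of an --input-data JSON file, assuming a single input tensor named INPUT0 of shape [4] (the tensor name, values, and shape are placeholders and must match your model config):

```json
{
  "data": [
    {
      "INPUT0": {
        "content": [1.0, 2.0, 3.0, 4.0],
        "shape": [4]
      }
    }
  ]
}
```

Each object under "data" is one request's worth of inputs, and perf_analyzer cycles through the entries in order, so a file with a single entry sends the same data on every request. It is passed with perf_analyzer -m <model> --input-data input.json.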
@Kanupriyagoyal I think this is a reasonable error from the tool. The latencies, and thus the throughputs, are bouncing all over the place. I need to investigate your follow-up question more. For now, you can get unblocked by loading only a couple of models per server instance to profile.
@debermudez I did load and unload the models in explicit mode, so now only one model is loaded at a time. I am still getting these issues.
But when I restart the server and load the model, it works.
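For context, this is roughly what explicit-mode load/unload looks like through Triton's model repository HTTP API; a sketch assuming the server runs with --model-control-mode=explicit and its HTTP endpoint is on localhost:8000:

```shell
# Load one model into the running server
curl -X POST localhost:8000/v2/repository/models/model_equals_b_uint8/load

# ... profile it with perf_analyzer ...

# Unload it before loading the next model
curl -X POST localhost:8000/v2/repository/models/model_equals_b_uint8/unload
```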
@Kanupriyagoyal You need to match the input name, type, and shape. Could you share the model config of the model you want to query?
Model config.pbtxt:
input_gbm.json passed:
Running perf_analyzer:
@Kanupriyagoyal Pinged another teammate to investigate more. Part of the issue is the format of the content.
@Kanupriyagoyal any luck?
Yes, by flattening the data into a row-major format it worked for float data.
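To illustrate what "flattened in a row-major format" means here: a hypothetical 2x3 FP32 input named INPUT0 is written as a single flat array, rows first, with the original dimensions carried in "shape":

```json
{
  "data": [
    {
      "INPUT0": {
        "content": [1.1, 1.2, 1.3, 2.1, 2.2, 2.3],
        "shape": [2, 3]
      }
    }
  ]
}
```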
@debermudez @nv-hwoo would you please suggest how to format the bytes data? Any example?
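Not confirmed in this thread, but per perf_analyzer's documented input-data format, BYTES tensors are supplied as JSON strings in "content"; a sketch with a placeholder input name:

```json
{
  "data": [
    {
      "INPUT0": {
        "content": ["hello world"],
        "shape": [1]
      }
    }
  ]
}
```

Raw binary that is not valid text can alternatively be given base64-encoded, e.g. "INPUT0": {"b64": "aGVsbG8gd29ybGQ="}.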
I am testing on basic models. Each model takes an input and returns the same output with the same datatype.
Inference is happening:
2024-08-20 09:35:15,923 - INFO - array_final: array([[103]], dtype=uint8)
array_final: [[103]]
perf_analyzer -m model_equals_b_uint8 --measurement-mode count_windows --measurement-request-count 5 -v
*** Measurement Settings ***
Batch size: 1
Service Kind: TRITON
Using "count_windows" mode for stabilization
Stabilizing using average latency and throughput
Minimum number of samples in each window: 5
Using synchronous calls for inference
Request concurrency: 1
Pass [1] throughput: 478.67 infer/sec. Avg latency: 2059 usec (std 3372 usec).
Pass [2] throughput: 621.713 infer/sec. Avg latency: 1625 usec (std 3008 usec).
Pass [3] throughput: 491.884 infer/sec. Avg latency: 2027 usec (std 13098 usec).
Pass [4] throughput: 18.0441 infer/sec. Avg latency: 54594 usec (std 80657 usec).
Pass [5] throughput: 16.6456 infer/sec. Avg latency: 61007 usec (std 68590 usec).
Pass [6] throughput: 62.8963 infer/sec. Avg latency: 5822 usec (std 7896 usec).
Pass [7] throughput: 16.871 infer/sec. Avg latency: 60256 usec (std 112842 usec).
Pass [8] throughput: 15.6989 infer/sec. Avg latency: 63212 usec (std 110034 usec).
Pass [9] throughput: 15.1902 infer/sec. Avg latency: 65797 usec (std 87972 usec).
Pass [10] throughput: 14.0266 infer/sec. Avg latency: 72140 usec (std 93986 usec).
Failed to obtain stable measurement within 10 measurement windows for concurrency 1. Please try to increase the --measurement-request-count.
Failed to obtain stable measurement.
perf_analyzer -m model_equals_b_uint8 --measurement-mode count_windows --measurement-request-count 50 -v
*** Measurement Settings ***
Batch size: 1
Service Kind: TRITON
Using "count_windows" mode for stabilization
Stabilizing using average latency and throughput
Minimum number of samples in each window: 50
Using synchronous calls for inference
Request concurrency: 1
Pass [1] throughput: 23.4639 infer/sec. Avg latency: 42614 usec (std 182802 usec).
Pass [2] throughput: 141.78 infer/sec. Avg latency: 3437 usec (std 5377 usec).
Pass [3] throughput: 14.8405 infer/sec. Avg latency: 67552 usec (std 97666 usec).
Pass [4] throughput: 12.2003 infer/sec. Avg latency: 82423 usec (std 75027 usec).
Pass [5] throughput: 14.2399 infer/sec. Avg latency: 70712 usec (std 120651 usec).
Pass [6] throughput: 86.8397 infer/sec. Avg latency: 2083 usec (std 2502 usec).
Pass [7] throughput: 22.6803 infer/sec. Avg latency: 45020 usec (std 178493 usec).
Pass [8] throughput: 17.8704 infer/sec. Avg latency: 56233 usec (std 175833 usec).
Pass [9] throughput: 23.0646 infer/sec. Avg latency: 43166 usec (std 148978 usec).
Pass [10] throughput: 18.234 infer/sec. Avg latency: 55330 usec (std 102755 usec).
Failed to obtain stable measurement within 10 measurement windows for concurrency 1. Please try to increase the --measurement-request-count.
Failed to obtain stable measurement.
perf_analyzer -m model_equals_b_uint8 --measurement-mode count_windows --measurement-request-count 75 -v
*** Measurement Settings ***
Batch size: 1
Service Kind: TRITON
Using "count_windows" mode for stabilization
Stabilizing using average latency and throughput
Minimum number of samples in each window: 75
Using synchronous calls for inference
Request concurrency: 1
Pass [1] throughput: 428.863 infer/sec. Avg latency: 2328 usec (std 3510 usec).
Pass [2] throughput: 494.642 infer/sec. Avg latency: 2018 usec (std 3441 usec).
Pass [3] throughput: 308.695 infer/sec. Avg latency: 3156 usec (std 13751 usec).
Pass [4] throughput: 340.429 infer/sec. Avg latency: 1828 usec (std 3966 usec).
Pass [5] throughput: 21.0775 infer/sec. Avg latency: 47814 usec (std 168738 usec).
Pass [6] throughput: 18.7684 infer/sec. Avg latency: 53730 usec (std 65595 usec).
Pass [7] throughput: 16.0608 infer/sec. Avg latency: 62265 usec (std 63152 usec).
Pass [8] throughput: 3.68812 infer/sec. Avg latency: 271139 usec (std 363750 usec).
Pass [9] throughput: 203.656 infer/sec. Avg latency: 4908 usec (std 6825 usec).
Pass [10] throughput: 214.693 infer/sec. Avg latency: 2469 usec (std 3830 usec).
Failed to obtain stable measurement within 10 measurement windows for concurrency 1. Please try to increase the --measurement-request-count.
Failed to obtain stable measurement.
perf_analyzer -m model_equals_b_uint8 --measurement-mode count_windows --measurement-request-count 100 -v
*** Measurement Settings ***
Batch size: 1
Service Kind: TRITON
Using "count_windows" mode for stabilization
Stabilizing using average latency and throughput
Minimum number of samples in each window: 100
Using synchronous calls for inference
Request concurrency: 1
Pass [1] throughput: 423.137 infer/sec. Avg latency: 2331 usec (std 2866 usec).
Pass [2] throughput: 99.6489 infer/sec. Avg latency: 10037 usec (std 135019 usec).
Pass [3] throughput: 253.617 infer/sec. Avg latency: 1639 usec (std 1605 usec).
Pass [4] throughput: 16.316 infer/sec. Avg latency: 62273 usec (std 161047 usec).
Pass [5] throughput: 22.5236 infer/sec. Avg latency: 44084 usec (std 143282 usec).
Pass [6] throughput: 13.3747 infer/sec. Avg latency: 75319 usec (std 81540 usec).
Pass [7] throughput: 15.3824 infer/sec. Avg latency: 65006 usec (std 130209 usec).
Pass [8] throughput: 2.24593 infer/sec. Avg latency: 445246 usec (std 205477 usec).
Pass [9] throughput: 145.757 infer/sec. Avg latency: 2459 usec (std 4845 usec).
Pass [10] throughput: 15.9015 infer/sec. Avg latency: 63902 usec (std 89986 usec).
Failed to obtain stable measurement within 10 measurement windows for concurrency 1. Please try to increase the --measurement-request-count.
Failed to obtain stable measurement.
perf_analyzer -b 1 -m model_equals_b_uint16 -v
*** Measurement Settings ***
Batch size: 1
Service Kind: TRITON
Using "time_windows" mode for stabilization
Stabilizing using average latency and throughput
Measurement window: 5000 msec
Using synchronous calls for inference
Request concurrency: 1
Pass [1] throughput: 264.819 infer/sec. Avg latency: 2428 usec (std 19166 usec).
Pass [2] throughput: 19.249 infer/sec. Avg latency: 45776 usec (std 59715 usec).
Pass [3] throughput: 11.4458 infer/sec. Avg latency: 87830 usec (std 55669 usec).
Pass [4] throughput: 13.7479 infer/sec. Avg latency: 73070 usec (std 177674 usec).
Pass [5] throughput: 16.5643 infer/sec. Avg latency: 59318 usec (std 166888 usec).
Pass [6] throughput: 11.5103 infer/sec. Avg latency: 86986 usec (std 188720 usec).
Pass [7] throughput: 32.5302 infer/sec. Avg latency: 31859 usec (std 184371 usec).
Pass [8] throughput: 23.3457 infer/sec. Avg latency: 42082 usec (std 186189 usec).
Pass [9] throughput: 14.2139 infer/sec. Avg latency: 70781 usec (std 194576 usec).
Pass [10] throughput: 14.5149 infer/sec. Avg latency: 68353 usec (std 190451 usec).
Failed to obtain stable measurement within 10 measurement windows for concurrency 1. Please try to increase the --measurement-interval.
Failed to obtain stable measurement.
Every time I am getting the same generic error message, even though I keep increasing --measurement-request-count.
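Besides --measurement-request-count, perf_analyzer also exposes a stability threshold that may help when latencies are genuinely noisy; a sketch (the 20% value is an arbitrary example; the default allowed variation is 10%):

```shell
perf_analyzer -m model_equals_b_uint8 \
    --measurement-mode count_windows \
    --measurement-request-count 100 \
    --stability-percentage 20
```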