Patch SGL Benchmark Test for Pytest Dashboard (#551)
# Description

The nightly SGLang benchmark tests had their first successful run last night: https://github.com/nod-ai/shark-ai/actions/runs/11850084805/job/33024395622

The results were also uploaded to the dashboard successfully: https://nod-ai.github.io/shark-ai/llm/sglang/?sort=result

However, since I used a mock to pipe the `bench_serving` script output to `logger.info`, the results appeared in the runner log but not in the dashboard:

```text
============ Serving Benchmark Result ============
INFO __name__:mock.py:1189 Backend: shortfin
INFO __name__:mock.py:1189 Traffic request rate: 4
INFO __name__:mock.py:1189 Successful requests: 10
INFO __name__:mock.py:1189 Benchmark duration (s): 716.95
INFO __name__:mock.py:1189 Total input tokens: 1960
INFO __name__:mock.py:1189 Total generated tokens: 2774
INFO __name__:mock.py:1189 Total generated tokens (retokenized): 291
INFO __name__:mock.py:1189 Request throughput (req/s): 0.01
INFO __name__:mock.py:1189 Input token throughput (tok/s): 2.73
INFO __name__:mock.py:1189 Output token throughput (tok/s): 3.87
INFO __name__:mock.py:1189 ----------------End-to-End Latency----------------
INFO __name__:mock.py:1189 Mean E2E Latency (ms): 549509.25
INFO __name__:mock.py:1189 Median E2E Latency (ms): 578828.23
INFO __name__:mock.py:1189 ---------------Time to First Token----------------
INFO __name__:mock.py:1189 Mean TTFT (ms): 327289.54
INFO __name__:mock.py:1189 Median TTFT (ms): 367482.31
INFO __name__:mock.py:1189 P99 TTFT (ms): 367972.81
INFO __name__:mock.py:1189 -----Time per Output Token (excl. 1st token)------
INFO __name__:mock.py:1189 Mean TPOT (ms): 939.35
INFO __name__:mock.py:1189 Median TPOT (ms): 886.13
INFO __name__:mock.py:1189 P99 TPOT (ms): 2315.83
INFO __name__:mock.py:1189 ---------------Inter-token Latency----------------
INFO __name__:mock.py:1189 Mean ITL (ms): 732.59
INFO __name__:mock.py:1189 Median ITL (ms): 729.43
INFO __name__:mock.py:1189 P99 ITL (ms): 1477.77
INFO __name__:mock.py:1189 ==================================================
```

The test also had a small bug that was obfuscated in the runner/terminal logs but surfaced in the dashboard: it prevented the jsonl files from being generated after the benchmark results were collected.

By fixing the bug in the `bench_serving` input args and logging the resulting jsonl file after each run, I was able to verify locally that the output html contains the proper results:
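For reference, below is a minimal sketch of the two pieces described above: redirecting the benchmark's printed output to `logger.info` through a mock, and logging the resulting jsonl file after each run so it is visible in the test logs. The helper name, callable, and result path are illustrative assumptions, not the exact code in this patch.

```python
import json
import logging
from pathlib import Path
from unittest.mock import patch

logger = logging.getLogger(__name__)


def run_with_logged_output(run_benchmark, result_file: Path) -> None:
    """Run a benchmark callable with its print output piped to logger.info,
    then log the jsonl file it wrote. Names here are hypothetical."""
    # Redirect bare print() calls made inside the benchmark to logger.info.
    with patch(
        "builtins.print",
        side_effect=lambda *args, **kwargs: logger.info(" ".join(map(str, args))),
    ):
        run_benchmark()

    # Log each record from the resulting jsonl file so missing or malformed
    # output is obvious in the runner logs, not just in the dashboard.
    if not result_file.exists():
        logger.error("Expected benchmark output missing: %s", result_file)
        return
    for line in result_file.read_text().splitlines():
        if line.strip():
            logger.info("Benchmark result: %s", json.dumps(json.loads(line), indent=2))
```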