Set up data for ui - WIP #87
base: main
Conversation
```python
def output_token_throughput_distribution(self) -> Distribution:
    """
    Get the distribution for output token throughput.

    :return: The distribution of output token throughput.
    :rtype: Distribution
    """
    throughputs = []
    for r in self.results:
        duration = (r.end_time or 0) - (r.start_time or 0)
        if duration > 0:
            throughputs.append(r.output_token_count / duration)

    return Distribution(data=throughputs)
```
The UI relies on the output throughput distribution, and I didn't find any existing methods/properties in the tokens-per-unit-of-time shape the UI expects, so I added this.
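To make the calculation above concrete, here is a self-contained sketch of the per-request throughput logic, using hypothetical stand-ins (`FakeResult`, `output_token_throughputs`) for guidellm's actual result and `Distribution` types:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FakeResult:
    # Hypothetical stand-in for a guidellm text-generation result.
    start_time: Optional[float]
    end_time: Optional[float]
    output_token_count: int


def output_token_throughputs(results: List[FakeResult]) -> List[float]:
    """Compute tokens/second for each request with valid timing data."""
    throughputs = []
    for r in results:
        duration = (r.end_time or 0) - (r.start_time or 0)
        if duration > 0:  # skip requests with missing or zero-length timing
            throughputs.append(r.output_token_count / duration)
    return throughputs


results = [
    FakeResult(start_time=0.0, end_time=2.0, output_token_count=100),  # 50 tok/s
    FakeResult(start_time=1.0, end_time=1.0, output_token_count=40),   # skipped
]
print(output_token_throughputs(results))  # [50.0]
```

Note that requests with missing timestamps are silently dropped rather than contributing zero, which keeps the distribution from being skewed by incomplete results.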
src/guidellm/main.py
```python
generate_ui_api_data(report)
```
This is just so I can run it easily and inspect the generated JSON.
```python
bucket_width = dist.range / n_buckets
bucket_counts = [0] * n_buckets

for val in dist.data:
    idx = int((val - minv) // bucket_width)
    if idx == n_buckets:
        idx = n_buckets - 1
    bucket_counts[idx] += 1

buckets = []
for i, count in enumerate(bucket_counts):
    bucket_start = minv + i * bucket_width
    buckets.append({
        "value": bucket_start,
        "count": count
    })
```
I am not sure this is the proper way to generate these buckets, or whether there's existing code elsewhere in guidellm that could handle it and I missed it.
This code assumes a fixed number of buckets and derives the bucket width from that. It's a hard-coded approach, and some data analysis first might suggest a better bucket count or bucket size. But generally I figured the UI would look best with a fixed number of buckets, so our histograms conveniently look the same and take up a comfortable amount of space.
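The diff above depends on `dist`, `minv`, and `n_buckets` from surrounding code; here is a self-contained sketch of the same fixed-bucket-count approach (the `make_buckets` wrapper is mine, not guidellm's):

```python
def make_buckets(data, n_buckets=5):
    """Split data into n_buckets equal-width histogram buckets.

    Returns a list of {"value": bucket_start, "count": n} dicts,
    matching the shape built in the diff above.
    """
    minv, maxv = min(data), max(data)
    bucket_width = (maxv - minv) / n_buckets
    bucket_counts = [0] * n_buckets

    for val in data:
        idx = int((val - minv) // bucket_width)
        if idx == n_buckets:  # the max value lands one past the last bucket
            idx = n_buckets - 1
        bucket_counts[idx] += 1

    return [
        {"value": minv + i * bucket_width, "count": count}
        for i, count in enumerate(bucket_counts)
    ]


print(make_buckets([1, 2, 3, 4, 5, 5], n_buckets=2))
# [{'value': 1.0, 'count': 2}, {'value': 3.0, 'count': 4}]
```

One caveat with this shape: if all values are equal, `bucket_width` is zero and the index computation divides by zero, so a guard for degenerate distributions (or delegating to something like `numpy.histogram`, which handles this) may be worth considering.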
```python
with open("ben_test/run_info.json", "w") as f:
    json.dump(run_info_json, f, indent=2)
with open("ben_test/workload_details.json", "w") as f:
    json.dump(workload_details_json, f, indent=2)
with open("ben_test/benchmarks.json", "w") as f:
    json.dump(benchmarks_json, f, indent=2)
```
This is just for testing purposes, to view the generated JSON.
Here's what the data looks like currently. I converted the .js files to .txt so I could attach them here.
…f request over time data and use raw, refactor and test interpolation functionality
…ector and output html report
This is a rough first pass at my understanding of how guidellm data can populate the UI. It isn't tied into the HTML injection functionality that will serve the UI (for now I'm just generating JSON that I manually drop into the UI), nor is it set up as the basis of the API that will eventually serve the guidellm UI, which is where this logic would likely live in the future. The calculations, however, are meant to be 100% correct and accurate.
I'll add more tests when I finish the frontend work, to help walk through what this is trying to achieve. For now, attention is most needed on the calculations for benchmark metrics, prompt/output token metrics, requests over time, etc., which are used to build the histograms and line charts in the UI.
I've attached the data generated from this code in the files below. The model I've used so far (microsoft/DialoGPT-small) isn't a great chat model as far as I know, so it probably isn't ideal data, but it performs a little better running on my Mac due to its small size, which I think will make for more realistic-looking plots on the charts. I'll need a realistic setup to produce better data soon.
I'm not trying to get this merged urgently; there's a bit more UI work to do before this is useful. But I'll be pestering people for more reviews soon.