Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix trailing comma in CSV for verbose_csv and collect_metrics modes #385

Merged
merged 1 commit into from
Aug 31, 2023

Conversation

rmccorm4
Copy link
Contributor

@rmccorm4 rmccorm4 commented Aug 30, 2023

Overview

There are a few places where trailing commas are left depending on config flags:

Test/verification

Ideally there should be some unit tests to validate the produced CSVs are actually valid CSV files (no trailing delimiters), but verified locally here:

# Gather CSVs for different combinations of flags
./perf_analyzer -m id --shape INPUT0:1,1 --concurrency 1:2 -f report.csv
./perf_analyzer -m id --shape INPUT0:1,1 --concurrency 1:2 -f report_metrics.csv --collect-metrics
./perf_analyzer -m id --shape INPUT0:1,1 --concurrency 1:2 -f report_verbose.csv --verbose-csv
./perf_analyzer -m id --shape INPUT0:1,1 --concurrency 1:2 -f report_verbose_metrics.csv --verbose-csv --collect-metrics

# Verify CSVs set concurrency column correctly instead of getting shifted over
CSVS="report.csv report_metrics.csv report_verbose.csv report_verbose_metrics.csv"
for csv in ${CSVS}; do
  python -c "import pandas as pd; df = pd.read_csv('${csv}'); print(df); assert df['Concurrency'][0] == 1"
done

Side note, nice tool for visual verification via jq:

$ jq -cR 'split(",")' report.csv
["Concurrency","Inferences/Second","Client Send","Network+Server Send/Recv","Server Queue","Server Compute Input","Server Compute Infer","Server Compute Output","Client Recv","p50 latency","p90 latency","p95 latency","p99 latency"]
["1","2281.23","109","284","29","1","12","0","0","413","588","656","806"]
["2","4174.56","120","315","28","1","11","0","0","459","636","701","864"]

@matthewkotila
Copy link
Contributor

@nv-braf are there any consequences to MA from this?

@rmccorm4
Copy link
Contributor Author

@nv-braf are there any consequences to MA from this?

How is MA consuming these CSVs? I'm surprised it is not failing on parsing the CSVs, or getting incorrect values with columns shifted.

@matthewkotila
Copy link
Contributor

This is technically changing the format of an output a user might expect to be a certain way. I'm not sure how to determine if a user is expecting the trailing comma.

@dyastremsky any input? You were the one we last discussed breaking changes with and how we should be more cautious.

@matthewkotila
Copy link
Contributor

I think it's unlikely someone is expecting a (unnecessary) trailing comma at the end (I don't think it's technically an invalid csv format, but it's implying there's a "null field" at the end, which we don't need to imply), so as long as MA doesn't break from this, I'm cool with getting it merged and making the output cleaner.

@matthewkotila
Copy link
Contributor

matthewkotila commented Aug 30, 2023

I think I misunderstood what this trailing comma is. It looks like it only exists in the data rows, but not in the header row. Yeah, this is a bug (and introduced by me, my bad).

Fix looks good to me--thanks Ryan. Let's just make sure MA can still work. @nv-braf suggestions on running an MA pipeline with these changes?

@rmccorm4
Copy link
Contributor Author

rmccorm4 commented Aug 30, 2023

I think it's unlikely someone is expecting a (unnecessary) trailing comma at the end (I don't think it's technically an invalid csv format, but it's implying there's a "null field" at the end, which we don't need to imply), so as long as MA doesn't break from this, I'm cool with getting it merged and making the output cleaner.

For verbose only. it's true that it is just introducing an additional column that happens to be empty, so technically yes this is introducing a change to remove the empty column:

$ perf_analyzer -m id --shape INPUT0:1,1 -f report_verbose.csv --verbose-csv 

>>> with open("report_verbose.csv") as f:
...   for line in f:
...     print(len(line.split(",")))
... 
17
17

However for verbose + collect-metrics, the number of columns is different in the header row and data row:

$ perf_analyzer -m id --shape INPUT0:1,1 -f report_verbose_metrics.csv --verbose-csv --collect-metrics

>>> with open("report_verbose_metrics.csv") as f:
...   for line in f:
...     print(len(line.split(",")))
... 
20
21

Which I believe should be considered invalid

@nv-braf
Copy link
Contributor

nv-braf commented Aug 30, 2023

@

MA consumes the <model_name>_results.csv and then calls the csv library to parse the file:

with open(perf_config["latency-report-file"], mode="r") as f: csv_reader = csv.DictReader(f, delimiter=",")

@dyastremsky
Copy link
Contributor

This is technically changing the format of an output a user might expect to be a certain way. I'm not sure how to determine if a user is expecting the trailing comma.

@dyastremsky any input? You were the one we last discussed breaking changes with and how we should be more cautious.

Thanks for looping me in. This seems like a solid change. Seems like we're fixing previously broken behavior, which would have positive impact for customers. I assume the risk of breaking anything for customers with this change to be fairly close to zero.

I do agree though that additional testing would be helpful. That would require us to explicitly decide the expected format and validate it automatically after future PRs.

@nv-braf
Copy link
Contributor

nv-braf commented Aug 30, 2023

I think I misunderstood what this trailing comma is. It looks like it only exists in the data rows, but not in the header row. Yeah, this is a bug (and introduced by me, my bad).

Fix looks good to me--thanks Ryan. Let's just make sure MA can still work. @nv-braf suggestions on running an MA pipeline with these changes?

You should be able to run the MA CI with a custom version of PA. I would ask Mischa to help you with this.

@matthewkotila
Copy link
Contributor

matthewkotila commented Aug 30, 2023

@nv-braf ran locally main branch of MA with a csv output from PA that doesn't have the trailing comma and it worked fine. Good enough verification for me for now. Approving.

@rmccorm4 rmccorm4 merged commit 8ecca20 into main Aug 31, 2023
3 checks passed
@rmccorm4 rmccorm4 deleted the rmccormick-csv branch August 31, 2023 22:41
aancel pushed a commit to IRCAD/triton-inference-server-client that referenced this pull request Feb 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants