Specifying batch size on llama2-70B cm automation #190

Open

rajesh-s opened this issue Aug 27, 2024 · 4 comments

rajesh-s commented Aug 27, 2024

I could not find information, either in the documentation or in the CM scripts, about the batch size used to report the results in the MLCommons database.

  1. The default batch size in the implementation seems to be 1. Is the CM automation specifying a different value?
  2. What knobs does the user have to view the configuration used in a particular submission, to ensure alignment while profiling new systems?
  3. If I use the automation scripts as indicated on the documentation page, on the same hardware used in the submissions, should I see nearly the same performance?

arjunsuresh (Contributor) commented

@rajesh-s most of the inference submissions are done using the Nvidia implementation. In CM we have tried to match the typical batch sizes used in the Nvidia submissions, but we haven't tested all of the systems. In the CM run command you can pass `--batch_size=` to use a custom batch size with the Nvidia implementation.
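For reference, a minimal sketch of such a run command (only the `--batch_size=` flag is the one discussed here; the remaining flags follow the usual CM MLPerf invocation pattern and may need adjusting for your CM version and setup):

```bash
# Sketch: run the llama2-70b benchmark through the Nvidia implementation
# with an explicit batch size. Only --batch_size= comes from this thread;
# the other flags are typical CM MLPerf options and may differ by version.
cm run script --tags=run-mlperf,inference \
    --model=llama2-70b-99 \
    --implementation=nvidia \
    --device=cuda \
    --scenario=Offline \
    --execution_mode=test \
    --batch_size=8 \
    --quiet
```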

For the reference implementation, I'm not sure whether different batch sizes work, as many things are hardwired and no one has made a submission using it.

rajesh-s (Author) commented

It would help if the batch sizes were listed, at least for the submissions; I could not find them in the results.

The CM run command seems to default to a batch size of 1, as I indicated above, which would be good to note in the documentation. The results vary greatly with batch size, so it may be imperative to document it.

anandhu-eng (Contributor) commented

Hi @rajesh-s, sorry for the late reply. I have noted the required addition.

@arjunsuresh, would it be apt to include this in a collapsible section, or should we present it as a tip, since there is a chance users will ignore the collapsible option?

anandhu-eng (Contributor) commented

Hi @rajesh-s, we have added the changes in our fork, but they are yet to be merged into the official MLCommons inference repo. You can find the changes here.
