[GEMM] [Tuning] Add waves_per_eu to gemm tuning #362

Merged on Oct 16, 2023 (6 commits)


@zhanglx13 zhanglx13 commented Oct 13, 2023

This PR also reduces tuning time by fixing a bug in the pre-compile step.

More details:
Previously, the pre-compile step ran in parallel, but the compiled kernels were not cached. As a result, the tuning step still paid the full kernel compilation overhead.
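To illustrate the caching issue described above, here is a minimal sketch and not the code from this PR. It assumes a disk cache keyed by a hash of the config; `fake_compile`, `get_kernel`, and `precompile_all` are hypothetical names, and the "compilation" is a stand-in that only serializes the config. The point is that once the parallel pre-compile step persists its results, the later tuning step finds them and does not compile again.

```python
import hashlib
import json
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

compile_count = 0                 # how many times the expensive compile ran
_count_lock = threading.Lock()    # protects compile_count across threads


def fake_compile(config: dict) -> bytes:
    """Hypothetical stand-in for real kernel compilation."""
    global compile_count
    with _count_lock:
        compile_count += 1
    return json.dumps(config, sort_keys=True).encode()


def cache_path(cache_dir: Path, config: dict) -> Path:
    """Derive a stable cache file name from the config contents."""
    key = hashlib.sha256(json.dumps(config, sort_keys=True).encode()).hexdigest()
    return cache_dir / f"{key}.bin"


def get_kernel(config: dict, cache_dir: Path) -> bytes:
    """Return the cached binary if present; otherwise compile and store it."""
    path = cache_path(cache_dir, config)
    if path.exists():
        return path.read_bytes()
    binary = fake_compile(config)
    path.write_bytes(binary)
    return binary


def precompile_all(configs: list, cache_dir: Path, workers: int = 4) -> None:
    """Pre-compile step: compile every config in parallel, filling the cache."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda c: get_kernel(c, cache_dir), configs))


# Illustrative configs only; the values are assumptions, not from the PR.
configs = [{"BLOCK_M": m, "waves_per_eu": w} for m in (64, 128) for w in (0, 2)]

with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp)
    precompile_all(configs, cache_dir)
    compiles_after_precompile = compile_count

    # Tuning step: every lookup should now hit the cache.
    for c in configs:
        get_kernel(c, cache_dir)
    compiles_after_tuning = compile_count
```

If `get_kernel` did not write the binary to `cache_dir`, the tuning loop would call `fake_compile` once more per config, which is the overhead the description refers to.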

@zhanglx13 zhanglx13 force-pushed the add-waves-per-eu-to-gemm-tuning branch from 3e77212 to d806234 Compare October 13, 2023 15:01
@zhanglx13 zhanglx13 force-pushed the add-waves-per-eu-to-gemm-tuning branch from b36ec13 to bd61bcf Compare October 15, 2023 02:13

@sjw36 sjw36 left a comment

LGTM!

@jayfurmanek jayfurmanek merged commit 1de859d into triton-mlir Oct 16, 2023
2 checks passed
scxiao pushed a commit that referenced this pull request Oct 20, 2023
* Add waves_per_eu in the tuning space

* Do not allocate tensor on device during kernel compilation step

* Add breakdown elapsed time

* Parallelize the post-processing step

* Parallelize the profile step with --ngpus

* Better timing info printout
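
As a rough sketch of two of the commits above (adding `waves_per_eu` to the tuning space, and splitting the profile step across GPUs with `--ngpus`), the following shows how one extra parameter multiplies the number of configs and how configs might be divided among devices. The parameter ranges, the helper names `gen_tuning_space` and `split_across_gpus`, and the round-robin split are all assumptions for illustration; the PR text does not state how the tuning script actually does either.

```python
from itertools import product


def gen_tuning_space(block_m, block_n, block_k, waves_per_eu):
    """Build every combination of the given parameter ranges (hypothetical helper)."""
    keys = ("BLOCK_SIZE_M", "BLOCK_SIZE_N", "BLOCK_SIZE_K", "waves_per_eu")
    return [
        dict(zip(keys, values))
        for values in product(block_m, block_n, block_k, waves_per_eu)
    ]


def split_across_gpus(configs, ngpus):
    """Round-robin split of configs into one work list per GPU (assumed scheme)."""
    return [configs[i::ngpus] for i in range(ngpus)]


# Illustrative ranges only; the real tuning ranges are not given in this PR text.
space = gen_tuning_space(
    block_m=[32, 64],
    block_n=[32, 64],
    block_k=[16, 32],
    waves_per_eu=[0, 1, 2, 3, 4],
)

# 2 * 2 * 2 block sizes, times 5 waves_per_eu values.
print(len(space))

per_gpu = split_across_gpus(space, ngpus=4)
print([len(chunk) for chunk in per_gpu])
```

Because each added parameter multiplies the size of the space, avoiding recompilation and profiling on several GPUs at once matter more as the space grows.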
3 participants