[GEMM] [Tuning] Add `waves_per_eu` to gemm tuning #362

zhanglx13 · 2023-10-13T14:59:53Z

And reduce tuning time by fixing a bug in the pre-compile step

More details:
Previously, the pre-compile step ran in parallel, but the compiled kernels were not cached. As a result, the tuning step still paid the overhead of kernel compilation.
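To illustrate the caching issue described above, here is a minimal sketch rather than the code from this PR. It assumes a hypothetical `fake_compile` function standing in for the real kernel compilation and a hypothetical on-disk cache keyed by a hash of the config. A thread pool stands in for whatever parallel mechanism the tuning script actually uses. The point is only that a parallel pre-compile step saves time later if its results are written somewhere the tuning step can find them.

```python
import hashlib
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def config_key(config: dict) -> str:
    """Stable cache key derived from the config contents."""
    payload = json.dumps(config, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def fake_compile(config: dict) -> bytes:
    """Hypothetical stand-in for an expensive kernel compilation."""
    return json.dumps(config, sort_keys=True).encode()


def compile_cached(config: dict, cache_dir: Path):
    """Return (artifact, cache_hit); compile only on a cache miss."""
    path = cache_dir / config_key(config)
    if path.exists():
        return path.read_bytes(), True
    artifact = fake_compile(config)
    path.write_bytes(artifact)  # persist so later steps can reuse it
    return artifact, False


def precompile_all(configs, cache_dir: Path):
    """Pre-compile every config in parallel; return the cache-hit flags."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = pool.map(lambda c: compile_cached(c, cache_dir), configs)
        return [hit for _, hit in results]
```

With this layout, a later sequential call to `compile_cached` for the same config reads the stored artifact instead of compiling again, which is the behavior the fix is aiming for.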

sjw36

LGTM!

* Add waves_per_eu in the tuning space
* Do not allocate tensor on device during kernel compilation step
* Add breakdown elapsed time
* Parallelize the post-processing step
* Parallelize the profile step with --ngpus
* Better timing info printout
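As a rough sketch of what adding `waves_per_eu` to the tuning space means, the snippet below builds a config list as a Cartesian product. The function name `gen_tuning_space`, the block-size parameter names, and all candidate values are assumptions for illustration and are not taken from the PR.

```python
from itertools import product


def gen_tuning_space(block_sizes=(64, 128), waves_per_eu_range=(0, 1, 2, 3, 4)):
    """Enumerate candidate configs; all values here are illustrative only."""
    configs = []
    for bm, bn, waves in product(block_sizes, block_sizes, waves_per_eu_range):
        configs.append(
            {"BLOCK_SIZE_M": bm, "BLOCK_SIZE_N": bn, "waves_per_eu": waves}
        )
    return configs
```

Each additional parameter multiplies the number of configs by its number of candidate values, so the tuning space grows quickly. That is one reason the compile, profile, and post-processing time matters.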

Add waves_per_eu in the tuning space

d806234

zhanglx13 force-pushed the add-waves-per-eu-to-gemm-tuning branch from 3e77212 to d806234 Compare October 13, 2023 15:01

zhanglx13 added 5 commits October 13, 2023 14:50

Do not allocate tensor on device during kernel compilation step

c5b48dd

Add breakdown elapsed time

963a8f0

Parallelize the post-processing step

7139b5c

Parallelize the profile step with --ngpus

14cc8f0

Better timing info printout

bd61bcf

zhanglx13 force-pushed the add-waves-per-eu-to-gemm-tuning branch from b36ec13 to bd61bcf Compare October 15, 2023 02:13

sjw36 approved these changes Oct 16, 2023

jayfurmanek approved these changes Oct 16, 2023

jayfurmanek merged commit 1de859d into triton-mlir Oct 16, 2023
2 checks passed
