add tutorial group gemm example #343

Merged (11 commits) on Oct 11, 2023

scxiao commented Sep 27, 2023

Cherry-pick the grouped GEMM tutorial example.

@zhanglx13

I just tried it and here is what I got:

group-gemm-performance:
        N    cuBLAS    Triton
0   128.0  0.061440  0.031040
1   256.0  0.035521  0.042240
2   512.0  0.060640  0.065440
3  1024.0  0.108961  0.392642

Does it look ok to you, @scxiao?

scxiao commented Sep 28, 2023

Here is the A100 performance; the unit is time (ms).
group-gemm-performance:
        N    cuBLAS    Triton
0   128.0  0.020480  0.013312
1   256.0  0.023552  0.018432
2   512.0  0.032768  0.026624
3  1024.0  0.071680  0.087040

So our performance is poor. Let me investigate; we should be doing much better than these numbers.

scxiao commented Oct 10, 2023

Latest performance on MI250X
group-gemm-performance:
        N    cuBLAS    Triton
0   128.0  0.061441  0.017120
1   256.0  0.034881  0.021281
2   512.0  0.060481  0.039040
3  1024.0  0.108481  0.153281

Triton is now roughly 1.7x to 2.6x faster than in the first try, depending on N.
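The speedup over the first try can be checked directly from the Triton columns posted in this thread. The snippet below is only a sketch: it assumes the first-try numbers were measured on comparable hardware (the thread does not say so explicitly), and the names `first_try`, `latest`, and `speedups` are made up for illustration.

```python
# Triton timings in ms, copied from the tables in this thread.
# Assumption: both runs used comparable hardware.
first_try = {128: 0.031040, 256: 0.042240, 512: 0.065440, 1024: 0.392642}
latest = {128: 0.017120, 256: 0.021281, 512: 0.039040, 1024: 0.153281}


def speedups(before, after):
    """Return before/after time ratios keyed by problem size N."""
    return {n: before[n] / after[n] for n in before}


ratios = speedups(first_try, latest)
for n, r in ratios.items():
    print(f"N={n}: {r:.2f}x")
```

A ratio above 1 means the latest run is faster than the first try for that N.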

scxiao commented Oct 10, 2023

Latest performance numbers on MI250
group-gemm-performance:
        N    cuBLAS    Triton
0   128.0  0.061440  0.016480
1   256.0  0.035040  0.020160
2   512.0  0.060481  0.035040
3  1024.0  0.108640  0.113441
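To see where the Triton kernel wins or loses against the cuBLAS column in this last table, the per-size ratio can be computed as below. This is a minimal sketch using only the numbers above; the variable names are hypothetical and are not taken from the tutorial script.

```python
# Timings in ms from the MI250 table above: N -> (cuBLAS column, Triton column).
timings = {
    128: (0.061440, 0.016480),
    256: (0.035040, 0.020160),
    512: (0.060481, 0.035040),
    1024: (0.108640, 0.113441),
}

# Ratio > 1 means Triton is faster than the cuBLAS column for that N.
relative = {n: baseline / triton for n, (baseline, triton) in timings.items()}

for n, r in relative.items():
    verdict = "Triton faster" if r > 1 else "Triton slower"
    print(f"N={n}: {r:.2f}x ({verdict})")
```

With these numbers, Triton is ahead for N = 128, 256, and 512, and slightly behind at N = 1024.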

@scxiao scxiao merged commit 99fa2e4 into triton-mlir Oct 11, 2023
2 checks passed
scxiao added a commit that referenced this pull request Oct 20, 2023
* [DOCS] Add a tutorial example of grouped gemm (triton-lang#2326)
Co-authored-by: Bin Fan <[email protected]>
4 participants