Can wgmma.async and barrier.arrive Ensure GEMM Completion Before Moving Forward? #1373

Open
ziyuhuang123 opened this issue Dec 6, 2024 · 2 comments

@ziyuhuang123

Since the GEMM here uses wgmma.async, the subsequent barrier.arrive cannot guarantee that the GEMM has finished executing, right? It only guarantees that the GEMM has been issued, correct? Otherwise, if GEMM0, GEMM1, and softmax had to run strictly in that order, the overlap within a warp group shown in the figure below would be impossible:
[Image: 8db1fb4982175e8838169e00b3e44aa]

[Image: b90c287986be26ecb9c1296d4f7de16]

@ziyuhuang123
Author

[Image: 40da1808db654e71fba4221de97a851]
Assuming that the GEMM is asynchronous and that softmax is synchronous (it must finish executing before the warp group proceeds), I tried to combine the intra-warpgroup and inter-warpgroup illustrations and roughly sketched the diagram above.

@tridao
Contributor

tridao commented Dec 6, 2024

Yes, barrier.arrive only guarantees that the GEMM has been issued, not that it has completed.
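To make the issue-versus-completion distinction concrete, here is a minimal host-side sketch in plain C++. It is only an analogy, not wgmma or PTX code: `std::async` stands in for issuing the asynchronous GEMM, reading a flag stands in for the moment of `barrier.arrive`, and `future::get()` stands in for an explicit wait (on Hopper that role is played by `wgmma.wait_group`). The names `Timeline` and `run_issue_arrive_wait` are hypothetical, and the `gate` promise is an artificial device that holds the background work back so the "not yet complete at arrive" case is reproduced deterministically.

```cpp
#include <atomic>
#include <cassert>
#include <future>

// Host-side analogy only; nothing here touches the GPU.
struct Timeline {
    bool done_at_arrive;   // was the "GEMM" finished when the barrier was signalled?
    bool done_after_wait;  // was it finished after the explicit wait?
    int result;            // value produced by the "GEMM"
};

Timeline run_issue_arrive_wait() {
    std::atomic<bool> gemm_done{false};

    // Artificial gate (assumption for the demo): keeps the background work
    // from finishing until after the "arrive" point has been observed.
    std::promise<void> gate;
    std::shared_future<void> gate_open = gate.get_future().share();

    // "Issue": std::async returns immediately; the work runs on another thread.
    std::future<int> gemm = std::async(std::launch::async, [&gemm_done, gate_open] {
        gate_open.wait();                       // held back until released
        int acc = 0;
        for (int k = 1; k <= 4; ++k) acc += k * k;  // 1 + 4 + 9 + 16
        gemm_done.store(true);
        return acc;
    });

    // "Arrive": the issuing thread moves on and signals others.
    // The work has been issued, but it has not completed.
    bool done_at_arrive = gemm_done.load();

    gate.set_value();                           // let the background work proceed

    // "Wait": only this explicit wait guarantees completion.
    int result = gemm.get();
    bool done_after_wait = gemm_done.load();

    return {done_at_arrive, done_after_wait, result};
}
```

Calling `run_issue_arrive_wait()` yields `done_at_arrive == false` and `done_after_wait == true`: signalling after the issue says nothing about completion, and it is the gap between the two points that leaves room for other work, such as another warp group's softmax, to overlap with the GEMM.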
