Fix a consumer release placement issue #17

Open
wants to merge 3 commits into
base: ws

Conversation

htyu (Contributor) commented Dec 21, 2024

When a tt.trans precedes a tt.dot, the real consumer of the producer buffer is the dot op rather than the trans op: the trans op only recalculates the shared memory address for the dot and does not itself consume the data.
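
As a rough illustration of the idea, here is a minimal, self-contained sketch that looks through tt.trans to find the op that actually consumes the buffer. It is not the code in this PR: `ToyOp` and `findRealConsumers` are hypothetical names, and ops are modeled as plain structs rather than MLIR operations.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Toy stand-in for an IR operation (hypothetical; not the MLIR Operation class).
struct ToyOp {
  std::string name;           // e.g. "tt.trans" or "tt.dot"
  std::vector<ToyOp *> users; // ops that use this op's result
};

// Starting from the direct users of a buffer, look through every "tt.trans"
// to its own users and return the ops that actually consume the data.
std::vector<ToyOp *> findRealConsumers(const std::vector<ToyOp *> &bufferUsers) {
  std::vector<ToyOp *> worklist(bufferUsers.begin(), bufferUsers.end());
  std::unordered_set<ToyOp *> visited;
  std::vector<ToyOp *> consumers;
  while (!worklist.empty()) {
    ToyOp *op = worklist.back();
    worklist.pop_back();
    assert(op && "buffer user must not be null");
    if (!visited.insert(op).second)
      continue; // already handled this op
    if (op->name == "tt.trans") {
      // The trans only remaps addresses, so keep following its users.
      for (ToyOp *user : op->users)
        worklist.push_back(user);
    } else {
      consumers.push_back(op);
    }
  }
  return consumers;
}
```

With a buffer whose only direct user is a trans feeding a dot, this sketch reports the dot as the consumer, which is where the consumer release would then be placed.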

@htyu htyu requested a review from manman-ren December 21, 2024 07:43
@facebook-github-bot facebook-github-bot added the CLA Signed label Dec 21, 2024
lib/Dialect/TritonGPU/Transforms/WSCodePartition.cpp (outdated)
    while (!transUsers.empty()) {
      auto transUser = transUsers.pop_back_val();
      visited.insert(transUser);
      if (isa<TransOp>(transUser)) {
Contributor

Is TransOp the only op that we need to go across?

Contributor Author

Good question, but I'm not quite sure. Basically, we should consider all ops that do not actually load data from SMEM and instead only manipulate pointers.
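
One possible way to make that criterion explicit is to keep the set of pointer-only ops in a single predicate, so the traversal does not hard-code TransOp. The sketch below is only an assumption about how that could look: `pointerOnlyOps` and `onlyManipulatesPointers` are hypothetical helpers, and `example.subview` is a made-up placeholder rather than a real op. Only tt.trans comes from this PR.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical allow-list of op names treated as pointer-only.
// Only "tt.trans" comes from this PR; the other entry is a placeholder.
inline const std::unordered_set<std::string> &pointerOnlyOps() {
  static const std::unordered_set<std::string> ops = {
      "tt.trans",
      "example.subview", // hypothetical op name, for illustration only
  };
  return ops;
}

// Returns true if the named op should be looked through when searching
// for the op that really reads the data from shared memory.
inline bool onlyManipulatesPointers(const std::string &opName) {
  return pointerOnlyOps().count(opName) != 0;
}
```

Keeping the list in one place would make it easy to extend once it is clear which other ops qualify.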

@htyu htyu force-pushed the hoy/consumerelease branch from c8d98aa to faeaefc Compare January 9, 2025 18:18