[BACKEND] Turn off thread locality optimization based on assumptions (#4688)

The pattern that optimizes thread locality for reductions in a loop
assumes that the reduction happens on the innermost dim. This
assumption seems deeply ingrained in the code, so I didn't try to fix it
for now. It's unclear to me whether this code works as expected in
general, as it also makes assumptions about how reshape with
allow_reorder=True will move data around.
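
To make the "innermost dim" assumption concrete, here is a minimal sketch that is independent of Triton. It assumes a plain row-major rows x cols tensor (an assumption for illustration only; the pass itself works on Triton layout encodings, not raw offsets). The helper names `reductionGroupOffsets` and `isContiguous` are hypothetical. The sketch shows that the elements combined by one reduction group sit next to each other only when the reduced axis is the last one, which is the kind of property that code written for an innermost-axis reduction can end up relying on.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: linear row-major offsets of the elements combined by
// one reduction group of a rows x cols tensor. `axis` is the reduced
// dimension and `fixed` is the index along the other dimension.
std::vector<std::size_t> reductionGroupOffsets(std::size_t rows,
                                               std::size_t cols, int axis,
                                               std::size_t fixed) {
  assert(axis == 0 || axis == 1);
  std::vector<std::size_t> offsets;
  if (axis == 1) {
    // Reducing over the innermost dim: walk along one row.
    for (std::size_t c = 0; c < cols; ++c)
      offsets.push_back(fixed * cols + c);
  } else {
    // Reducing over the outer dim: walk down one column.
    for (std::size_t r = 0; r < rows; ++r)
      offsets.push_back(r * cols + fixed);
  }
  return offsets;
}

// Hypothetical helper: true if every offset is exactly one past the previous.
bool isContiguous(const std::vector<std::size_t> &offsets) {
  for (std::size_t i = 1; i < offsets.size(); ++i)
    if (offsets[i] != offsets[i - 1] + 1)
      return false;
  return true;
}
```

For a 4 x 8 tensor, `isContiguous(reductionGroupOffsets(4, 8, 1, 2))` holds, while the same call with `axis = 0` yields offsets that are 8 apart.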
ThomasRaoux committed Sep 10, 2024
1 parent 2df33bb commit 58eccfc
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions lib/Dialect/TritonGPU/Transforms/OptimizeThreadLocality.cpp
@@ -120,6 +120,10 @@ class TritonGPUOptimizeThreadLocalityPass
       // TODO: relax this restriction
       if (!(isa<triton::gpu::BlockedEncodingAttr>(srcEncoding) && rank > 1))
         return;
+      // The code currently assumes that the reduction is happening on the most
+      // inner dim.
+      if (reduce.getAxis() != rank - 1)
+        return;
       for (auto operand : reduce->getOperands()) {
         if (!operand.getDefiningOp<triton::LoadOp>())
           return;
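
The early-exit checks visible in this hunk can be modeled without MLIR as a plain predicate. This is a sketch under stated assumptions: `ReduceInfo` and `passesShownChecks` are hypothetical names, the struct stands in for facts the pass reads off the reduce op, and the real pass may apply further checks that are not visible in this hunk.

```cpp
#include <cassert>

// Hypothetical stand-in for the properties of a reduce op that the checks
// shown in the hunk inspect. Not a Triton type.
struct ReduceInfo {
  bool hasBlockedEncoding;  // source encoding is a BlockedEncodingAttr
  int rank;                 // rank of the tensor being reduced
  int axis;                 // axis the reduction runs over
  bool allOperandsAreLoads; // every operand is defined by a LoadOp
};

// Returns true if the op gets past the checks shown in the hunk; false
// corresponds to the bare `return;` that skips the optimization.
bool passesShownChecks(const ReduceInfo &r) {
  assert(r.rank > 0 && r.axis >= 0 && r.axis < r.rank);
  // Existing restriction: blocked encoding and rank > 1.
  if (!(r.hasBlockedEncoding && r.rank > 1))
    return false;
  // Added by this commit: only reductions over the innermost dim qualify.
  if (r.axis != r.rank - 1)
    return false;
  // Existing restriction: every operand must come directly from a load.
  if (!r.allOperandsAreLoads)
    return false;
  return true;
}
```

With this model, a rank-2 reduction over axis 1 gets past the checks, while the same reduction over axis 0 is now skipped.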