Skip to content

Commit

Permalink
[LoopVectorize] Make needsExtract notice scalarized instructions (llv…
Browse files Browse the repository at this point in the history
…m#119720)

LoopVectorizationCostModel::needsExtract should recognise instructions
that have been widened by scalarizing as scalar instructions, and thus
not needing an extract when used by later scalarized instructions.

This fixes an incorrect cost calculation in computePredInstDiscount,
where we are adding a scalarization overhead cost when we shouldn't,
though I haven't come up with a test case where it makes a difference.
It will make a difference when the cost model switches to using the cost
kind TCK_CodeSize for optsize, as not doing this causes the test
LoopVectorize/X86/small-size.ll to get worse.
  • Loading branch information
john-brawn-arm authored Jan 2, 2025
1 parent 6d604ba commit 073e65a
Show file tree
Hide file tree
Showing 3 changed files with 164 additions and 163 deletions.
3 changes: 2 additions & 1 deletion llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1731,7 +1731,8 @@ class LoopVectorizationCostModel {
bool needsExtract(Value *V, ElementCount VF) const {
Instruction *I = dyn_cast<Instruction>(V);
if (VF.isScalar() || !I || !TheLoop->contains(I) ||
TheLoop->isLoopInvariant(I))
TheLoop->isLoopInvariant(I) ||
getWideningDecision(I, VF) == CM_Scalarize)
return false;

// Assume we can vectorize V (and hence we need extraction) if the
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -170,8 +170,8 @@ entry:
; VF_2-LABEL: Checking a loop in 'i64_factor_8'
; VF_2: Found an estimated cost of 8 for VF 2 For instruction: %tmp2 = load i64, ptr %tmp0, align 8
; VF_2-NEXT: Found an estimated cost of 8 for VF 2 For instruction: %tmp3 = load i64, ptr %tmp1, align 8
; VF_2-NEXT: Found an estimated cost of 12 for VF 2 For instruction: store i64 %tmp2, ptr %tmp0, align 8
; VF_2-NEXT: Found an estimated cost of 12 for VF 2 For instruction: store i64 %tmp3, ptr %tmp1, align 8
; VF_2-NEXT: Found an estimated cost of 8 for VF 2 For instruction: store i64 %tmp2, ptr %tmp0, align 8
; VF_2-NEXT: Found an estimated cost of 8 for VF 2 For instruction: store i64 %tmp3, ptr %tmp1, align 8
for.body:
%i = phi i64 [ 0, %entry ], [ %i.next, %for.body ]
%tmp0 = getelementptr inbounds %i64.8, ptr %data, i64 %i, i32 2
Expand Down
Loading

0 comments on commit 073e65a

Please sign in to comment.