Commit 073e65a

[LoopVectorize] Make needsExtract notice scalarized instructions (llvm#119720)
LoopVectorizationCostModel::needsExtract should recognise instructions that have been widened by scalarizing as scalar instructions, which therefore do not need an extract when used by later scalarized instructions. This fixes an incorrect cost calculation in computePredInstDiscount, where we add a scalarization overhead cost when we shouldn't, though I haven't come up with a test case where it makes a difference. It will make a difference when the cost model switches to using the cost kind TCK_CodeSize for optsize, because without this change the test LoopVectorize/X86/small-size.ll gets worse.
1 parent 6d604ba commit 073e65a

3 files changed: +164 -163 lines changed

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

Lines changed: 2 additions & 1 deletion
@@ -1731,7 +1731,8 @@ class LoopVectorizationCostModel {
   bool needsExtract(Value *V, ElementCount VF) const {
     Instruction *I = dyn_cast<Instruction>(V);
     if (VF.isScalar() || !I || !TheLoop->contains(I) ||
-        TheLoop->isLoopInvariant(I))
+        TheLoop->isLoopInvariant(I) ||
+        getWideningDecision(I, VF) == CM_Scalarize)
       return false;
 
     // Assume we can vectorize V (and hence we need extraction) if the
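
The effect of the added condition can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `Instr`, `Widening`, `ToyCostModel`, and `setDecision` are hypothetical stand-ins rather than LLVM classes or APIs, and the final `return true` stands in for the remainder of the real function, which the hunk above does not show.

```cpp
#include <map>
#include <utility>

// Hypothetical stand-in for the cost model's widening decisions.
enum class Widening { Unknown, Widen, Scalarize };

// Hypothetical stand-in for an instruction: only the two loop properties
// that the early-exit condition inspects are modeled.
struct Instr {
  bool InLoop;
  bool LoopInvariant;
};

struct ToyCostModel {
  // Decisions are recorded per (instruction, vectorization factor) pair.
  std::map<std::pair<const Instr *, unsigned>, Widening> Decisions;

  void setDecision(const Instr *I, unsigned VF, Widening W) {
    Decisions[std::make_pair(I, VF)] = W;
  }

  Widening getWideningDecision(const Instr *I, unsigned VF) const {
    auto It = Decisions.find(std::make_pair(I, VF));
    return It == Decisions.end() ? Widening::Unknown : It->second;
  }

  // Early-exit condition as it was before the patch.
  bool needsExtractOld(const Instr *I, unsigned VF) const {
    if (VF == 1 || !I || !I->InLoop || I->LoopInvariant)
      return false;
    return true; // Toy simplification: the rest of the logic is not modeled.
  }

  // Early-exit condition with the extra check: a value that was widened by
  // scalarizing is already available as scalars, so no extract is needed.
  bool needsExtractNew(const Instr *I, unsigned VF) const {
    if (VF == 1 || !I || !I->InLoop || I->LoopInvariant ||
        getWideningDecision(I, VF) == Widening::Scalarize)
      return false;
    return true; // Toy simplification: the rest of the logic is not modeled.
  }
};
```

In this model, an in-loop, non-invariant instruction marked `Scalarize` at VF 2 makes `needsExtractOld` return true and `needsExtractNew` return false, which mirrors the overhead the commit message says was being counted by mistake.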

llvm/test/Transforms/LoopVectorize/AArch64/interleaved_cost.ll

Lines changed: 2 additions & 2 deletions
@@ -170,8 +170,8 @@ entry:
 ; VF_2-LABEL: Checking a loop in 'i64_factor_8'
 ; VF_2: Found an estimated cost of 8 for VF 2 For instruction: %tmp2 = load i64, ptr %tmp0, align 8
 ; VF_2-NEXT: Found an estimated cost of 8 for VF 2 For instruction: %tmp3 = load i64, ptr %tmp1, align 8
-; VF_2-NEXT: Found an estimated cost of 12 for VF 2 For instruction: store i64 %tmp2, ptr %tmp0, align 8
-; VF_2-NEXT: Found an estimated cost of 12 for VF 2 For instruction: store i64 %tmp3, ptr %tmp1, align 8
+; VF_2-NEXT: Found an estimated cost of 8 for VF 2 For instruction: store i64 %tmp2, ptr %tmp0, align 8
+; VF_2-NEXT: Found an estimated cost of 8 for VF 2 For instruction: store i64 %tmp3, ptr %tmp1, align 8
 for.body:
   %i = phi i64 [ 0, %entry ], [ %i.next, %for.body ]
   %tmp0 = getelementptr inbounds %i64.8, ptr %data, i64 %i, i32 2
