[LV] Exercise type-mismatch with RT-check conflict rdx #130295

Merged
artagnon merged 2 commits into llvm:main from lv-type-mismatch-test on Apr 2, 2025

Conversation

artagnon

@artagnon artagnon commented Mar 7, 2025

The LoopVectorize test suite has a coverage hole for loops in which access types mismatch and runtime checks are needed, with a conflict reduction (rdx) over those checks. Fix this coverage hole by adding tests.

@llvmbot

llvmbot commented Mar 7, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Ramkumar Ramachandra (artagnon)

Changes

The test suite of LoopVectorize suffers from a coverage hole when types mismatch, in the target-independent case. There is already interleave-allocsize-not-equal-typesize.ll under the AArch64 target written for the purposes of fixing specific bugs, but nothing exercising this in a target-independent fashion. Fix this by adapting a test from LoopAccessAnalysis' depend_diff_types.ll, and demonstrate that LoopVectorize only ever interleaves when types mismatch.
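
For orientation, the scalar loop that the test functions in the diff below feed to the vectorizer can be sketched in C. This is only a reading of the IR in this PR: the function name `different_types_model`, the unsigned element types, and the use of `memcpy` for the narrow store are assumptions made for illustration, not part of the patch. The "type mismatch" is visible in `dst1`, which is indexed with an 8-byte stride but only receives a 4-byte store.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C model of the loop in @different_types.
   Names are illustrative; IR integers are signless, so unsigned
   types are used to get well-defined wrapping arithmetic. */
void different_types_model(const uint64_t *src1, const uint64_t *src2,
                           uint64_t *dst1, uint64_t *dst2, uint64_t n) {
    assert(src1 && src2 && dst1 && dst2);
    uint64_t i = 0;
    do { /* the IR loop is bottom-tested, so the body runs at least once */
        uint64_t a = src1[i];            /* load i64 */
        uint64_t b = src2[i];            /* load i64 */
        uint32_t narrow = (uint32_t)a;   /* trunc i64 to i32 */
        uint64_t sum = a + b;            /* add i64 (wrapping) */
        /* store i32 through a pointer computed with an i64 stride:
           only the first 4 bytes of the 8-byte slot are written */
        memcpy(&dst1[i], &narrow, sizeof narrow);
        dst2[i] = sum;                   /* store i64 */
        ++i;
    } while (i < n);
}
```

The bottom-tested loop is why the generated checks clamp the trip count with `umax(n, 1)`: even `n == 0` executes the body once.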


Full diff: https://github.com/llvm/llvm-project/pull/130295.diff

1 Files Affected:

  • (added) llvm/test/Transforms/LoopVectorize/type-mismatch-interleave.ll (+222)
diff --git a/llvm/test/Transforms/LoopVectorize/type-mismatch-interleave.ll b/llvm/test/Transforms/LoopVectorize/type-mismatch-interleave.ll
new file mode 100644
index 0000000000000..0b4ffc9e3008d
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/type-mismatch-interleave.ll
@@ -0,0 +1,222 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -passes=loop-vectorize -force-vector-width=4 -S %s | FileCheck %s
+
+; Tests demonstrating that LV only ever interleaves when types mismatch.
+
+define void @different_types(ptr noalias %src.1, ptr noalias %src.2, ptr noalias %dst.1, ptr noalias %dst.2, i64 %n) {
+; CHECK-LABEL: define void @different_types(
+; CHECK-SAME: ptr noalias [[SRC_1:%.*]], ptr noalias [[SRC_2:%.*]], ptr noalias [[DST_1:%.*]], ptr noalias [[DST_2:%.*]], i64 [[N:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*]]:
+; CHECK-NEXT:    [[UMAX:%.*]] = call i64 @llvm.umax.i64(i64 [[N]], i64 1)
+; CHECK-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[UMAX]], 4
+; CHECK-NEXT:    br i1 [[MIN_ITERS_CHECK]], label %[[SCALAR_PH:.*]], label %[[VECTOR_PH:.*]]
+; CHECK:       [[VECTOR_PH]]:
+; CHECK-NEXT:    [[N_MOD_VF:%.*]] = urem i64 [[UMAX]], 4
+; CHECK-NEXT:    [[N_VEC:%.*]] = sub i64 [[UMAX]], [[N_MOD_VF]]
+; CHECK-NEXT:    br label %[[VECTOR_BODY:.*]]
+; CHECK:       [[VECTOR_BODY]]:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT:    [[TMP0:%.*]] = add i64 [[INDEX]], 0
+; CHECK-NEXT:    [[TMP1:%.*]] = add i64 [[INDEX]], 1
+; CHECK-NEXT:    [[TMP2:%.*]] = add i64 [[INDEX]], 2
+; CHECK-NEXT:    [[TMP3:%.*]] = add i64 [[INDEX]], 3
+; CHECK-NEXT:    [[TMP4:%.*]] = getelementptr i64, ptr [[SRC_1]], i64 [[TMP0]]
+; CHECK-NEXT:    [[TMP5:%.*]] = getelementptr i64, ptr [[TMP4]], i32 0
+; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <4 x i64>, ptr [[TMP5]], align 4
+; CHECK-NEXT:    [[TMP6:%.*]] = trunc <4 x i64> [[WIDE_LOAD]] to <4 x i32>
+; CHECK-NEXT:    [[TMP7:%.*]] = getelementptr i64, ptr [[SRC_2]], i64 [[TMP0]]
+; CHECK-NEXT:    [[TMP8:%.*]] = getelementptr i64, ptr [[TMP7]], i32 0
+; CHECK-NEXT:    [[WIDE_LOAD1:%.*]] = load <4 x i64>, ptr [[TMP8]], align 4
+; CHECK-NEXT:    [[TMP9:%.*]] = add <4 x i64> [[WIDE_LOAD]], [[WIDE_LOAD1]]
+; CHECK-NEXT:    [[TMP10:%.*]] = getelementptr nusw i64, ptr [[DST_1]], i64 [[TMP0]]
+; CHECK-NEXT:    [[TMP11:%.*]] = getelementptr nusw i64, ptr [[DST_1]], i64 [[TMP1]]
+; CHECK-NEXT:    [[TMP12:%.*]] = getelementptr nusw i64, ptr [[DST_1]], i64 [[TMP2]]
+; CHECK-NEXT:    [[TMP13:%.*]] = getelementptr nusw i64, ptr [[DST_1]], i64 [[TMP3]]
+; CHECK-NEXT:    [[TMP14:%.*]] = extractelement <4 x i32> [[TMP6]], i32 0
+; CHECK-NEXT:    store i32 [[TMP14]], ptr [[TMP10]], align 4
+; CHECK-NEXT:    [[TMP15:%.*]] = extractelement <4 x i32> [[TMP6]], i32 1
+; CHECK-NEXT:    store i32 [[TMP15]], ptr [[TMP11]], align 4
+; CHECK-NEXT:    [[TMP16:%.*]] = extractelement <4 x i32> [[TMP6]], i32 2
+; CHECK-NEXT:    store i32 [[TMP16]], ptr [[TMP12]], align 4
+; CHECK-NEXT:    [[TMP17:%.*]] = extractelement <4 x i32> [[TMP6]], i32 3
+; CHECK-NEXT:    store i32 [[TMP17]], ptr [[TMP13]], align 4
+; CHECK-NEXT:    [[TMP18:%.*]] = getelementptr nusw i64, ptr [[DST_2]], i64 [[TMP0]]
+; CHECK-NEXT:    [[TMP19:%.*]] = getelementptr nusw i64, ptr [[TMP18]], i32 0
+; CHECK-NEXT:    store <4 x i64> [[TMP9]], ptr [[TMP19]], align 4
+; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
+; CHECK-NEXT:    [[TMP20:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
+; CHECK-NEXT:    br i1 [[TMP20]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
+; CHECK:       [[MIDDLE_BLOCK]]:
+; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i64 [[UMAX]], [[N_VEC]]
+; CHECK-NEXT:    br i1 [[CMP_N]], label %[[EXIT:.*]], label %[[SCALAR_PH]]
+; CHECK:       [[SCALAR_PH]]:
+; CHECK-NEXT:    [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], %[[MIDDLE_BLOCK]] ], [ 0, %[[ENTRY]] ]
+; CHECK-NEXT:    br label %[[LOOP:.*]]
+; CHECK:       [[LOOP]]:
+; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], %[[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], %[[LOOP]] ]
+; CHECK-NEXT:    [[GEP_SRC_1:%.*]] = getelementptr i64, ptr [[SRC_1]], i64 [[IV]]
+; CHECK-NEXT:    [[LD_SRC_1:%.*]] = load i64, ptr [[GEP_SRC_1]], align 4
+; CHECK-NEXT:    [[LD_SRC_1_I32:%.*]] = trunc i64 [[LD_SRC_1]] to i32
+; CHECK-NEXT:    [[GEP_SRC_2:%.*]] = getelementptr i64, ptr [[SRC_2]], i64 [[IV]]
+; CHECK-NEXT:    [[LD_SRC_2:%.*]] = load i64, ptr [[GEP_SRC_2]], align 4
+; CHECK-NEXT:    [[ADD:%.*]] = add i64 [[LD_SRC_1]], [[LD_SRC_2]]
+; CHECK-NEXT:    [[GEP_DST_1:%.*]] = getelementptr nusw i64, ptr [[DST_1]], i64 [[IV]]
+; CHECK-NEXT:    store i32 [[LD_SRC_1_I32]], ptr [[GEP_DST_1]], align 4
+; CHECK-NEXT:    [[GEP_DST_2:%.*]] = getelementptr nusw i64, ptr [[DST_2]], i64 [[IV]]
+; CHECK-NEXT:    store i64 [[ADD]], ptr [[GEP_DST_2]], align 4
+; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
+; CHECK-NEXT:    [[COND:%.*]] = icmp ult i64 [[IV_NEXT]], [[N]]
+; CHECK-NEXT:    br i1 [[COND]], label %[[LOOP]], label %[[EXIT]], !llvm.loop [[LOOP3:![0-9]+]]
+; CHECK:       [[EXIT]]:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
+
+  %gep.src.1 = getelementptr i64, ptr %src.1, i64 %iv
+  %ld.src.1 = load i64, ptr %gep.src.1
+  %ld.src.1.i32 = trunc i64 %ld.src.1 to i32
+
+  %gep.src.2 = getelementptr i64, ptr %src.2, i64 %iv
+  %ld.src.2 = load i64, ptr %gep.src.2
+  %add = add i64 %ld.src.1, %ld.src.2
+
+  %gep.dst.1 = getelementptr nusw i64, ptr %dst.1, i64 %iv
+  store i32 %ld.src.1.i32, ptr %gep.dst.1
+
+  %gep.dst.2 = getelementptr nusw i64, ptr %dst.2, i64 %iv
+  store i64 %add, ptr %gep.dst.2
+
+  %iv.next = add nuw nsw i64 %iv, 1
+  %cond = icmp ult i64 %iv.next, %n
+  br i1 %cond, label %loop, label %exit
+
+exit:
+  ret void
+}
+
+define void @different_types_rt_memcheck(ptr %src.1, ptr %src.2, ptr %dst.1, ptr %dst.2, i64 %n) {
+; CHECK-LABEL: define void @different_types_rt_memcheck(
+; CHECK-SAME: ptr [[SRC_1:%.*]], ptr [[SRC_2:%.*]], ptr [[DST_1:%.*]], ptr [[DST_2:%.*]], i64 [[N:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*]]:
+; CHECK-NEXT:    [[SRC_25:%.*]] = ptrtoint ptr [[SRC_2]] to i64
+; CHECK-NEXT:    [[SRC_13:%.*]] = ptrtoint ptr [[SRC_1]] to i64
+; CHECK-NEXT:    [[DST_12:%.*]] = ptrtoint ptr [[DST_1]] to i64
+; CHECK-NEXT:    [[DST_21:%.*]] = ptrtoint ptr [[DST_2]] to i64
+; CHECK-NEXT:    [[UMAX:%.*]] = call i64 @llvm.umax.i64(i64 [[N]], i64 1)
+; CHECK-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[UMAX]], 4
+; CHECK-NEXT:    br i1 [[MIN_ITERS_CHECK]], label %[[SCALAR_PH:.*]], label %[[VECTOR_MEMCHECK:.*]]
+; CHECK:       [[VECTOR_MEMCHECK]]:
+; CHECK-NEXT:    [[TMP0:%.*]] = sub i64 [[DST_21]], [[DST_12]]
+; CHECK-NEXT:    [[DIFF_CHECK:%.*]] = icmp ult i64 [[TMP0]], 32
+; CHECK-NEXT:    [[TMP1:%.*]] = sub i64 [[DST_12]], [[SRC_13]]
+; CHECK-NEXT:    [[DIFF_CHECK4:%.*]] = icmp ult i64 [[TMP1]], 32
+; CHECK-NEXT:    [[CONFLICT_RDX:%.*]] = or i1 [[DIFF_CHECK]], [[DIFF_CHECK4]]
+; CHECK-NEXT:    [[TMP2:%.*]] = sub i64 [[DST_12]], [[SRC_25]]
+; CHECK-NEXT:    [[DIFF_CHECK6:%.*]] = icmp ult i64 [[TMP2]], 32
+; CHECK-NEXT:    [[CONFLICT_RDX7:%.*]] = or i1 [[CONFLICT_RDX]], [[DIFF_CHECK6]]
+; CHECK-NEXT:    [[TMP3:%.*]] = sub i64 [[DST_21]], [[SRC_13]]
+; CHECK-NEXT:    [[DIFF_CHECK8:%.*]] = icmp ult i64 [[TMP3]], 32
+; CHECK-NEXT:    [[CONFLICT_RDX9:%.*]] = or i1 [[CONFLICT_RDX7]], [[DIFF_CHECK8]]
+; CHECK-NEXT:    [[TMP4:%.*]] = sub i64 [[DST_21]], [[SRC_25]]
+; CHECK-NEXT:    [[DIFF_CHECK10:%.*]] = icmp ult i64 [[TMP4]], 32
+; CHECK-NEXT:    [[CONFLICT_RDX11:%.*]] = or i1 [[CONFLICT_RDX9]], [[DIFF_CHECK10]]
+; CHECK-NEXT:    br i1 [[CONFLICT_RDX11]], label %[[SCALAR_PH]], label %[[VECTOR_PH:.*]]
+; CHECK:       [[VECTOR_PH]]:
+; CHECK-NEXT:    [[N_MOD_VF:%.*]] = urem i64 [[UMAX]], 4
+; CHECK-NEXT:    [[N_VEC:%.*]] = sub i64 [[UMAX]], [[N_MOD_VF]]
+; CHECK-NEXT:    br label %[[VECTOR_BODY:.*]]
+; CHECK:       [[VECTOR_BODY]]:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT:    [[TMP5:%.*]] = add i64 [[INDEX]], 0
+; CHECK-NEXT:    [[TMP6:%.*]] = add i64 [[INDEX]], 1
+; CHECK-NEXT:    [[TMP7:%.*]] = add i64 [[INDEX]], 2
+; CHECK-NEXT:    [[TMP8:%.*]] = add i64 [[INDEX]], 3
+; CHECK-NEXT:    [[TMP9:%.*]] = getelementptr i64, ptr [[SRC_1]], i64 [[TMP5]]
+; CHECK-NEXT:    [[TMP10:%.*]] = getelementptr i64, ptr [[TMP9]], i32 0
+; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <4 x i64>, ptr [[TMP10]], align 4
+; CHECK-NEXT:    [[TMP11:%.*]] = trunc <4 x i64> [[WIDE_LOAD]] to <4 x i32>
+; CHECK-NEXT:    [[TMP12:%.*]] = getelementptr i64, ptr [[SRC_2]], i64 [[TMP5]]
+; CHECK-NEXT:    [[TMP13:%.*]] = getelementptr i64, ptr [[TMP12]], i32 0
+; CHECK-NEXT:    [[WIDE_LOAD12:%.*]] = load <4 x i64>, ptr [[TMP13]], align 4
+; CHECK-NEXT:    [[TMP14:%.*]] = add <4 x i64> [[WIDE_LOAD]], [[WIDE_LOAD12]]
+; CHECK-NEXT:    [[TMP15:%.*]] = getelementptr nusw i64, ptr [[DST_1]], i64 [[TMP5]]
+; CHECK-NEXT:    [[TMP16:%.*]] = getelementptr nusw i64, ptr [[DST_1]], i64 [[TMP6]]
+; CHECK-NEXT:    [[TMP17:%.*]] = getelementptr nusw i64, ptr [[DST_1]], i64 [[TMP7]]
+; CHECK-NEXT:    [[TMP18:%.*]] = getelementptr nusw i64, ptr [[DST_1]], i64 [[TMP8]]
+; CHECK-NEXT:    [[TMP19:%.*]] = extractelement <4 x i32> [[TMP11]], i32 0
+; CHECK-NEXT:    store i32 [[TMP19]], ptr [[TMP15]], align 4
+; CHECK-NEXT:    [[TMP20:%.*]] = extractelement <4 x i32> [[TMP11]], i32 1
+; CHECK-NEXT:    store i32 [[TMP20]], ptr [[TMP16]], align 4
+; CHECK-NEXT:    [[TMP21:%.*]] = extractelement <4 x i32> [[TMP11]], i32 2
+; CHECK-NEXT:    store i32 [[TMP21]], ptr [[TMP17]], align 4
+; CHECK-NEXT:    [[TMP22:%.*]] = extractelement <4 x i32> [[TMP11]], i32 3
+; CHECK-NEXT:    store i32 [[TMP22]], ptr [[TMP18]], align 4
+; CHECK-NEXT:    [[TMP23:%.*]] = getelementptr nusw i64, ptr [[DST_2]], i64 [[TMP5]]
+; CHECK-NEXT:    [[TMP24:%.*]] = getelementptr nusw i64, ptr [[TMP23]], i32 0
+; CHECK-NEXT:    store <4 x i64> [[TMP14]], ptr [[TMP24]], align 4
+; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
+; CHECK-NEXT:    [[TMP25:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
+; CHECK-NEXT:    br i1 [[TMP25]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
+; CHECK:       [[MIDDLE_BLOCK]]:
+; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i64 [[UMAX]], [[N_VEC]]
+; CHECK-NEXT:    br i1 [[CMP_N]], label %[[EXIT:.*]], label %[[SCALAR_PH]]
+; CHECK:       [[SCALAR_PH]]:
+; CHECK-NEXT:    [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], %[[MIDDLE_BLOCK]] ], [ 0, %[[ENTRY]] ], [ 0, %[[VECTOR_MEMCHECK]] ]
+; CHECK-NEXT:    br label %[[LOOP:.*]]
+; CHECK:       [[LOOP]]:
+; CHECK-NEXT:    [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], %[[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], %[[LOOP]] ]
+; CHECK-NEXT:    [[GEP_SRC_1:%.*]] = getelementptr i64, ptr [[SRC_1]], i64 [[IV]]
+; CHECK-NEXT:    [[LD_SRC_1:%.*]] = load i64, ptr [[GEP_SRC_1]], align 4
+; CHECK-NEXT:    [[LD_SRC_1_I32:%.*]] = trunc i64 [[LD_SRC_1]] to i32
+; CHECK-NEXT:    [[GEP_SRC_2:%.*]] = getelementptr i64, ptr [[SRC_2]], i64 [[IV]]
+; CHECK-NEXT:    [[LD_SRC_2:%.*]] = load i64, ptr [[GEP_SRC_2]], align 4
+; CHECK-NEXT:    [[ADD:%.*]] = add i64 [[LD_SRC_1]], [[LD_SRC_2]]
+; CHECK-NEXT:    [[GEP_DST_1:%.*]] = getelementptr nusw i64, ptr [[DST_1]], i64 [[IV]]
+; CHECK-NEXT:    store i32 [[LD_SRC_1_I32]], ptr [[GEP_DST_1]], align 4
+; CHECK-NEXT:    [[GEP_DST_2:%.*]] = getelementptr nusw i64, ptr [[DST_2]], i64 [[IV]]
+; CHECK-NEXT:    store i64 [[ADD]], ptr [[GEP_DST_2]], align 4
+; CHECK-NEXT:    [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
+; CHECK-NEXT:    [[COND:%.*]] = icmp ult i64 [[IV_NEXT]], [[N]]
+; CHECK-NEXT:    br i1 [[COND]], label %[[LOOP]], label %[[EXIT]], !llvm.loop [[LOOP5:![0-9]+]]
+; CHECK:       [[EXIT]]:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
+
+  %gep.src.1 = getelementptr i64, ptr %src.1, i64 %iv
+  %ld.src.1 = load i64, ptr %gep.src.1
+  %ld.src.1.i32 = trunc i64 %ld.src.1 to i32
+
+  %gep.src.2 = getelementptr i64, ptr %src.2, i64 %iv
+  %ld.src.2 = load i64, ptr %gep.src.2
+  %add = add i64 %ld.src.1, %ld.src.2
+
+  %gep.dst.1 = getelementptr nusw i64, ptr %dst.1, i64 %iv
+  store i32 %ld.src.1.i32, ptr %gep.dst.1
+
+  %gep.dst.2 = getelementptr nusw i64, ptr %dst.2, i64 %iv
+  store i64 %add, ptr %gep.dst.2
+
+  %iv.next = add nuw nsw i64 %iv, 1
+  %cond = icmp ult i64 %iv.next, %n
+  br i1 %cond, label %loop, label %exit
+
+exit:
+  ret void
+}
+;.
+; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
+; CHECK: [[META1]] = !{!"llvm.loop.isvectorized", i32 1}
+; CHECK: [[META2]] = !{!"llvm.loop.unroll.runtime.disable"}
+; CHECK: [[LOOP3]] = distinct !{[[LOOP3]], [[META2]], [[META1]]}
+; CHECK: [[LOOP4]] = distinct !{[[LOOP4]], [[META1]], [[META2]]}
+; CHECK: [[LOOP5]] = distinct !{[[LOOP5]], [[META1]]}
+;.
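
The `vector.memcheck` block checked in `@different_types_rt_memcheck` can be read as five unsigned pointer-difference tests that are OR-ed together into the `CONFLICT_RDX` values. The C sketch below mirrors that arithmetic on plain integer addresses. The function names are hypothetical, and the interpretation of the constant 32 as 4 lanes times 8 bytes is an assumption that happens to match `-force-vector-width=4` and the `i64` accesses; the diff itself only shows the literal 32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed decomposition of the literal 32 seen in the CHECK lines. */
enum { VF = 4, ACCESS_BYTES = 8, THRESHOLD = VF * ACCESS_BYTES };
static_assert(THRESHOLD == 32, "matches the constant in the checks");

/* One diff check: sub i64 followed by icmp ult against 32. */
static bool diff_check(uint64_t sink, uint64_t src) {
    return (sink - src) < (uint64_t)THRESHOLD; /* unsigned, wraps */
}

/* Hypothetical model of the OR-reduction; true means the code
   branches to the scalar loop instead of the vector loop. */
bool conflict_rdx_model(uint64_t src1, uint64_t src2,
                        uint64_t dst1, uint64_t dst2) {
    bool conflict = diff_check(dst2, dst1);
    conflict = conflict || diff_check(dst1, src1);
    conflict = conflict || diff_check(dst1, src2);
    conflict = conflict || diff_check(dst2, src1);
    conflict = conflict || diff_check(dst2, src2);
    return conflict;
}
```

Because the subtraction is unsigned, a sink address that lies below its source wraps to a large value and passes the check; only a sink that starts fewer than 32 bytes at or above the other pointer is flagged.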

@david-arm

In the commit message you say "The test suite of LoopVectorize suffers from a coverage hole when types mismatch", but what types are you referring to and in what context? It's not obvious what code path/problem you're trying to test.

@artagnon artagnon force-pushed the lv-type-mismatch-test branch from 1897016 to 78e5e85 Compare March 7, 2025 17:06
@artagnon

artagnon commented Mar 7, 2025

> In the commit message you say "The test suite of LoopVectorize suffers from a coverage hole when types mismatch", but what types are you referring to and in what context? It's not obvious what code path/problem you're trying to test.

Thanks, I've clarified this in the message, and added two more tests.

@artagnon artagnon changed the title [LV] Add target-independent test for type-mismatch [LV] Add target-independent tests for type-mismatch Mar 7, 2025

@fhahn fhahn left a comment

> , and demonstrate that LoopVectorize only ever interleaves when types mismatch.

I think I am missing how type mismatch is related to interleaving? Picking the interleave count should be independent of whether access types mismatch I think?

@artagnon

> > , and demonstrate that LoopVectorize only ever interleaves when types mismatch.
>
> I think I am missing how type mismatch is related to interleaving? Picking the interleave count should be independent of whether access types mismatch I think?

What I'm trying to say is that the examples are not really "vectorized": how would you suggest phrasing this, if there is something to this observation? Otherwise, I can just drop the observation, and simply present the tests as-is.

@artagnon artagnon force-pushed the lv-type-mismatch-test branch from 78e5e85 to 7a7fbf0 Compare March 10, 2025 11:21
@artagnon

Gentle ping. I think these tests are useful, but I might be missing something.


@fhahn fhahn left a comment

> > > , and demonstrate that LoopVectorize only ever interleaves when types mismatch.
> >
> > I think I am missing how type mismatch is related to interleaving? Picking the interleave count should be independent of whether access types mismatch I think?
>
> What I'm trying to say is that the examples are not really "vectorized": how would you suggest phrasing this, if there is something to this observation? Otherwise, I can just drop the observation, and simply present the tests as-is.

Yes, I think it would be good to drop the observation, because it was confusing to me initially. I am also not sure what you mean by "not really vectorized": AFAICT they are vectorized, just with some instructions currently scalarized.
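
To make the "vectorized, with some instructions scalarized" reading concrete, one iteration of the `vector.body` block from the CHECK lines in the diff can be modelled in C as follows. This is a sketch under assumptions: the function name `vector_body_step_model` and the array-based stand-ins for `<4 x i64>` and `<4 x i32>` values are illustrative, and the model says nothing about why the vectorizer chose to scalarize the narrow stores.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { VF = 4 }; /* matches -force-vector-width=4 in the RUN line */

/* Hypothetical model of one vector.body iteration starting at `index`. */
void vector_body_step_model(const uint64_t *src1, const uint64_t *src2,
                            uint64_t *dst1, uint64_t *dst2, uint64_t index) {
    assert(src1 && src2 && dst1 && dst2);
    uint64_t wide_load[VF], wide_load1[VF], sum[VF];
    uint32_t narrow[VF];

    /* two wide loads: load <4 x i64> */
    memcpy(wide_load, &src1[index], sizeof wide_load);
    memcpy(wide_load1, &src2[index], sizeof wide_load1);

    /* vector trunc and vector add, modelled lane by lane */
    for (int lane = 0; lane < VF; ++lane) {
        narrow[lane] = (uint32_t)wide_load[lane];
        sum[lane] = wide_load[lane] + wide_load1[lane];
    }

    /* scalarized part: extractelement plus one i32 store per lane,
       each to an address computed with an 8-byte stride */
    for (int lane = 0; lane < VF; ++lane)
        memcpy(&dst1[index + lane], &narrow[lane], sizeof narrow[lane]);

    /* one wide store: store <4 x i64> */
    memcpy(&dst2[index], sum, sizeof sum);
}
```

In this model the loads, the truncation, the addition, and the store to `dst2` all operate on four lanes at once; only the stores to `dst1` are issued one lane at a time.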

@artagnon artagnon changed the title [LV] Add target-independent tests for type-mismatch [LV] Add tests for type-mismatch Apr 1, 2025
@artagnon artagnon force-pushed the lv-type-mismatch-test branch from 7a7fbf0 to a5fbbf1 Compare April 1, 2025 14:36
@artagnon artagnon changed the title [LV] Add tests for type-mismatch [LV] Add tests for type-mismatch with RT-check conflict rdx Apr 2, 2025
@artagnon artagnon changed the title [LV] Add tests for type-mismatch with RT-check conflict rdx [LV] Exercise type-mismatch with RT-check conflict rdx Apr 2, 2025
@artagnon

artagnon commented Apr 2, 2025

Thanks, your review helped me clarify what I was testing: I've updated the commit message, as well as the patch.


@fhahn fhahn left a comment

LGTM, thanks!

@artagnon artagnon merged commit f7591ee into llvm:main Apr 2, 2025
11 checks passed
@artagnon artagnon deleted the lv-type-mismatch-test branch April 2, 2025 16:22