[LV][NFC] Pre-commit test for supporting strided accesses. #130563
Conversation
@llvm/pr-subscribers-backend-risc-v

Author: Mel Chen (Mel-Chen)

Changes: Duplicate riscv-vector-reverse.ll as riscv-vector-reverse-output.ll to verify all generated IR, not just debug output.

Patch is 40.78 KiB, truncated to 20.00 KiB below; full version: https://github.com/llvm/llvm-project/pull/130563.diff

1 file affected:
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse-output.ll b/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse-output.ll
new file mode 100644
index 0000000000000..a15a8342154ff
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse-output.ll
@@ -0,0 +1,620 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+
+;; This is the loop in C++ being vectorized in this file with
+;; vector.reverse
+;; #pragma clang loop vectorize_width(4, scalable)
+;; for (int i = N-1; i >= 0; --i)
+;; a[i] = b[i] + 1.0;
+
+; RUN: opt -passes=loop-vectorize -mtriple=riscv64 -mattr=+v \
+; RUN: -riscv-v-vector-bits-min=128 -S < %s \
+; RUN: | FileCheck --check-prefix=RV64 %s
+
+; RUN: opt -passes=loop-vectorize -mtriple=riscv32 -mattr=+v \
+; RUN: -riscv-v-vector-bits-min=128 -S < %s \
+; RUN: | FileCheck --check-prefix=RV32 %s
+
+; RUN: opt -passes=loop-vectorize -mtriple=riscv64 -mattr=+v \
+; RUN: -riscv-v-vector-bits-min=128 -force-vector-interleave=2 -S < %s \
+; RUN: | FileCheck --check-prefix=RV64-UF2 %s
+
+define void @vector_reverse_i64(ptr noalias %A, ptr noalias %B, i32 %n) {
+; RV64-LABEL: define void @vector_reverse_i64(
+; RV64-SAME: ptr noalias [[A:%.*]], ptr noalias [[B:%.*]], i32 [[N:%.*]]) #[[ATTR0:[0-9]+]] {
+; RV64-NEXT: [[ENTRY:.*:]]
+; RV64-NEXT: [[CMP7:%.*]] = icmp sgt i32 [[N]], 0
+; RV64-NEXT: br i1 [[CMP7]], label %[[FOR_BODY_PREHEADER:.*]], label %[[FOR_COND_CLEANUP:.*]]
+; RV64: [[FOR_BODY_PREHEADER]]:
+; RV64-NEXT: [[TMP0:%.*]] = zext i32 [[N]] to i64
+; RV64-NEXT: [[TMP1:%.*]] = call i64 @llvm.vscale.i64()
+; RV64-NEXT: [[TMP2:%.*]] = mul i64 [[TMP1]], 4
+; RV64-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[TMP0]], [[TMP2]]
+; RV64-NEXT: br i1 [[MIN_ITERS_CHECK]], label %[[SCALAR_PH:.*]], label %[[VECTOR_SCEVCHECK:.*]]
+; RV64: [[VECTOR_SCEVCHECK]]:
+; RV64-NEXT: [[TMP3:%.*]] = add nsw i64 [[TMP0]], -1
+; RV64-NEXT: [[TMP4:%.*]] = add i32 [[N]], -1
+; RV64-NEXT: [[TMP5:%.*]] = trunc i64 [[TMP3]] to i32
+; RV64-NEXT: [[MUL:%.*]] = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 1, i32 [[TMP5]])
+; RV64-NEXT: [[MUL_RESULT:%.*]] = extractvalue { i32, i1 } [[MUL]], 0
+; RV64-NEXT: [[MUL_OVERFLOW:%.*]] = extractvalue { i32, i1 } [[MUL]], 1
+; RV64-NEXT: [[TMP6:%.*]] = sub i32 [[TMP4]], [[MUL_RESULT]]
+; RV64-NEXT: [[TMP7:%.*]] = icmp ugt i32 [[TMP6]], [[TMP4]]
+; RV64-NEXT: [[TMP8:%.*]] = or i1 [[TMP7]], [[MUL_OVERFLOW]]
+; RV64-NEXT: [[TMP9:%.*]] = icmp ugt i64 [[TMP3]], 4294967295
+; RV64-NEXT: [[TMP10:%.*]] = or i1 [[TMP8]], [[TMP9]]
+; RV64-NEXT: br i1 [[TMP10]], label %[[SCALAR_PH]], label %[[VECTOR_PH:.*]]
+; RV64: [[VECTOR_PH]]:
+; RV64-NEXT: [[TMP11:%.*]] = call i64 @llvm.vscale.i64()
+; RV64-NEXT: [[TMP12:%.*]] = mul i64 [[TMP11]], 4
+; RV64-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[TMP0]], [[TMP12]]
+; RV64-NEXT: [[N_VEC:%.*]] = sub i64 [[TMP0]], [[N_MOD_VF]]
+; RV64-NEXT: [[TMP13:%.*]] = call i64 @llvm.vscale.i64()
+; RV64-NEXT: [[TMP14:%.*]] = mul i64 [[TMP13]], 4
+; RV64-NEXT: [[TMP15:%.*]] = sub i64 [[TMP0]], [[N_VEC]]
+; RV64-NEXT: [[DOTCAST:%.*]] = trunc i64 [[N_VEC]] to i32
+; RV64-NEXT: [[TMP16:%.*]] = sub i32 [[N]], [[DOTCAST]]
+; RV64-NEXT: br label %[[VECTOR_BODY:.*]]
+; RV64: [[VECTOR_BODY]]:
+; RV64-NEXT: [[INDEX:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; RV64-NEXT: [[DOTCAST1:%.*]] = trunc i64 [[INDEX]] to i32
+; RV64-NEXT: [[OFFSET_IDX:%.*]] = sub i32 [[N]], [[DOTCAST1]]
+; RV64-NEXT: [[TMP17:%.*]] = add i32 [[OFFSET_IDX]], 0
+; RV64-NEXT: [[TMP18:%.*]] = add nsw i32 [[TMP17]], -1
+; RV64-NEXT: [[TMP19:%.*]] = zext i32 [[TMP18]] to i64
+; RV64-NEXT: [[TMP20:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[TMP19]]
+; RV64-NEXT: [[TMP21:%.*]] = mul i64 0, [[TMP14]]
+; RV64-NEXT: [[TMP22:%.*]] = sub i64 1, [[TMP14]]
+; RV64-NEXT: [[TMP23:%.*]] = getelementptr inbounds i32, ptr [[TMP20]], i64 [[TMP21]]
+; RV64-NEXT: [[TMP24:%.*]] = getelementptr inbounds i32, ptr [[TMP23]], i64 [[TMP22]]
+; RV64-NEXT: [[WIDE_LOAD:%.*]] = load <vscale x 4 x i32>, ptr [[TMP24]], align 4
+; RV64-NEXT: [[REVERSE:%.*]] = call <vscale x 4 x i32> @llvm.vector.reverse.nxv4i32(<vscale x 4 x i32> [[WIDE_LOAD]])
+; RV64-NEXT: [[TMP25:%.*]] = add <vscale x 4 x i32> [[REVERSE]], splat (i32 1)
+; RV64-NEXT: [[TMP26:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[TMP19]]
+; RV64-NEXT: [[TMP27:%.*]] = mul i64 0, [[TMP14]]
+; RV64-NEXT: [[TMP28:%.*]] = sub i64 1, [[TMP14]]
+; RV64-NEXT: [[TMP29:%.*]] = getelementptr inbounds i32, ptr [[TMP26]], i64 [[TMP27]]
+; RV64-NEXT: [[TMP30:%.*]] = getelementptr inbounds i32, ptr [[TMP29]], i64 [[TMP28]]
+; RV64-NEXT: [[REVERSE2:%.*]] = call <vscale x 4 x i32> @llvm.vector.reverse.nxv4i32(<vscale x 4 x i32> [[TMP25]])
+; RV64-NEXT: store <vscale x 4 x i32> [[REVERSE2]], ptr [[TMP30]], align 4
+; RV64-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP14]]
+; RV64-NEXT: [[TMP31:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
+; RV64-NEXT: br i1 [[TMP31]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
+; RV64: [[MIDDLE_BLOCK]]:
+; RV64-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[TMP0]], [[N_VEC]]
+; RV64-NEXT: br i1 [[CMP_N]], label %[[FOR_COND_CLEANUP_LOOPEXIT:.*]], label %[[SCALAR_PH]]
+; RV64: [[SCALAR_PH]]:
+; RV64-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[TMP15]], %[[MIDDLE_BLOCK]] ], [ [[TMP0]], %[[FOR_BODY_PREHEADER]] ], [ [[TMP0]], %[[VECTOR_SCEVCHECK]] ]
+; RV64-NEXT: [[BC_RESUME_VAL3:%.*]] = phi i32 [ [[TMP16]], %[[MIDDLE_BLOCK]] ], [ [[N]], %[[FOR_BODY_PREHEADER]] ], [ [[N]], %[[VECTOR_SCEVCHECK]] ]
+; RV64-NEXT: br label %[[FOR_BODY:.*]]
+; RV64: [[FOR_COND_CLEANUP_LOOPEXIT]]:
+; RV64-NEXT: br label %[[FOR_COND_CLEANUP]]
+; RV64: [[FOR_COND_CLEANUP]]:
+; RV64-NEXT: ret void
+; RV64: [[FOR_BODY]]:
+; RV64-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], %[[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], %[[FOR_BODY]] ]
+; RV64-NEXT: [[I_0_IN8:%.*]] = phi i32 [ [[BC_RESUME_VAL3]], %[[SCALAR_PH]] ], [ [[I_0:%.*]], %[[FOR_BODY]] ]
+; RV64-NEXT: [[I_0]] = add nsw i32 [[I_0_IN8]], -1
+; RV64-NEXT: [[IDXPROM:%.*]] = zext i32 [[I_0]] to i64
+; RV64-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[IDXPROM]]
+; RV64-NEXT: [[TMP32:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
+; RV64-NEXT: [[ADD9:%.*]] = add i32 [[TMP32]], 1
+; RV64-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[IDXPROM]]
+; RV64-NEXT: store i32 [[ADD9]], ptr [[ARRAYIDX3]], align 4
+; RV64-NEXT: [[CMP:%.*]] = icmp ugt i64 [[INDVARS_IV]], 1
+; RV64-NEXT: [[INDVARS_IV_NEXT]] = add nsw i64 [[INDVARS_IV]], -1
+; RV64-NEXT: br i1 [[CMP]], label %[[FOR_BODY]], label %[[FOR_COND_CLEANUP_LOOPEXIT]], !llvm.loop [[LOOP4:![0-9]+]]
+;
+; RV32-LABEL: define void @vector_reverse_i64(
+; RV32-SAME: ptr noalias [[A:%.*]], ptr noalias [[B:%.*]], i32 [[N:%.*]]) #[[ATTR0:[0-9]+]] {
+; RV32-NEXT: [[ENTRY:.*:]]
+; RV32-NEXT: [[CMP7:%.*]] = icmp sgt i32 [[N]], 0
+; RV32-NEXT: br i1 [[CMP7]], label %[[FOR_BODY_PREHEADER:.*]], label %[[FOR_COND_CLEANUP:.*]]
+; RV32: [[FOR_BODY_PREHEADER]]:
+; RV32-NEXT: [[TMP0:%.*]] = zext i32 [[N]] to i64
+; RV32-NEXT: [[TMP1:%.*]] = call i64 @llvm.vscale.i64()
+; RV32-NEXT: [[TMP2:%.*]] = mul i64 [[TMP1]], 4
+; RV32-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[TMP0]], [[TMP2]]
+; RV32-NEXT: br i1 [[MIN_ITERS_CHECK]], label %[[SCALAR_PH:.*]], label %[[VECTOR_PH:.*]]
+; RV32: [[VECTOR_PH]]:
+; RV32-NEXT: [[TMP3:%.*]] = call i64 @llvm.vscale.i64()
+; RV32-NEXT: [[TMP4:%.*]] = mul i64 [[TMP3]], 4
+; RV32-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[TMP0]], [[TMP4]]
+; RV32-NEXT: [[N_VEC:%.*]] = sub i64 [[TMP0]], [[N_MOD_VF]]
+; RV32-NEXT: [[TMP5:%.*]] = call i64 @llvm.vscale.i64()
+; RV32-NEXT: [[TMP6:%.*]] = mul i64 [[TMP5]], 4
+; RV32-NEXT: [[TMP7:%.*]] = sub i64 [[TMP0]], [[N_VEC]]
+; RV32-NEXT: [[DOTCAST:%.*]] = trunc i64 [[N_VEC]] to i32
+; RV32-NEXT: [[TMP8:%.*]] = sub i32 [[N]], [[DOTCAST]]
+; RV32-NEXT: br label %[[VECTOR_BODY:.*]]
+; RV32: [[VECTOR_BODY]]:
+; RV32-NEXT: [[INDEX:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; RV32-NEXT: [[DOTCAST1:%.*]] = trunc i64 [[INDEX]] to i32
+; RV32-NEXT: [[OFFSET_IDX:%.*]] = sub i32 [[N]], [[DOTCAST1]]
+; RV32-NEXT: [[TMP9:%.*]] = add i32 [[OFFSET_IDX]], 0
+; RV32-NEXT: [[TMP10:%.*]] = add nsw i32 [[TMP9]], -1
+; RV32-NEXT: [[TMP11:%.*]] = zext i32 [[TMP10]] to i64
+; RV32-NEXT: [[TMP12:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[TMP11]]
+; RV32-NEXT: [[TMP13:%.*]] = trunc i64 [[TMP6]] to i32
+; RV32-NEXT: [[TMP14:%.*]] = mul i32 0, [[TMP13]]
+; RV32-NEXT: [[TMP15:%.*]] = sub i32 1, [[TMP13]]
+; RV32-NEXT: [[TMP16:%.*]] = getelementptr inbounds i32, ptr [[TMP12]], i32 [[TMP14]]
+; RV32-NEXT: [[TMP17:%.*]] = getelementptr inbounds i32, ptr [[TMP16]], i32 [[TMP15]]
+; RV32-NEXT: [[WIDE_LOAD:%.*]] = load <vscale x 4 x i32>, ptr [[TMP17]], align 4
+; RV32-NEXT: [[REVERSE:%.*]] = call <vscale x 4 x i32> @llvm.vector.reverse.nxv4i32(<vscale x 4 x i32> [[WIDE_LOAD]])
+; RV32-NEXT: [[TMP18:%.*]] = add <vscale x 4 x i32> [[REVERSE]], splat (i32 1)
+; RV32-NEXT: [[TMP19:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[TMP11]]
+; RV32-NEXT: [[TMP20:%.*]] = trunc i64 [[TMP6]] to i32
+; RV32-NEXT: [[TMP21:%.*]] = mul i32 0, [[TMP20]]
+; RV32-NEXT: [[TMP22:%.*]] = sub i32 1, [[TMP20]]
+; RV32-NEXT: [[TMP23:%.*]] = getelementptr inbounds i32, ptr [[TMP19]], i32 [[TMP21]]
+; RV32-NEXT: [[TMP24:%.*]] = getelementptr inbounds i32, ptr [[TMP23]], i32 [[TMP22]]
+; RV32-NEXT: [[REVERSE2:%.*]] = call <vscale x 4 x i32> @llvm.vector.reverse.nxv4i32(<vscale x 4 x i32> [[TMP18]])
+; RV32-NEXT: store <vscale x 4 x i32> [[REVERSE2]], ptr [[TMP24]], align 4
+; RV32-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP6]]
+; RV32-NEXT: [[TMP25:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
+; RV32-NEXT: br i1 [[TMP25]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
+; RV32: [[MIDDLE_BLOCK]]:
+; RV32-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[TMP0]], [[N_VEC]]
+; RV32-NEXT: br i1 [[CMP_N]], label %[[FOR_COND_CLEANUP_LOOPEXIT:.*]], label %[[SCALAR_PH]]
+; RV32: [[SCALAR_PH]]:
+; RV32-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[TMP7]], %[[MIDDLE_BLOCK]] ], [ [[TMP0]], %[[FOR_BODY_PREHEADER]] ]
+; RV32-NEXT: [[BC_RESUME_VAL3:%.*]] = phi i32 [ [[TMP8]], %[[MIDDLE_BLOCK]] ], [ [[N]], %[[FOR_BODY_PREHEADER]] ]
+; RV32-NEXT: br label %[[FOR_BODY:.*]]
+; RV32: [[FOR_COND_CLEANUP_LOOPEXIT]]:
+; RV32-NEXT: br label %[[FOR_COND_CLEANUP]]
+; RV32: [[FOR_COND_CLEANUP]]:
+; RV32-NEXT: ret void
+; RV32: [[FOR_BODY]]:
+; RV32-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], %[[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], %[[FOR_BODY]] ]
+; RV32-NEXT: [[I_0_IN8:%.*]] = phi i32 [ [[BC_RESUME_VAL3]], %[[SCALAR_PH]] ], [ [[I_0:%.*]], %[[FOR_BODY]] ]
+; RV32-NEXT: [[I_0]] = add nsw i32 [[I_0_IN8]], -1
+; RV32-NEXT: [[IDXPROM:%.*]] = zext i32 [[I_0]] to i64
+; RV32-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[IDXPROM]]
+; RV32-NEXT: [[TMP26:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
+; RV32-NEXT: [[ADD9:%.*]] = add i32 [[TMP26]], 1
+; RV32-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[IDXPROM]]
+; RV32-NEXT: store i32 [[ADD9]], ptr [[ARRAYIDX3]], align 4
+; RV32-NEXT: [[CMP:%.*]] = icmp ugt i64 [[INDVARS_IV]], 1
+; RV32-NEXT: [[INDVARS_IV_NEXT]] = add nsw i64 [[INDVARS_IV]], -1
+; RV32-NEXT: br i1 [[CMP]], label %[[FOR_BODY]], label %[[FOR_COND_CLEANUP_LOOPEXIT]], !llvm.loop [[LOOP4:![0-9]+]]
+;
+; RV64-UF2-LABEL: define void @vector_reverse_i64(
+; RV64-UF2-SAME: ptr noalias [[A:%.*]], ptr noalias [[B:%.*]], i32 [[N:%.*]]) #[[ATTR0:[0-9]+]] {
+; RV64-UF2-NEXT: [[ENTRY:.*:]]
+; RV64-UF2-NEXT: [[CMP7:%.*]] = icmp sgt i32 [[N]], 0
+; RV64-UF2-NEXT: br i1 [[CMP7]], label %[[FOR_BODY_PREHEADER:.*]], label %[[FOR_COND_CLEANUP:.*]]
+; RV64-UF2: [[FOR_BODY_PREHEADER]]:
+; RV64-UF2-NEXT: [[TMP0:%.*]] = zext i32 [[N]] to i64
+; RV64-UF2-NEXT: [[TMP1:%.*]] = call i64 @llvm.vscale.i64()
+; RV64-UF2-NEXT: [[TMP2:%.*]] = mul i64 [[TMP1]], 8
+; RV64-UF2-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[TMP0]], [[TMP2]]
+; RV64-UF2-NEXT: br i1 [[MIN_ITERS_CHECK]], label %[[SCALAR_PH:.*]], label %[[VECTOR_SCEVCHECK:.*]]
+; RV64-UF2: [[VECTOR_SCEVCHECK]]:
+; RV64-UF2-NEXT: [[TMP3:%.*]] = add nsw i64 [[TMP0]], -1
+; RV64-UF2-NEXT: [[TMP4:%.*]] = add i32 [[N]], -1
+; RV64-UF2-NEXT: [[TMP5:%.*]] = trunc i64 [[TMP3]] to i32
+; RV64-UF2-NEXT: [[MUL:%.*]] = call { i32, i1 } @llvm.umul.with.overflow.i32(i32 1, i32 [[TMP5]])
+; RV64-UF2-NEXT: [[MUL_RESULT:%.*]] = extractvalue { i32, i1 } [[MUL]], 0
+; RV64-UF2-NEXT: [[MUL_OVERFLOW:%.*]] = extractvalue { i32, i1 } [[MUL]], 1
+; RV64-UF2-NEXT: [[TMP6:%.*]] = sub i32 [[TMP4]], [[MUL_RESULT]]
+; RV64-UF2-NEXT: [[TMP7:%.*]] = icmp ugt i32 [[TMP6]], [[TMP4]]
+; RV64-UF2-NEXT: [[TMP8:%.*]] = or i1 [[TMP7]], [[MUL_OVERFLOW]]
+; RV64-UF2-NEXT: [[TMP9:%.*]] = icmp ugt i64 [[TMP3]], 4294967295
+; RV64-UF2-NEXT: [[TMP10:%.*]] = or i1 [[TMP8]], [[TMP9]]
+; RV64-UF2-NEXT: br i1 [[TMP10]], label %[[SCALAR_PH]], label %[[VECTOR_PH:.*]]
+; RV64-UF2: [[VECTOR_PH]]:
+; RV64-UF2-NEXT: [[TMP11:%.*]] = call i64 @llvm.vscale.i64()
+; RV64-UF2-NEXT: [[TMP12:%.*]] = mul i64 [[TMP11]], 8
+; RV64-UF2-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[TMP0]], [[TMP12]]
+; RV64-UF2-NEXT: [[N_VEC:%.*]] = sub i64 [[TMP0]], [[N_MOD_VF]]
+; RV64-UF2-NEXT: [[TMP13:%.*]] = call i64 @llvm.vscale.i64()
+; RV64-UF2-NEXT: [[TMP14:%.*]] = mul i64 [[TMP13]], 4
+; RV64-UF2-NEXT: [[TMP15:%.*]] = mul i64 [[TMP14]], 2
+; RV64-UF2-NEXT: [[TMP16:%.*]] = sub i64 [[TMP0]], [[N_VEC]]
+; RV64-UF2-NEXT: [[DOTCAST:%.*]] = trunc i64 [[N_VEC]] to i32
+; RV64-UF2-NEXT: [[TMP17:%.*]] = sub i32 [[N]], [[DOTCAST]]
+; RV64-UF2-NEXT: br label %[[VECTOR_BODY:.*]]
+; RV64-UF2: [[VECTOR_BODY]]:
+; RV64-UF2-NEXT: [[INDEX:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; RV64-UF2-NEXT: [[DOTCAST1:%.*]] = trunc i64 [[INDEX]] to i32
+; RV64-UF2-NEXT: [[OFFSET_IDX:%.*]] = sub i32 [[N]], [[DOTCAST1]]
+; RV64-UF2-NEXT: [[TMP18:%.*]] = add i32 [[OFFSET_IDX]], 0
+; RV64-UF2-NEXT: [[TMP19:%.*]] = add nsw i32 [[TMP18]], -1
+; RV64-UF2-NEXT: [[TMP20:%.*]] = zext i32 [[TMP19]] to i64
+; RV64-UF2-NEXT: [[TMP21:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[TMP20]]
+; RV64-UF2-NEXT: [[TMP22:%.*]] = mul i64 0, [[TMP14]]
+; RV64-UF2-NEXT: [[TMP23:%.*]] = sub i64 1, [[TMP14]]
+; RV64-UF2-NEXT: [[TMP24:%.*]] = getelementptr inbounds i32, ptr [[TMP21]], i64 [[TMP22]]
+; RV64-UF2-NEXT: [[TMP25:%.*]] = getelementptr inbounds i32, ptr [[TMP24]], i64 [[TMP23]]
+; RV64-UF2-NEXT: [[TMP26:%.*]] = mul i64 -1, [[TMP14]]
+; RV64-UF2-NEXT: [[TMP27:%.*]] = sub i64 1, [[TMP14]]
+; RV64-UF2-NEXT: [[TMP28:%.*]] = getelementptr inbounds i32, ptr [[TMP21]], i64 [[TMP26]]
+; RV64-UF2-NEXT: [[TMP29:%.*]] = getelementptr inbounds i32, ptr [[TMP28]], i64 [[TMP27]]
+; RV64-UF2-NEXT: [[WIDE_LOAD:%.*]] = load <vscale x 4 x i32>, ptr [[TMP25]], align 4
+; RV64-UF2-NEXT: [[REVERSE:%.*]] = call <vscale x 4 x i32> @llvm.vector.reverse.nxv4i32(<vscale x 4 x i32> [[WIDE_LOAD]])
+; RV64-UF2-NEXT: [[WIDE_LOAD2:%.*]] = load <vscale x 4 x i32>, ptr [[TMP29]], align 4
+; RV64-UF2-NEXT: [[REVERSE3:%.*]] = call <vscale x 4 x i32> @llvm.vector.reverse.nxv4i32(<vscale x 4 x i32> [[WIDE_LOAD2]])
+; RV64-UF2-NEXT: [[TMP30:%.*]] = add <vscale x 4 x i32> [[REVERSE]], splat (i32 1)
+; RV64-UF2-NEXT: [[TMP31:%.*]] = add <vscale x 4 x i32> [[REVERSE3]], splat (i32 1)
+; RV64-UF2-NEXT: [[TMP32:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[TMP20]]
+; RV64-UF2-NEXT: [[TMP33:%.*]] = mul i64 0, [[TMP14]]
+; RV64-UF2-NEXT: [[TMP34:%.*]] = sub i64 1, [[TMP14]]
+; RV64-UF2-NEXT: [[TMP35:%.*]] = getelementptr inbounds i32, ptr [[TMP32]], i64 [[TMP33]]
+; RV64-UF2-NEXT: [[TMP36:%.*]] = getelementptr inbounds i32, ptr [[TMP35]], i64 [[TMP34]]
+; RV64-UF2-NEXT: [[TMP37:%.*]] = mul i64 -1, [[TMP14]]
+; RV64-UF2-NEXT: [[TMP38:%.*]] = sub i64 1, [[TMP14]]
+; RV64-UF2-NEXT: [[TMP39:%.*]] = getelementptr inbounds i32, ptr [[TMP32]], i64 [[TMP37]]
+; RV64-UF2-NEXT: [[TMP40:%.*]] = getelementptr inbounds i32, ptr [[TMP39]], i64 [[TMP38]]
+; RV64-UF2-NEXT: [[REVERSE4:%.*]] = call <vscale x 4 x i32> @llvm.vector.reverse.nxv4i32(<vscale x 4 x i32> [[TMP30]])
+; RV64-UF2-NEXT: store <vscale x 4 x i32> [[REVERSE4]], ptr [[TMP36]], align 4
+; RV64-UF2-NEXT: [[REVERSE5:%.*]] = call <vscale x 4 x i32> @llvm.vector.reverse.nxv4i32(<vscale x 4 x i32> [[TMP31]])
+; RV64-UF2-NEXT: store <vscale x 4 x i32> [[REVERSE5]], ptr [[TMP40]], align 4
+; RV64-UF2-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP15]]
+; RV64-UF2-NEXT: [[TMP41:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
+; RV64-UF2-NEXT: br i1 [[TMP41]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
+; RV64-UF2: [[MIDDLE_BLOCK]]:
+; RV64-UF2-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[TMP0]], [[N_VEC]]
+; RV64-UF2-NEXT: br i1 [[CMP_N]], label %[[FOR_COND_CLEANUP_LOOPEXIT:.*]], label %[[SCALAR_PH]]
+; RV64-UF2: [[SCALAR_PH]]:
+; RV64-UF2-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[TMP16]], %[[MIDDLE_BLOCK]] ], [ [[TMP0]], %[[FOR_BODY_PREHEADER]] ], [ [[TMP0]], %[[VECTOR_SCEVCHECK]] ]
+; RV64-UF2-NEXT: [[BC_RESUME_VAL6:%.*]] = phi i32 [ [[TMP17]], %[[MIDDLE_BLOCK]] ], [ [[N]], %[[FOR_BODY_PREHEADER]] ], [ [[N]], %[[VECTOR_SCEVCHECK]] ]
+; RV64-UF2-NEXT: br label %[[FOR_BODY:.*]]
+; RV64-UF2: [[FOR_COND_CLEANUP_LOOPEXIT]]:
+; RV64-UF2-NEXT: br label %[[FOR_COND_CLEANUP]]
+; RV64-UF2: [[FOR_COND_CLEANUP]]:
+; RV64-UF2-NEXT: ret void
+; RV64-UF2: [[FOR_BODY]]:
+; RV64-UF2-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], %[[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], %[[FOR_BODY]] ]
+; RV64-UF2-NEXT: [[I_0_IN8:%.*]] = phi i32 [ [[BC_RESUME_VAL6]], %[[SCALAR_PH]] ], [ [[I_0:%.*]], %[[FOR_BODY]] ]
+; RV64-UF2-NEXT: [[I_0]] = add nsw i32 [[I_0_IN8]], -1
+; RV64-UF2-NEXT: [[IDXPROM:%.*]] = zext i32 [[I_0]] to i64
+; RV64-UF2-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds i32, ptr [[B]], i64 [[IDXPROM]]
+; RV64-UF2-NEXT: [[TMP42:%.*]] = load i32, ptr [[ARRAYIDX]], align 4
+; RV64-UF2-NEXT: [[ADD9:%.*]] = add i32 [[TMP42]], 1
+; RV64-UF2-NEXT: [[ARRAYIDX3:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[IDXPROM]]
+; RV64-UF2-NEXT: store i32 [[ADD9]], ptr [[ARRAYIDX3]], align 4
+; RV64-UF2-NEXT: [[CMP:%.*]] = icmp ugt i64 [[INDVARS_IV]], 1
+; RV64-UF2-NEXT: [[INDVARS_IV_NEXT]] = add nsw i64 [[INDVARS_IV]], -1
+; RV64-UF2-NEXT: br i1 [[CMP]], label %[[FOR_BODY]], label %[[FOR_COND_CLEANUP_LOOPEXIT]], !llvm.loop [[LOOP4:![0-9]+]]
+;
+entry:
+ %cmp7 = icmp sgt i32 %n, 0
+ br i1 %cmp7, label %for.body.preheader, label %for.cond.cleanup
+
+for.body.preheader: ; preds = %entry
+ %0 = zext i32 %n to i64
+ br label %for.body
+
+for.cond.cleanup: ; preds = %for.body, %entry
+ ret void
+
+for.body: ; preds = %for.body.preheader, %for.body
+ %indvars.iv = phi i64 [ %0, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
+ %i.0.in8 = phi i32 [ %n, %for.body.preheader ], [ %i.0, %for.body ]
+ %i.0 = add nsw i32 %i.0.in8, -1
+ %idxprom = zext i32 %i.0 to i64
+ %arrayidx = getelementptr inbounds i32, ptr %B, i64 %idxprom
+ %1 = load i32, ptr %arrayidx, align 4
+ %add9 = add i32 %1, 1
+ %arrayidx3 = getelementptr inbounds i32, ptr %A, i64 %idxprom
+ store i32 %add9, ptr %arrayidx3, align 4
+ %cmp = icmp ugt i64 %indvars.iv, 1
+ %indvars.iv.next = add nsw i64 %indvars.iv, -1
+ br i1 %cmp, label %for.body, label %for.cond.cleanup, !llvm.loop !0
+}
+
+define voi...
[truncated]
Force-pushed from 4decdce to b1cce79 (Compare)
; RV64-NEXT:    [[REVERSE:%.*]] = call <vscale x 4 x i32> @llvm.vector.reverse.nxv4i32(<vscale x 4 x i32> [[WIDE_LOAD]])
; RV64-NEXT:    [[TMP14:%.*]] = add <vscale x 4 x i32> [[REVERSE]], splat (i32 1)
; RV64-NEXT:    [[TMP15:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[TMP8]]
; RV64-NEXT: [[TMP16:%.*]] = mul i64 0, [[TMP5]] |
If it's happening purely at the IR level, the IRBuilder should constant-fold it. I suspect #125365 would fix this. EDIT: Sorry, both operands aren't constants, so how can the constant folder fold this? It requires something like InstCombine.
I thought ConstantExpr::getBinOpAbsorber could fold X * 0 -> 0? Maybe the operands need to be swapped around here if it only folds RHS constants?

Value *NumElt = Builder.CreateMul(
    ConstantInt::get(IndexTy, -(int64_t)CurrentPart), RunTimeVF);
Sorry I left this pending for so long: I was away on vacation. Yes, getBinOpAbsorber can fold either an LHS const or an RHS const; however, is it used in IRBuilder::CreateMul? I think it only calls a variant of FoldBinOp:

Value *FoldBinOp(Instruction::BinaryOps Opc, Value *LHS,
                 Value *RHS) const override {
  auto *LC = dyn_cast<Constant>(LHS);
  auto *RC = dyn_cast<Constant>(RHS);
  if (LC && RC) {
    if (ConstantExpr::isDesirableBinOp(Opc))
      return ConstantExpr::get(Opc, LC, RC);
    return ConstantFoldBinaryInstruction(Opc, LC, RC);
  }
  return nullptr;
}

I think this is a real problem in LV, and I've seen variations of the problem in many other tests; I will try to attack it in follow-ups to my constant-folding patch.
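A minimal standalone sketch of the behavior described above (illustrative only: the demo function, value names, and driver below are invented for this example, not code from LV or this patch). With IRBuilder's default ConstantFolder, FoldBinOp bails out unless both operands are Constants, so a multiply of a constant zero by a runtime value is emitted as a real mul instruction in either operand order.

#include "llvm/IR/Constants.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

int main() {
  LLVMContext Ctx;
  Module M("fold-demo", Ctx);
  IRBuilder<> Builder(Ctx); // default ConstantFolder, as in LV

  // void demo(i64 %runtime.vf) -- %runtime.vf stands in for RunTimeVF.
  auto *FnTy = FunctionType::get(Type::getVoidTy(Ctx),
                                 {Type::getInt64Ty(Ctx)}, /*isVarArg=*/false);
  auto *Fn = Function::Create(FnTy, Function::ExternalLinkage, "demo", &M);
  Fn->getArg(0)->setName("runtime.vf");
  Builder.SetInsertPoint(BasicBlock::Create(Ctx, "entry", Fn));

  Value *RunTimeVF = Fn->getArg(0);                         // not a Constant
  Value *Zero = ConstantInt::get(Type::getInt64Ty(Ctx), 0); // a Constant

  // FoldBinOp only folds when LHS and RHS are *both* Constants, so neither
  // order is folded to 0; both calls create a mul instruction.
  Value *A = Builder.CreateMul(Zero, RunTimeVF, "zero.times.vf");
  Value *B = Builder.CreateMul(RunTimeVF, Zero, "vf.times.zero");
  Builder.CreateRetVoid();

  errs() << "0 * vf folded to constant? " << (isa<Constant>(A) ? "yes" : "no") << "\n";
  errs() << "vf * 0 folded to constant? " << (isa<Constant>(B) ? "yes" : "no") << "\n";
  M.print(errs(), /*AAW=*/nullptr); // shows the un-folded mul instructions
  return 0;
}

Built against LLVM (e.g. clang++ demo.cpp $(llvm-config --cxxflags --ldflags --libs core support)), this should print "no" for both orders; getting rid of the 0 * VF term takes something like InstCombine or an InstSimplify-based folder rather than the plain ConstantFolder.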
ping
LGTM, thanks. Please update to current main and check if the test is passing before merging :)
Force-pushed from b1cce79 to 9d825d9 (Compare)
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/81/builds/5355
Duplicate riscv-vector-reverse.ll as riscv-vector-reverse-output.ll to verify all generated IR, not just debug output.
Pre-commit for #128718.