In doc/rvv-intrinsic-examples.adoc, the matmul example is out of date: the pointers ptr_a and ptr_b are never advanced inside the inner loop, so every iteration reloads the same elements.
void matmul_rvv(double *a, double *b, double *c, int n, int m, int p) {
  size_t vlmax = __riscv_vsetvlmax_e64m1();
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < m; ++j) {
      double *ptr_a = &a[i * p];
      double *ptr_b = &b[j];
      int k = p;
      // Set accumulator to zero.
      vfloat64m1_t vec_s = __riscv_vfmv_v_f_f64m1(0.0, vlmax);
      vfloat64m1_t vec_zero = __riscv_vfmv_v_f_f64m1(0.0, vlmax);
      for (size_t vl; k > 0; k -= vl) {
        vl = __riscv_vsetvl_e64m1(k);
        // Load row a[i][k..k+vl)
        vfloat64m1_t vec_a = __riscv_vle64_v_f64m1(ptr_a, vl);
        // Load column b[k..k+vl)[j]
        vfloat64m1_t vec_b =
            __riscv_vlse64_v_f64m1(ptr_b, sizeof(double) * m, vl);
        // Accumulate dot product of row and column. If vl < vlmax we need to
        // preserve the existing values of vec_s, hence the tu policy.
        vec_s = __riscv_vfmacc_vv_f64m1_tu(vec_s, vec_a, vec_b, vl);
      }
      // Final accumulation.
      vfloat64m1_t vec_sum =
          __riscv_vfredusum_vs_f64m1_f64m1(vec_s, vec_zero, vlmax);
      double sum = __riscv_vfmv_f_s_f64m1_f64(vec_sum);
      c[i * m + j] = sum;
    }
}
The correct version is in examples/rvv_matmul.c. I think we should either use that code directly or fix the documentation example as follows:
- for (size_t vl; k > 0; k -= vl) {
+ for (size_t vl; k > 0; k -= vl, ptr_a += vl, ptr_b += vl * m) {
...
}
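To make the effect of the proposed change easier to check without a RISC-V toolchain, here is a scalar sketch of the same strip-mined loop. It is only a model, not the code from examples/rvv_matmul.c: `MODEL_VL` and `matmul_stripmined_model` are hypothetical names, and the fixed chunk size of 4 stands in for whatever `__riscv_vsetvl_e64m1` would return. The loop header mirrors the suggested fix, advancing `ptr_a` by `vl` elements and `ptr_b` by `vl` rows of `m` elements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chunk size standing in for the hardware vector length. */
#define MODEL_VL 4

/* Scalar model of the strip-mined loop.
 * Computes c = a * b with a (n x p), b (p x m), c (n x m), all row-major. */
void matmul_stripmined_model(const double *a, const double *b, double *c,
                             int n, int m, int p) {
  /* Assumed precondition: m must be positive for the stride arithmetic. */
  assert(n >= 0 && m > 0 && p >= 0);
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < m; ++j) {
      const double *ptr_a = &a[i * p]; /* start of row i of a */
      const double *ptr_b = &b[j];     /* top of column j of b */
      double sum = 0.0;
      int k = p;
      /* As in the proposed fix: after each chunk, move ptr_a forward by vl
       * elements and ptr_b down by vl rows (vl * m elements). */
      for (size_t vl; k > 0; k -= vl, ptr_a += vl, ptr_b += vl * m) {
        vl = k < MODEL_VL ? (size_t)k : MODEL_VL;
        for (size_t e = 0; e < vl; ++e)
          sum += ptr_a[e] * ptr_b[e * m]; /* unit stride in a, stride m in b */
      }
      c[i * m + j] = sum;
    }
}
```

With `p` larger than the chunk size, dropping the two pointer increments from the loop header makes the second chunk re-read the first elements of the row and column, which is the behavior reported for the documentation example.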