Add triangular matrix functions and ExecutionPolicy overloads #87

amklinv-nnl · 2021-07-22T21:49:55Z

Closes #45

amklinv-nnl · 2021-07-22T21:51:45Z

I added the overwriting triangular_matrix_[left/right]_product functions, but not the updating ones. Submitting a draft request in case @mhoemmen gets to it first.

mhoemmen

Thanks for the PR and for catching the missing functions! :-) I caught just a couple of issues that we should consider fixing before merging.


mhoemmen · 2021-07-22T23:17:53Z

include/experimental/__p1673_bits/blas3_matrix_product.hpp

+
+  if constexpr (std::is_same_v<Triangle, lower_triangle_t>) {
+    for (size_type j = 0; j < C.extent(1); ++j) {
+      for (size_type i = j; i < C.extent(0); ++i) {


"Unlike the symmetric rank-1 or rank-k update functions, these functions (that is, {symmetric,hermitian}_matrix_{left,right}_product) assume that the input matrix A -- not the output matrix -- is symmetric." Thus, they should iterate over all entries of C, but only over the lower or upper triangle of A.

Pretty sure I made the exact same mistake with the symmetric functions, so I should probably fix that too.
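To illustrate the point about iteration ranges, here is a minimal sketch of a symmetric left product that writes every entry of C but reads only the lower triangle of A. The `Matrix` struct and the function name `symmetric_left_product_lower` are hypothetical stand-ins for this illustration; they are not the library's mdspan-based interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense row-major matrix, used only for this illustration.
struct Matrix {
  std::size_t rows, cols;
  std::vector<double> data;
  Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
  double& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
  double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// C = A * B, where A is symmetric and only its lower triangle is read.
// The loops cover all of C; entries of A above the diagonal are never accessed.
void symmetric_left_product_lower(const Matrix& A, const Matrix& B, Matrix& C) {
  for (std::size_t j = 0; j < C.cols; ++j) {
    for (std::size_t i = 0; i < C.rows; ++i) {
      double sum = 0.0;
      for (std::size_t k = 0; k < A.cols; ++k) {
        // A(i,k) is in the lower triangle when i >= k; otherwise mirror to A(k,i).
        const double a_ik = (i >= k) ? A(i, k) : A(k, i);
        sum += a_ik * B(k, j);
      }
      C(i, j) = sum;
    }
  }
}
```

Filling the upper triangle of A with a sentinel value and checking that it never reaches C is one way to test that only the intended triangle is read.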

mhoemmen · 2021-07-22T23:19:27Z

include/experimental/__p1673_bits/blas3_matrix_product.hpp

+         extents<>::size_type numCols_C,
+         class Layout_C,
+         class Accessor_C>
+void symmetric_matrix_product(


This symmetric_matrix_product function isn't in the current version of P1673. It's OK to leave in there, but it doesn't have to be there.

Taking out the trash!

mhoemmen · 2021-07-22T23:25:19Z

include/experimental/__p1673_bits/blas3_triangular_matrix_matrix_solve.hpp

+         class Accessor_X>
+void triangular_matrix_matrix_left_solve(
+  std::experimental::mdspan<ElementType_A, std::experimental::extents<numRows_A, numCols_A>, Layout_A, Accessor_A> A,
+  Triangle t,


Would you consider commenting out t (as in Triangle /* t */) to prevent build warnings for t being unused?

Sure would! How did you trigger those warnings? I didn't see them.

@amklinv-nnl Greetings! Sorry I missed your comment! Adding -Wall to the list of warnings should help.
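As a small sketch of the suggestion above, the two usual ways to silence an unused-parameter diagnostic (for example GCC's and Clang's `-Wunused-parameter`) are to comment out the parameter name or, in C++17, to mark it `[[maybe_unused]]`. The tag types and the function names `is_lower` and `is_upper` below are hypothetical and exist only for this illustration.

```cpp
#include <type_traits>

// Hypothetical tag types standing in for the library's triangle tags.
struct lower_triangle_t {};
struct upper_triangle_t {};

// Option 1: comment out the parameter name. The type still drives template
// deduction, but there is no named variable left to be reported as unused.
template <class Triangle>
constexpr bool is_lower(Triangle /* t */) {
  return std::is_same_v<Triangle, lower_triangle_t>;
}

// Option 2 (C++17): keep the name and mark it [[maybe_unused]].
template <class Triangle>
constexpr bool is_upper([[maybe_unused]] Triangle t) {
  return std::is_same_v<Triangle, upper_triangle_t>;
}
```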


mhoemmen · 2021-08-12T03:03:55Z

include/experimental/__p1673_bits/blas3_matrix_product.hpp


  if constexpr (std::is_same_v<Triangle, lower_triangle_t>) {
    for (size_type j = 0; j < C.extent(1); ++j) {
      for (size_type i = 0; i < C.extent(0); ++i) {
        C(i,j) = ElementType_C{};
-        for (size_type k = 0; k <= i; ++k) {
+        const size_type k_upper = explicitDiagonal ? i : i - size_type(1);


size_type is unsigned, so if i is zero and explicitDiagonal is false, then i - size_type(1) will be a big positive number. I think the fix would be to change the bounds of the for loop over i.
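A minimal sketch of the wraparound described here, together with one possible way around it that stays in unsigned arithmetic: use an exclusive upper bound, so that nothing is ever subtracted from zero. This is only one option and not necessarily the bound change the comment has in mind; the function names are hypothetical.

```cpp
#include <cstddef>
#include <limits>

// Inclusive bound computed in unsigned arithmetic: for i == 0 this wraps
// around to the largest std::size_t value instead of becoming -1.
inline std::size_t wrapped_inclusive_bound(std::size_t i) {
  return i - std::size_t(1);
}

// Exclusive bound: i + 1 terms when the diagonal is stored explicitly,
// i terms otherwise. No subtraction, so no wraparound at i == 0.
inline std::size_t exclusive_bound(std::size_t i, bool explicitDiagonal) {
  return explicitDiagonal ? i + 1 : i;
}

// Counts how many values of k a loop "for k in [0, exclusive_bound)" visits.
inline std::size_t count_terms(std::size_t i, bool explicitDiagonal) {
  std::size_t n = 0;
  for (std::size_t k = 0; k < exclusive_bound(i, explicitDiagonal); ++k) {
    ++n;
  }
  return n;
}
```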

mhoemmen · 2021-08-12T03:04:46Z

include/experimental/__p1673_bits/blas3_matrix_product.hpp

      }
    }
  }
  else { // upper_triangle_t
    for (size_type j = 0; j < C.extent(1); ++j) {
+      const size_type k_upper = explicitDiagonal ? j : j - size_type(1);


Please see note above -- thanks!


mhoemmen · 2021-08-18T20:59:38Z

include/experimental/__p1673_bits/blas3_matrix_product.hpp

@@ -572,7 +572,7 @@ void triangular_matrix_left_product(
      for (size_type i = 0; i < C.extent(0); ++i) {
        C(i,j) = ElementType_C{};
        const ptrdiff_t k_upper = explicitDiagonal ? i : i - size_type(1);


size_type minus size_type is still size_type, so i - size_type(1) will still wrap around to -1 with i=0.

Replaced with ptrdiff_t(1)
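The type rule behind this exchange can be checked at compile time. The sketch below assumes `size_type` is `std::size_t`, and `k_upper_for` is a hypothetical helper, not code from the PR. Note that on common platforms `std::size_t` and `std::ptrdiff_t` have the same conversion rank, so a mixed expression such as `i - ptrdiff_t(1)` with an unsigned `i` is also evaluated in unsigned arithmetic; converting `i` to a signed type first avoids relying on that.

```cpp
#include <cstddef>
#include <type_traits>

using size_type = std::size_t;  // assumption: matches size_type in the diff

// Unsigned minus unsigned stays unsigned, whatever type the result is stored in.
static_assert(std::is_same_v<decltype(size_type(0) - size_type(1)), size_type>);

// Signed minus signed stays signed.
static_assert(std::is_same_v<decltype(std::ptrdiff_t(0) - std::ptrdiff_t(1)),
                             std::ptrdiff_t>);

// Hypothetical helper: convert i first, then subtract in signed arithmetic,
// so the bound for i == 0 without an explicit diagonal is a genuine -1.
inline std::ptrdiff_t k_upper_for(size_type i, bool explicitDiagonal) {
  const auto si = static_cast<std::ptrdiff_t>(i);
  return explicitDiagonal ? si : si - std::ptrdiff_t(1);
}
```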

mhoemmen · 2021-08-18T21:00:53Z

include/experimental/__p1673_bits/blas3_matrix_product.hpp

+  }
+  else { // lower_triangle_t
+    for (size_type j=0; j < C.extent(1); ++j) {
+      for (ptrdiff_t k=C.extent(0)-1; k >= 0; --k) {


This should be correct, though I tend to prefer for (k = N; k > 0; --k) count-down loops, as they are more robust to changes in the index type.
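A short sketch of the count-down idiom mentioned here, written with an unsigned counter to show why it is robust: the counter runs from N down to 1 and the index actually used is k - 1, so the loop condition never depends on a value dropping below zero. The function name is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Visits indices N-1, N-2, ..., 0 using an unsigned counter.
// The condition k > 0 is well defined for unsigned k, unlike k >= 0,
// which is always true for an unsigned type.
inline std::vector<std::size_t> count_down_indices(std::size_t N) {
  std::vector<std::size_t> visited;
  for (std::size_t k = N; k > 0; --k) {
    visited.push_back(k - 1);  // k - 1 is the actual row/column index
  }
  return visited;
}
```

The same loop body works unchanged if the counter type is later switched to a signed type.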

mhoemmen · 2021-08-18T21:31:07Z

include/experimental/__p1673_bits/blas3_matrix_product.hpp

+    for (size_type j=0; j < C.extent(1); ++j) {
+      for (size_type k=0; k < C.extent(0); ++k) {
+        for (size_type i=0; i < k; ++i) {
+          C(i,j) += C(k,j)*A(i,k);


Order of multiplication is actually significant, since we support element types like quaternions with noncommutative multiplication. This means that the "left product" matrix A actually needs to go on the left for each element-times-element multiplication.

mhoemmen · 2021-08-18T21:31:56Z

include/experimental/__p1673_bits/blas3_matrix_product.hpp

+          C(i,j) += C(k,j)*A(i,k);
+        }
+        if constexpr (explicitDiagonal) {
+          C(k,j) *= A(k,k);


Given that multiplication order is significant (see above), would you consider spelling out the multiplication instead of using *=, as a way to make clear the order of the factors?
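To make the ordering concern concrete, here is a minimal sketch with a hand-written quaternion type (a hypothetical `Quat`, not a type from the library). It shows that `c *= a` means `c = c * a`, which differs from the left product `a * c` when multiplication does not commute.

```cpp
// Hypothetical minimal quaternion w + x*i + y*j + z*k, for illustration only.
struct Quat {
  double w, x, y, z;
};

// Hamilton product; not commutative in general.
inline Quat operator*(const Quat& a, const Quat& b) {
  return {a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
          a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
          a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
          a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w};
}

// Compound assignment multiplies on the right: a becomes a * b.
inline Quat& operator*=(Quat& a, const Quat& b) {
  a = a * b;
  return a;
}

inline bool operator==(const Quat& a, const Quat& b) {
  return a.w == b.w && a.x == b.x && a.y == b.y && a.z == b.z;
}

// Spelled-out left multiplication: the factor from A goes on the left.
inline Quat left_multiply(const Quat& a, const Quat& c) {
  return a * c;
}
```

With the unit quaternions i and j, `left_multiply(i, j)` gives k, while `j *= i` leaves -k in j, which is why writing the product out makes the intended order visible.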


mhoemmen · 2021-08-29T22:35:12Z

tests/gemm.cpp

+    /* C = A * B, where A is triangular mxm */
+    using extents_t = extents<dynamic_extent, dynamic_extent>;
+    using matrix_t = mdspan<double, extents_t, layout_left>;
+    double snan = std::numeric_limits<double>::signaling_NaN();


Would you consider making this constexpr, given that it's a kind of named constant?

mhoemmen · 2021-08-29T22:36:12Z

tests/gemm.cpp

+    for (ptrdiff_t j = 0; j < m; ++j) {
+      for (ptrdiff_t i = 0; i < n; ++i) {
+        // FIXME: Choose a more reasonable value for the tolerance
+        double tol = 1e-9;


Would you consider making constants like these constexpr if possible? Thanks!
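A brief sketch of what the request could look like in the test: both values are known at compile time, so they can be declared `constexpr`. The names `snan` and `tol` follow the diff; `within_tol` is a hypothetical helper added only for this illustration.

```cpp
#include <cmath>
#include <limits>

// std::numeric_limits<double>::signaling_NaN() is a constexpr function,
// so the sentinel can be a compile-time constant.
constexpr double snan = std::numeric_limits<double>::signaling_NaN();

// Tolerance as a named compile-time constant instead of a mutable local.
constexpr double tol = 1e-9;

// Hypothetical helper: absolute-difference comparison against tol.
inline bool within_tol(double a, double b) {
  return std::abs(a - b) <= tol;
}
```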


mhoemmen · 2021-08-29T22:58:19Z

include/experimental/__p1673_bits/blas3_matrix_product.hpp

@@ -673,7 +673,7 @@ void triangular_matrix_right_product(
      const ptrdiff_t k_upper = explicitDiagonal ? j : j - size_type(1);


The variable j has type size_type (std::size_t, an unsigned integer), and j - size_type(1) has type size_type (please see this godbolt example), so k_upper will be the biggest possible 64-bit positive integer. The resulting k <= k_upper comparison will always evaluate to true.

The work-around that comes to mind would be to have a special case for explicitDiagonal being true. That way, one could rewrite the loop bound as k < j. Here is a possible reimplementation of lines 676-681:

for (ptrdiff_t k = 0; k < j; ++k) {
  C(i,j) += B(i,k) * A(k,j);
}
if constexpr (explicitDiagonal) {
  // diagonal of A is stored explicitly, so use it
  C(i,j) += B(i,j) * A(j,j);
}
else {
  // implicit unit diagonal
  C(i,j) += B(i,j) /* times 1 */;
}
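The loop structure suggested here can be exercised in a self-contained sketch of an upper-triangular right product, C = B * A. It assumes that `explicitDiagonal == true` means the stored diagonal of A is used and that `false` means an implicit unit diagonal, consistent with the `if constexpr (explicitDiagonal) { C(k,j) *= A(k,k); }` hunk elsewhere in this review. The `Matrix` struct and the function name are hypothetical stand-ins, not the library's interface.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical dense row-major matrix with signed indices, for illustration.
struct Matrix {
  std::ptrdiff_t rows, cols;
  std::vector<double> data;
  Matrix(std::ptrdiff_t r, std::ptrdiff_t c)
      : rows(r), cols(c), data(static_cast<std::size_t>(r * c), 0.0) {}
  double& operator()(std::ptrdiff_t i, std::ptrdiff_t j) {
    return data[static_cast<std::size_t>(i * cols + j)];
  }
  double operator()(std::ptrdiff_t i, std::ptrdiff_t j) const {
    return data[static_cast<std::size_t>(i * cols + j)];
  }
};

// C = B * A with A upper triangular. Only entries A(k,j) with k <= j are read.
template <bool explicitDiagonal>
void upper_triangular_right_product(const Matrix& A, const Matrix& B, Matrix& C) {
  for (std::ptrdiff_t j = 0; j < C.cols; ++j) {
    for (std::ptrdiff_t i = 0; i < C.rows; ++i) {
      double sum = 0.0;
      // Strictly-above-diagonal part: the bound k < j needs no subtraction.
      for (std::ptrdiff_t k = 0; k < j; ++k) {
        sum += B(i, k) * A(k, j);
      }
      // Diagonal contribution handled as a special case.
      if constexpr (explicitDiagonal) {
        sum += B(i, j) * A(j, j);
      } else {
        sum += B(i, j);  // implicit unit diagonal
      }
      C(i, j) = sum;
    }
  }
}
```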

Is changing size_type(1) to ptrdiff_t(1) acceptable? If not, I can change it.

@amklinv-nnl Please see suggested changes below -- thanks!


mhoemmen · 2021-08-30T23:10:49Z

include/experimental/__p1673_bits/blas3_matrix_product.hpp

+
+  if constexpr (std::is_same_v<Triangle, lower_triangle_t>) {
+    for (size_type j = 0; j < C.extent(1); ++j) {
+      for (size_type i = 0; i < C.extent(0); ++i) {


Suggested change

-      for (size_type i = 0; i < C.extent(0); ++i) {
+      for (ptrdiff_t i = 0; i < static_cast<ptrdiff_t>(C.extent(0)); ++i) {

mhoemmen · 2021-08-30T23:12:30Z

include/experimental/__p1673_bits/blas3_matrix_product.hpp

+    }
+  }
+  else { // upper_triangle_t
+    for (size_type j = 0; j < C.extent(1); ++j) {


Suggested change

-    for (size_type j = 0; j < C.extent(1); ++j) {
+    for (ptrdiff_t j = 0; j < static_cast<ptrdiff_t>(C.extent(1)); ++j) {

mhoemmen · 2021-08-30T23:13:40Z

include/experimental/__p1673_bits/blas3_matrix_product.hpp

+  }
+  else { // lower_triangle_t
+    for (size_type j=0; j < C.extent(1); ++j) {
+      for (size_type k=C.extent(0); k > 0; --k) {


Nicely done; thank you! :-)

mhoemmen · 2021-08-31T01:53:25Z

include/experimental/__p1673_bits/blas3_matrix_product.hpp

+      for (size_type i = 0; i <= j; ++i) {
+        C(i,j) = E(i,j);
+        for (size_type k = 0; k < A.extent(1); ++k) {
+          C(i,j) += B(i,k) * A(k,j);


Both the symmetric and Hermitian left and right products should go over the matrix A twice, unlike the triangular left and right products.

I think we should separate that into its own issue.

@amklinv-nnl I might be a bit confused perhaps -- were those functions implemented before? If not, and if this PR provides a partial, known-incorrect implementation for later revision, then would you consider putting assert(false); in the functions known to be incorrect, so that we know to come back to them later? Thanks! :-)


mhoemmen

Thanks for submitting this! This should fix #111, #112, and #113. There is just one small issue regarding conj; please see note.

mhoemmen · 2021-10-18T19:31:06Z

include/experimental/__p1673_bits/blas3_matrix_product.hpp

@@ -40,6 +40,8 @@
 //@HEADER
 */

+#include <cassert>


We probably should put this inside the include-once guard, but no biggie :-)

mhoemmen · 2021-10-18T19:50:54Z

include/experimental/__p1673_bits/blas3_matrix_product.hpp

        C(i,j) = ElementType_C{};
        for (size_type k = 0; k < A.extent(1); ++k) {
-          C(i,j) += A(i,k) * B(k,j);
+          ElementType_A aik = i <= k ? conj(A(k,i)) : A(i,k);


I don't think this will compile for generic ElementType_A, because std::conj always returns std::complex (even if its argument is real). There might be something in here like a "conditional conj" that returns the same type as its argument, but it's not too hard to write one:

template<class T>
T cond_conj(const T& t) {
  return t;
}

template<class R>
std::complex<R> cond_conj(const std::complex<R>& z) {
  using std::conj;
  return conj(z);
}

I've spun this off into a separate issue, #115.
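As a rough check of the "conditional conj" idea from this comment, the sketch below repeats the two `cond_conj` overloads and uses them in a hypothetical generic helper, `conj_times`, which would not compile for real element types if it called `std::conj` directly and assigned the result back to `T`.

```cpp
#include <complex>
#include <type_traits>

// Identity for non-complex types: returns the same type as its argument.
template <class T>
T cond_conj(const T& t) {
  return t;
}

// Complex conjugate for std::complex; more specialized, so it wins overload
// resolution whenever the argument is a std::complex<R>.
template <class R>
std::complex<R> cond_conj(const std::complex<R>& z) {
  using std::conj;
  return conj(z);
}

// Hypothetical generic helper: conj(a) * b with the same element type T.
template <class T>
T conj_times(const T& a, const T& b) {
  return cond_conj(a) * b;
}
```

For `double`, `cond_conj` returns `double`, whereas `std::conj(1.5)` has type `std::complex<double>`, which is the mismatch the comment describes.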

mhoemmen · 2021-10-18T19:51:16Z

include/experimental/__p1673_bits/blas3_matrix_product.hpp

        C(i,j) = ElementType_C{};
        for (size_type k = 0; k < A.extent(1); ++k) {
-          C(i,j) += B(i,k) * A(k,j);
+          ElementType_A akj = j <= k ? A(k,j) : conj(A(j,k));


Please see note above; thanks!

I've spun this off into a separate issue, #115.

mhoemmen · 2021-10-18T20:22:40Z

I've spun off remaining things to fix into separate issues, #115 and #116. Thanks so much for your hard work!

amklinv-nnl added 4 commits July 22, 2021 17:48

triangular_matrix_matrix_solve: add left/right slv

1dace4d

See kokkos#45.

triangular_matrix_solve: add EP overloads

a8cc7cf

See kokkos#45

matrix_product: Added left/right functions

fff3e7d

See kokkos#45.

matrix_product: add triangular functions

8b40224

See kokkos#45.

mhoemmen self-requested a review July 22, 2021 23:07

mhoemmen requested changes Jul 22, 2021


amklinv-nnl added 4 commits August 11, 2021 16:59

matrix_product: fix symmetric product

124a893

C = A * B (or C = B * A) was not treating A as a symmetric matrix. This does not have tests yet. Untested code is broken code.

matrix_product: removed symmetric_matrix_product

a75d759

Replaced by symmetric_matrix[right/left]_product

matrix_product: remove hermitian product functions

d34c108

Replaced by left/right equivalents

matrix_product: comment unused parameters

7e0ea8b

mhoemmen reviewed Aug 12, 2021


amklinv-nnl added 2 commits August 12, 2021 14:13

matrix_product: fix bounds of triangular functions

8fe3c91

The bound k_upper may be negative for certain values, meaning it cannot be an unsigned type.

matrix_product: Add in-place triangular multiply

a80a6bf

Comes with bonus tests for left-multiply. Right should be assumed to not work (especially since it's not even implemented).

mhoemmen reviewed Aug 18, 2021


amklinv-nnl added 3 commits August 27, 2021 14:02

matrix_product: implemented tr right product

b9f0f0a

Added more trmm tests

3da461c

matrix_product: fixed bounds issues

9920162

const ptrdiff_t k_upper = explicitDiagonal ? i : i - size_type(1); is incorrect because size_type - size_type is still size_type. Bad things would happen if i=0. ptrdiff_t(1) is now used instead.

mhoemmen requested changes Aug 29, 2021


amklinv-nnl added 2 commits August 30, 2021 16:25

matrix_product: replaced unsigned loop indices

679055d

matrix_product: fixed tri mult order

3fa5e38

Some triangular multiplications were written with C = B(...)*A(...) when they should have been C = A(...)*B(...) and vice-versa. These have been fixed to allow for strange datatypes.

amklinv-nnl force-pushed the triangular_matrix branch from b647b1b to 3fa5e38 Compare August 30, 2021 20:39

trmm tests: used constexpr where appropriate

08cd0af

amklinv-nnl marked this pull request as ready for review August 30, 2021 21:19

mhoemmen requested changes Aug 31, 2021


amklinv-nnl added 2 commits August 31, 2021 15:33

Move trmm tests to separate file

46df112

matrix_product: added symmetric tests

fdc78f3

amklinv-nnl mentioned this pull request Oct 18, 2021

Spec has {hermitian, symmetric}_matrix_{left,right}_product but in the code name is different #111

Closed

amklinv-nnl added 2 commits October 18, 2021 15:22

matrix_product: added hermitian tests

a404087

matrix_product: fixed hermitian multiply

92b3df9

Also replaced untested functions with assert(false). Untested code is broken code.

mhoemmen requested changes Oct 18, 2021


This was referenced Oct 18, 2021

Don't use std::conj if its argument could be real #115

Closed

Implement functions whose body is currently assert(false) #116

Open

mhoemmen merged commit a9fe21a into kokkos:main Oct 18, 2021

This was referenced Oct 18, 2021

BLAS 3: triangular_matrix_{left,right}_product is missing #112

Closed

Spec has triangular_matrix_matrix_{left,right}_solve but in the code name is different #113

Closed
