This example showcases computing the inversion getrf
operation, which yields the lower triangular matrix getri
operation.
- Parse command line arguments for the dimension of the input matrix.
- Declare and initialize a number of constants for the input and output matrix.
- Allocate and initialize the to-be-inverted input matrix.
- Allocate device memory and copy input data from host to device.
- Create a rocBLAS library handle.
- Invoke the rocSOLVER
getrf
operation with double precision. - Copy the
getrf
info output back to the host, and check whether the operation was successful. - Invoke the rocSOLVER
getri
operation with double precision. - Copy the
getri
info output back to the host, and check again whether the operation was a success. - Validate the solution by checking if
$A\cdot A^{-1} = I$ holds. This is done by computing$1\cdot A \cdot A^{-1} + -1 \cdot I$ using thegemm
operation from rocBLAS, and comparing the elements of the result to 0. - Free device memory, release rocBLAS handle.
-
rocsolver_[sdcz]getrf
computes the LU-factorization of an$m\times n$ matrix$A$ , and optionally also provides a permutation matrix$P$ when partial pivoting is used. Depending on the character matched in[sdcz]
, the factorization can be computed with different precision:-
s
(single-precision:float
) -
d
(double-precision:double
) -
c
(single-precision complex:rocblas_float_complex
) -
z
(double-precision complex:rocblas_double_complex
).
Double precision is used in the example. In this case, the function accepts the following parameters: -
rocblas_handle handle
is a handle to the rocBLAS library, created usingrocblas_create_handle
.-
rocblas_int m
is the number of rows in$A$ . -
rocblas_int n
is the number of columns in$A$ . -
double* A
is a device-pointer to the memory of matrix$A$ . It should hold at least$n\times lda$ elements. Thegetrf
operation is performed in-place, and the resulting$L$ and$U$ matrices are stored in the memory of$A$ . Diagonal elements of$L$ are not stored. -
rocblas_int lda
is the leading dimension of matrix$A$ , which is the stride between the first element of the columns of the matrix. Note that the matrix is laid out in column-major ordering. -
rocblas_int* ipiv
is a device-pointer to where the permutation matrix used for partial pivoting is written to. Note that the permutation matrix can be represented using a single array, and so this parameter requires onlymin(n, m)
ints of memory. Ifipiv[i] = j
, then rowj
was interchanged with rowi
. Note that row indices are 1-indexed. Partial pivoting enables numerical stability for this algorithm. If this is undesired, the fucntionrocsolver_[sdcz]getrf_npvt
can be used to omit partial pivoting. -
rocblas_int* info
is a device-pointer to a single integer that describes the result of the operation. If*info
is0
, the operation was successful. Otherwise*info
holds the first non-zero pivot (1-indexed), and means that$A$ was not invertible.
The function returns a
rocblas_status
value, which indicates whether any errors have occurred during the operation. -
-
rocblas_[sdcz]getri
inverts a batch of$n\times n$ matrix using the LU-factorization previously obtained usinggetrf
. As with the previous operations, different operations forgetri
are available, depending on the character matched in[sdcz]
. -
rocblas_[sdcz]getri
inverts an$n\times n$ matrix using the LU-factorization previously obtained usinggetrf
. As withgetrf
,getri
can be computed with a different precision based on the character matched in[sdcz]
. In the case for double precision, the function accepts the following parameters:-
rocblas_handle handle
is a handle to the rocBLAS library, created usingrocblas_create_handle
. -
rocblas_int n
is the number of rows and columns of the matrix. -
double* A
is a device-pointer to an LU-factorized matrix$A$ . The matrix should have at least$n\times lda$ elements. On successful exit, the values in this matrix are overwritten with the inversion$A^{-1}$ of the original matrix$A$ . -
rocblas_int lda
is the leading dimension of$A$ , which is the stride between the first element of the columns of the matrix. Note that the matrices are laid out in column-major ordering. -
rocblas_int* ipiv
is a device-pointer to the permutation matrix of the LU-factorization of$A$ . If no permutation matrix is available, therocsolver_[sdcz]_getri_npvt
function can be used instead. -
rocblas_int* info
is a device-pointer to a single integer that describes the result of the inversion operation. If*info
is non-zero, then the inversion failed, and the value indicates the first zero pivot in$A$ . If*info
is zero, then the operation was successful andA
holds the inverted matrix$A^{-1}$ of the original matrix$A$ .
The function returns a
rocblas_status
value, which indicates whether any errors have occurred during the operation. -
-
rocblas_[sdcz]gemm
performs a general matrix multiplication in the form of$C = \alpha\cdot op_a(A)\cdot op_b(B) + \beta\cdot C$ . This function is showcased in the rocBLAS level 3 GEMM example.
rocsolver_dgetrf
rocsolver_dgetri
rocblas_create_handle
rocblas_destroy_handle
rocblas_dgemm
rocblas_int
rocblas_handle
rocblas_set_pointer_mode
rocblas_pointer_mode_host
rocblas_operation_none
hipFree
hipMalloc
hipMemcpy
hipMemcpyDeviceToDevice
hipMemcpyDeviceToHost
hipMemcpyHostToDevice