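This example illustrates how to solve the standard symmetric-definite eigenvalue problem for a strided batch $A$ of $m$ symmetric matrices $A_i$ using rocSOLVER.
Generally, in an eigenvalue problem, we are looking for a scalar $\lambda$ and a nonzero vector $x$ that satisfy the $Ax = \lambda x$ equation.

The solver evaluates the following equation for a strided batch of $m$ matrices:

$$A_i x_{i,k} = \lambda_{i,k} x_{i,k}$$

for each matrix $A_i$ ($i = 0, \dots, m-1$) and each eigenpair index $k = 0, \dots, n-1$.

The set of orthonormalized eigenvectors of $A_i$ can be collected as the columns of a matrix:

$$X_i = [x_{i,0}, x_{i,1}, \dots, x_{i,n-1}]$$

and the eigenvalues as a diagonal matrix:

$$\Lambda_i = \mathrm{diag}(\lambda_{i,0}, \lambda_{i,1}, \dots, \lambda_{i,n-1})$$

The solver gives back an array of eigenvector matrices $X_i$ and the corresponding eigenvalues $\lambda_i = (\lambda_{i,0}, \dots, \lambda_{i,n-1})$.

The results are verified, in the example, by filling in the equation we wanted to solve for each matrix of the strided batch:

$$A_i X_i = X_i \Lambda_i$$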
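The verification step can be sketched on the host as follows. This is a minimal illustration in plain C++, not the code of the example itself: the function name `max_eigen_residual` is hypothetical, and it assumes column-major storage, full (not only triangular) copies of the original matrices, and eigenvectors stored with the same leading dimension and stride as the input.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the largest absolute entry of A_i * X_i - X_i * Lambda_i over the batch.
// A: original symmetric matrices (full storage, column-major, leading dimension lda).
// X: eigenvectors, one per column, stored with the same lda and stride as A.
// D: eigenvalues, n values per matrix, consecutive matrices stride_d apart.
// (Hypothetical helper for illustration; not part of rocSOLVER.)
double max_eigen_residual(const std::vector<double>& A,
                          const std::vector<double>& X,
                          const std::vector<double>& D,
                          int n, int lda,
                          std::size_t stride_a, std::size_t stride_d,
                          int batch_count)
{
    assert(lda >= n);
    double max_err = 0.0;
    for(int b = 0; b < batch_count; ++b)
    {
        const double* a = A.data() + b * stride_a;
        const double* x = X.data() + b * stride_a;
        const double* d = D.data() + b * stride_d;
        for(int k = 0; k < n; ++k) // k-th eigenpair of matrix b
        {
            for(int r = 0; r < n; ++r) // r-th row of A x - lambda x
            {
                double ax = 0.0;
                for(int c = 0; c < n; ++c)
                {
                    ax += a[r + c * lda] * x[c + k * lda];
                }
                max_err = std::max(max_err, std::fabs(ax - d[k] * x[r + k * lda]));
            }
        }
    }
    return max_err;
}
```

A caller would compare the returned value against a small tolerance chosen for the problem size; the appropriate tolerance is a judgment call and is not prescribed here.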
The application provides the following optional command line arguments:

- `-n <n>`: the size of the $n \times n$ matrix $A$. The default value is `3`.
- `-c <c>`: the size of the batch. The default value is `3`.
- `-p <p>`: the size of the padding. This value is used to calculate the strides for the input matrix, the eigenvalues, and the tridiagonal matrix.
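One plausible way to derive the strides from `n` and the padding is sketched below. The formulas are an assumption made for illustration (each batch entry is followed by `padding` unused elements, and the leading dimension equals `n`); the names `batch_layout`, `make_layout`, and `buffer_elements` are hypothetical and the example's source may compute the strides differently.

```cpp
#include <cstdint>

// Strides (in elements) for one strided batch, assuming lda == n and
// `padding` unused elements after every matrix or vector.
// (Hypothetical helper types for illustration only.)
struct batch_layout
{
    std::int64_t lda;      // leading dimension of each matrix A_i
    std::int64_t stride_a; // distance between A_i and A_{i+1}
    std::int64_t stride_d; // distance between D_i and D_{i+1}
    std::int64_t stride_e; // distance between E_i and E_{i+1}
};

constexpr batch_layout make_layout(std::int64_t n, std::int64_t padding)
{
    // Assumption: n elements are reserved per E_i as well as per D_i.
    return batch_layout{n, n * n + padding, n + padding, n + padding};
}

// Number of elements to allocate for a buffer holding the whole batch.
constexpr std::int64_t buffer_elements(std::int64_t stride, std::int64_t batch_count)
{
    return stride * batch_count;
}
```

With `n = 3` and a padding of `1`, this sketch places consecutive matrices 10 elements apart and consecutive eigenvalue vectors 4 elements apart.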
- Parse command line arguments for dimensions of the input matrix.
- Declare the host side inputs and outputs.
- Initialize a random symmetric $n \times n$ input matrix.
- Set the solver parameters.
- Allocate device memory and copy input matrix from host to device.
- Initialize rocBLAS.
- Allocate the required working space on device.
- Compute the eigenvectors and eigenvalues.
- Retrieve the results by copying from device to host.
- Print the results.
- Validate the results.
- Free the memory allocations on device.
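The host-side initialization step could look roughly like the sketch below, which fills a strided batch with random symmetric matrices before it is copied to the device. The function name `make_symmetric_batch`, the value range, and the choice of random number generator are assumptions for illustration; only the column-major, constant-stride layout follows from the text above.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Builds batch_count random symmetric n x n matrices in column-major order.
// Consecutive matrices start stride_a elements apart; padding stays zero.
// (Hypothetical helper for illustration only.)
std::vector<double> make_symmetric_batch(int n, int lda, std::size_t stride_a,
                                         int batch_count, unsigned seed)
{
    std::vector<double> A(stride_a * batch_count, 0.0);
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> dist(-1.0, 1.0);
    for(int b = 0; b < batch_count; ++b)
    {
        double* a = A.data() + b * stride_a;
        for(int c = 0; c < n; ++c)
        {
            for(int r = c; r < n; ++r)
            {
                // Write each value to (r, c) and mirror it to (c, r).
                const double v = dist(gen);
                a[r + c * lda] = v;
                a[c + r * lda] = v;
            }
        }
    }
    return A;
}
```

Because both triangles are filled, the same buffer can serve as the reference copy of $A_i$ when the results are validated.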
- The performance of numerical multi-linear algebra code can be improved considerably by using tensor contractions [Y. Shi et al., HiPC, pp 193, 2016]. For this reason, like other linear algebra libraries such as hipBLAS, rocSOLVER provides `_batched` and `_strided_batched` [C. Jhurani and P. Mullowney, JPDP Vol 75, pp 133, 2015] extensions.

  The same operation can be applied to several matrices at once by combining them into a batch. Batched computation improves performance when there is a large number of small matrices. If the matrices are stored with a constant stride between them, the strided batched solvers offer further acceleration.
- `rocsolver_[sd]syev_strided_batched(...)` computes the eigenvalues and optionally the eigenvectors of a (strided) batch of matrices.

  There are 2 different function signatures depending on the type of the input matrix:

  - `s`: single-precision real (`float`)
  - `d`: double-precision real (`double`)

  For single- and double-precision complex values, the function `rocsolver_[cz]heev_strided_batched(...)` is available in rocSOLVER.

  In this example a double-precision real input matrix is used, in which case the function accepts the following parameters:

  - `rocblas_handle handle`
  - `rocblas_evect evect`: Specifies whether the eigenvectors should also be calculated besides the eigenvalues. The following values are accepted:
    - `rocblas_evect_original`: Calculate both the eigenvalues and the eigenvectors.
    - `rocblas_evect_none`: Calculate the eigenvalues only.
  - `rocblas_fill uplo`: Specifies whether the upper or lower triangle of the symmetric matrix is stored. The following values are accepted:
    - `rocblas_fill_lower`: The provided `*A` pointer points to the lower triangle matrix data.
    - `rocblas_fill_upper`: The provided `*A` pointer points to the upper triangle matrix data.
  - `rocblas_int n`: Number of rows and columns of $A$.
  - `double* A`: Pointer to the first matrix $A$ in device memory. After execution it contains the eigenvectors, if they were requested and the algorithm converged.
  - `rocblas_int lda`: Leading dimension of matrix $A$ (same for all matrices in the batch).
  - `rocblas_stride strideA`: Stride from the start of one matrix $A_i$ to the next one $A_{i+1}$.
  - `double* D`: Pointer to the array of eigenvalues $\lambda_i$. It is initially used to store the main diagonals of the internal tridiagonal matrices $T_i$ associated with the $A_i$. Eventually these diagonals converge to the resulting eigenvalues.
  - `rocblas_stride strideD`: Stride from the start of one vector $D_i$ to the next one $D_{i+1}$.
  - `double* E`: This array is used to work internally with the tridiagonal matrices $T_i$ associated with the $A_i$. It stores the super/subdiagonals of these tridiagonal matrices (they are symmetric, so only one of the diagonals is needed).
  - `rocblas_stride strideE`: Stride from the start of one vector $E_i$ to the next one $E_{i+1}$.
  - `rocblas_int* info`: Array of $m$ integers on the GPU. If `info[i] = 0`, successful exit for matrix $A_i$. If `info[i] > 0`, the algorithm did not converge for matrix $A_i$.
  - `rocblas_int batch_count`: Number of matrices in the batch.
- rocBLAS is initialized by calling `rocblas_create_handle(rocblas_handle* t)` and it is terminated by calling `rocblas_destroy_handle(t)`.
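To make the strided batch layout and the meaning of the `info` array concrete, the following host-side sketch shows how the $i$-th entry of a strided batch is addressed and how non-converged matrices can be identified once `info` has been copied back from the device. Both helper names are hypothetical, and `int` stands in for `rocblas_int` under the assumption that it is a 32-bit integer in the build at hand.

```cpp
#include <cstddef>
#include <vector>

// Returns a pointer to the i-th matrix (or vector) of a strided batch:
// every entry starts a fixed number of elements after the previous one.
// (Hypothetical helper for illustration only.)
inline const double* batch_entry(const double* base, std::size_t stride, std::size_t i)
{
    return base + i * stride;
}

// Returns the indices of all batch entries whose info value is nonzero,
// i.e. the matrices for which the solver reported a problem.
// (Hypothetical helper; expects info already copied to host memory.)
inline std::vector<std::size_t> unconverged_matrices(const std::vector<int>& info)
{
    std::vector<std::size_t> failed;
    for(std::size_t i = 0; i < info.size(); ++i)
    {
        if(info[i] != 0)
        {
            failed.push_back(i);
        }
    }
    return failed;
}
```

The same addressing rule applies to `A`, `D`, and `E`, each with its own stride, which is why no array of per-matrix pointers is needed for the strided batched variant.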
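
The example uses the following rocSOLVER, rocBLAS, and HIP runtime API surface:

- `rocblas_evect`
- `rocblas_evect_original`
- `rocsolver_dsyev_strided_batched`
- `rocblas_create_handle`
- `rocblas_destroy_handle`
- `rocblas_double`
- `rocblas_fill`
- `rocblas_fill_lower`
- `rocblas_handle`
- `rocblas_int`
- `hipFree`
- `hipMalloc`
- `hipMemcpy`
- `hipMemcpyHostToDevice`
- `hipMemcpyDeviceToHost`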