Add Hermitian matrix tests #1

Open. Wants to merge 29 commits into base: main

Commits (29)
31bc048 Added switch for more number types (niemilau, Feb 7, 2025)
fd510f0 Added template stuff for running the test with complex numbers, curre… (niemilau, Feb 10, 2025)
4ffd8fd renamed Real_t => real_t (niemilau, Feb 10, 2025)
585cf4b Fixed memory error when copying eigenvectors in the hermitian solver (niemilau, Feb 10, 2025)
02f8175 Fixed incorrect math in comments... (niemilau, Feb 10, 2025)
56e7197 Use rocblas_complex_num template directly instead of defining a custo… (niemilau, Feb 10, 2025)
cba77d9 WIP complex matrices with magma (niemilau, Feb 10, 2025)
bcb685d Made complex MAGMA work with HIP and refactored to reduce template mess (niemilau, Feb 11, 2025)
9899000 prettier printing (niemilau, Feb 11, 2025)
2d11176 Made CUDA+MAGMA work with complex numbers (Feb 11, 2025)
eae0a43 Refactoring, removing duplicate code (niemilau, Feb 11, 2025)
173e3db Fixed invalid use of MAGMA lrwork in the real solver (niemilau, Feb 12, 2025)
8929f1d minor type alias cleanup (niemilau, Feb 12, 2025)
10f6b9f Removed dependency on C++20 features, C++17 is now sufficient (Feb 12, 2025)
5f418fe Updated README and minor code cleanup (niemilau, Feb 12, 2025)
6c25e1c remove old header (niemilau, Feb 12, 2025)
70442c6 Added phase normalization for printed eigenvectors for easier compari… (niemilau, Feb 13, 2025)
f5512e3 updated README (niemilau, Feb 13, 2025)
753b11f Print summary at end (niemilau, Feb 13, 2025)
25c51ba Fixed eigenvector rotation - whoops (niemilau, Feb 13, 2025)
d5ad1cb added output of cuda11.5.0 test with complex doubles (Feb 13, 2025)
cd135b1 added cuda12.6.1 test outputs (Feb 13, 2025)
776244c added flag for skipping the 'with handle creation' tests (Feb 13, 2025)
0288cb8 added output of rocm6.2.2, 6.3.2 + magma2.9.0 (niemilau, Feb 14, 2025)
e5c7810 added script for collecting just the timing results from output (Feb 14, 2025)
9747bc7 Added gnuplot script for plotting results (Feb 14, 2025)
a2d0836 Fixed eigenvector normalization convention (for printing). Now it's c… (niemilau, Feb 19, 2025)
99c3a92 Fixed bug in host -> device eigenvalue copy (niemilau, Feb 21, 2025)
a9e2efc Added plot of performance test results (niemilau, Feb 27, 2025)
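
Several of the commits above touch how eigenvectors are printed (phase normalization, rotation, normalization convention). Such a step is useful because an eigenvector of a Hermitian matrix is only defined up to a unit-modulus complex factor, so two solvers can return vectors that look different but are equally valid. Below is a minimal C++17 sketch of one possible convention. The helper name `normalize_phase` and the choice of rotating the largest-magnitude component onto the positive real axis are assumptions made for illustration; they are not taken from `eigh.cpp` and may differ from the convention this PR settles on.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from eigh.cpp): rotates an eigenvector by a global
// phase so that its largest-magnitude component becomes real and non-negative.
// The choice of reference component is an assumption made for this sketch.
template <typename T>
void normalize_phase(std::vector<std::complex<T>>& v)
{
    // Find the component with the largest magnitude.
    std::size_t ref = 0;
    T best = T(0);
    for (std::size_t i = 0; i < v.size(); ++i) {
        const T mag = std::abs(v[i]);
        if (mag > best) {
            best = mag;
            ref = i;
        }
    }
    if (best == T(0)) {
        return; // zero vector: there is no phase to fix
    }

    // conj(v[ref]) / |v[ref]| has unit modulus, so multiplying by it leaves
    // the norm unchanged and the result is still an eigenvector.
    const std::complex<T> rot = std::conj(v[ref]) / best;
    for (auto& x : v) {
        x *= rot;
    }
}
```

Applied to the vector `(i, -2i)`, the reference component is `-2i`, the rotation factor is `i`, and the result is `(-1, 2)`. Applying the same rule to the output of every backend should make printed eigenvectors directly comparable, as long as the largest component is not numerically degenerate with another one.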
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,2 +1,4 @@
 *.x
 magma/
+.vscode/*
+*.code-workspace
31 changes: 22 additions & 9 deletions README.md
@@ -1,5 +1,18 @@
 # Notes on testing MAGMA

+This is a test program for comparing the performance of common GPU eigensolvers.
+- Flag `-DCUDA`: uses `cusolver`
+- Flag `-DHIP`: uses `rocsolver`
+- Flag `-DMAGMA`: uses the `MAGMA` library. Also needs either `-DCUDA` or `-DHIP`.
+
+Works with symmetric and Hermitian matrices.
+
+Needs compiler with C++17 support.
+
+## Summary of test results (LUMI/Mahti)
+
+![eigsolvers_cdouble](https://github.com/user-attachments/assets/a272d929-292e-4550-bbf5-abec422b6925)
+
 ## LUMI / MI250x (1 GCD)

 ### Installing MAGMA
@@ -23,10 +36,10 @@ ml partition/G
 ml rocm/6.0.3
 ml magma/2.8.0-cpeGNU-24.03-rocm

-hipcc -std=c++14 --offload-arch=gfx90a -O3 -DHIP -lrocblas -lrocsolver eigh.cpp -o rocm6.0.3.x
+hipcc -std=c++17 --offload-arch=gfx90a -O3 -DHIP -lrocblas -lrocsolver eigh.cpp -o rocm6.0.3.x -Wno-unused-result
 sbatch --partition=dev-g --nodes=1 --ntasks-per-node=1 --cpus-per-task=1 --gpus-per-node=1 --time=01:00:00 -o 'rocm6.0.3.out' --wrap='./rocm6.0.3.x 3,100,200,400,800,1600'

-hipcc -std=c++14 --offload-arch=gfx90a -O3 -DMAGMA -DHIP -lmagma eigh.cpp -o magma2.8.0_rocm6.0.3.x
+hipcc -std=c++17 --offload-arch=gfx90a -O3 -DMAGMA -DHIP -lmagma eigh.cpp -o magma2.8.0_rocm6.0.3.x -Wno-unused-result
 sbatch --partition=dev-g --nodes=1 --ntasks-per-node=1 --cpus-per-task=1 --gpus-per-node=1 --time=01:00:00 -o 'magma2.8.0_rocm6.0.3.out' --wrap='./magma2.8.0_rocm6.0.3.x 3,100,200,400,800,1600,3200,6400,12800'
 ```

@@ -37,7 +50,7 @@ ml LUMI/24.03
 ml partition/G
 ml rocm/6.2.2

-hipcc -std=c++14 --offload-arch=gfx90a -O3 -DHIP -lrocblas -lrocsolver eigh.cpp -o rocm6.2.2.x
+hipcc -std=c++17 --offload-arch=gfx90a -O3 -DHIP -lrocblas -lrocsolver eigh.cpp -o rocm6.2.2.x -Wno-unused-result
 sbatch --partition=dev-g --nodes=1 --ntasks-per-node=1 --cpus-per-task=1 --gpus-per-node=1 --time=01:00:00 -o 'rocm6.2.2.out' --wrap='./rocm6.2.2.x 3,100,200,400,800,1600,3200'
 ```

@@ -48,10 +61,10 @@ Container source [here](https://github.com/trossi/containers/tree/main/examples/
 ```bash
 export SINGULARITY_BIND="/pfs,/scratch,/projappl,/project,/flash,/appl"

-singularity exec rocm_magma.sif hipcc -std=c++14 --offload-arch=gfx90a -O3 -DHIP -lrocblas -lrocsolver eigh.cpp -o rocm6.3.2.x
+singularity exec rocm_magma.sif hipcc -std=c++17 --offload-arch=gfx90a -O3 -DHIP -lrocblas -lrocsolver eigh.cpp -o rocm6.3.2.x -Wno-unused-result
 sbatch --partition=dev-g --nodes=1 --ntasks-per-node=1 --cpus-per-task=1 --gpus-per-node=1 --time=01:00:00 -o 'rocm6.3.2.out' --wrap='singularity exec rocm_magma.sif ./rocm6.3.2.x 3,100,200,400,800,1600,3200,6400'

-singularity exec rocm_magma.sif hipcc -std=c++14 --offload-arch=gfx90a -O3 -DMAGMA -DHIP -lmagma eigh.cpp -o magma2.9.0_rocm6.3.2.x
+singularity exec rocm_magma.sif hipcc -std=c++17 --offload-arch=gfx90a -O3 -DMAGMA -DHIP -lmagma eigh.cpp -o magma2.9.0_rocm6.3.2.x -Wno-unused-result
 sbatch --partition=dev-g --nodes=1 --ntasks-per-node=1 --cpus-per-task=1 --gpus-per-node=1 --time=01:00:00 -o 'magma2.9.0_rocm6.3.2.out' --wrap='singularity exec rocm_magma.sif ./magma2.9.0_rocm6.3.2.x 3,100,200,400,800,1600,3200,6400,12800'
 ```

@@ -78,10 +91,10 @@ make -j128 lib/libmagma.so GPU_TARGET=Ampere OPENBLASDIR=$OPENBLAS_INSTALL_ROOT
 ```bash
 ml cuda/11.5.0

-nvcc -std=c++14 -arch=sm_80 -O3 -DCUDA -lcusolver eigh.cpp -o cuda11.5.0.x
+nvcc -std=c++17 -arch=sm_80 -O3 -DCUDA -lcusolver eigh.cpp -o cuda11.5.0.x
 sbatch -p gputest --nodes=1 --ntasks-per-node=1 --gres=gpu:a100:1 -t 0:15:00 -o cuda11.5.0.out --wrap='./cuda11.5.0.x 3,100,200,400,800,1600,3200,6400,12800'

-nvcc -std=c++14 -arch=sm_80 -O3 -DMAGMA -DCUDA -lmagma -I$PWD/magma/include -L$PWD/magma/lib -Xcompiler \"-Wl,-rpath,$PWD/magma/lib\" eigh.cpp -o magma2.8.0_cuda11.5.0.x
+nvcc -std=c++17 -arch=sm_80 -O3 -DMAGMA -DCUDA -lmagma -I$PWD/magma/include -L$PWD/magma/lib -Xcompiler \"-Wl,-rpath,$PWD/magma/lib\" eigh.cpp -o magma2.8.0_cuda11.5.0.x
 sbatch -p gputest --nodes=1 --ntasks-per-node=1 --gres=gpu:a100:1 -t 0:15:00 -o magma2.8.0_cuda11.5.0.out --wrap='./magma2.8.0_cuda11.5.0.x 3,100,200,400,800,1600,3200,6400,12800'
 ```

@@ -92,9 +105,9 @@ Container source [here](https://github.com/trossi/containers/tree/main/examples/
 ```bash
 export SINGULARITY_BIND="/scratch,/projappl,/appl"

-singularity exec -B /local_scratch cuda_magma.sif nvcc -std=c++14 -arch=sm_80 -O3 -DCUDA -lcusolver eigh.cpp -o cuda12.6.1.x
+singularity exec -B /local_scratch cuda_magma.sif nvcc -std=c++17 -arch=sm_80 -O3 -DCUDA -lcusolver eigh.cpp -o cuda12.6.1.x
 sbatch -p gputest --nodes=1 --ntasks-per-node=1 --gres=gpu:a100:1 -t 0:15:00 -o cuda12.6.1.out --wrap='singularity exec --nv cuda_magma.sif ./cuda12.6.1.x 3,100,200,400,800,1600,3200,6400,12800'

-singularity exec -B /local_scratch cuda_magma.sif nvcc -std=c++14 -arch=sm_80 -O3 -DMAGMA -DCUDA -lmagma eigh.cpp -o magma2.9.0_cuda12.6.1.x
+singularity exec -B /local_scratch cuda_magma.sif nvcc -std=c++17 -arch=sm_80 -O3 -DMAGMA -DCUDA -lmagma eigh.cpp -o magma2.9.0_cuda12.6.1.x
 sbatch -p gputest --nodes=1 --ntasks-per-node=1 --gres=gpu:a100:1 -t 0:15:00 -o magma2.9.0_cuda12.6.1.out --wrap='singularity exec --nv cuda_magma.sif ./magma2.9.0_cuda12.6.1.x 3,100,200,400,800,1600,3200,6400,12800'
 ```
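
The build commands in the README diff choose a solver backend purely through the `-DCUDA`, `-DHIP`, and `-DMAGMA` preprocessor flags. The sketch below shows how such compile-time selection could look in C++. Only the three macro names and the rule that `-DMAGMA` needs one of the other two come from the README; the function `backend_name` and the returned strings are hypothetical and are not taken from `eigh.cpp`.

```cpp
#include <string>

// README rule: -DMAGMA must be combined with either -DCUDA or -DHIP.
#if defined(MAGMA) && !defined(CUDA) && !defined(HIP)
#error "-DMAGMA also needs either -DCUDA or -DHIP"
#endif

// Hypothetical helper: reports which backend the translation unit was
// compiled for, based only on the preprocessor flags.
inline std::string backend_name()
{
#if defined(MAGMA) && defined(CUDA)
    return "MAGMA (CUDA)";
#elif defined(MAGMA) && defined(HIP)
    return "MAGMA (HIP)";
#elif defined(CUDA)
    return "cusolver";
#elif defined(HIP)
    return "rocsolver";
#else
    return "none"; // no backend flag given; the real program may handle this differently
#endif
}
```

Compiled without any of the three flags, `backend_name()` returns `"none"`; with `-DMAGMA` alone, compilation stops at the `#error`. Resolving the backend at compile time is one way to let a single source file build against either vendor toolchain without linking the other's libraries.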