update docs for CUDA extension

qutip · Oct 31, 2023 · 7f50a9f
1 parent 5bf2a1c
commit 7f50a9f
Showing 5 changed files with 119 additions and 1 deletion.
diff --git a/docs/Project.toml b/docs/Project.toml
@@ -1,5 +1,6 @@
 [deps]
 BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf"
+CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"
 Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
 HierarchicalEOM = "a62dbcb7-80f5-4d31-9a88-8b19fd92b128"
 LaTeXStrings = "b964fa9f-0449-5b57-a5c2-d3ea65f4040f"
@@ -12,6 +13,7 @@ QuantumOptics = "6e0679c1-51ea-5a7c-ac74-d61b76210b0c"
 
 [compat]
 BenchmarkTools = "1.3"
+CUDA = "5"
 Documenter = "0.27, 1"
 HierarchicalEOM = "1"
 LaTeXStrings = "1"

diff --git a/docs/make.jl b/docs/make.jl
@@ -77,7 +77,8 @@ const PAGES = Any[
         "Examples" => EX_output_files,
         "Benchmark Solvers" => BM_output_files,
         "Extensions" => Any[
-            "QuantumOptics.jl" => "extensions/QuantumOptics.md"
+            "QuantumOptics.jl" => "extensions/QuantumOptics.md",
+            "CUDA.jl" => "extensions/CUDA.md"
         ]
     ],
     "Library API" => "libraryAPI.md"

diff --git a/docs/src/extensions/CUDA.md b/docs/src/extensions/CUDA.md
@@ -0,0 +1,108 @@
+# [Extension for CUDA.jl](@id doc-ext-CUDA)
+
+This extension adds GPU ([`CUDA.jl`](https://github.com/JuliaGPU/CUDA.jl)) acceleration for solving the [time evolution](@ref doc-Time-Evolution) and [spectrum](@ref doc-Spectrum). It reduces execution time and memory usage, especially when the HEOMLS matrix is very large.
+
+!!! compat "Compat"
+    The described feature requires `Julia 1.9+`.
+
+The functions [`evolution`](@ref doc-Time-Evolution) (ODE method with a time-independent system Hamiltonian only) and [`spectrum`](@ref doc-Spectrum) automatically choose whether to solve on the CPU or the GPU, depending on the type of the sparse matrix in `M::AbstractHEOMLSMatrix` objects (i.e., the type of the field `M.data`):
+```julia
+typeof(M.data) <:   SparseMatrixCSC # solve on CPU
+typeof(M.data) <: CuSparseMatrixCSC # solve on GPU
+```
+
+Therefore, we wrap several functions from `CUDA` and `CUDA.CUSPARSE` so that they return a new HEOMLS-matrix-type object whose `M.data` is of type `CuSparseMatrix`, and also convert the element and index types to `ComplexF32` and `Int32` (GPUs perform better with these types). The wrapped functions are:
+- `cu(M::AbstractHEOMLSMatrix)` : Convert `M.data` to the type `CuSparseMatrixCSC{ComplexF32, Int32}`
+- `CuSparseMatrixCSC(M::AbstractHEOMLSMatrix)` : Convert `M.data` to the type `CuSparseMatrixCSC{ComplexF32, Int32}`
+
+### Demonstration
+
+The extension is loaded automatically when the user imports the package `CUDA.jl`:
+
+````@example CUDA_Ext_example
+using BenchmarkTools
+using CUDA
+using HierarchicalEOM
+using LinearSolve # to change the solver for better GPU performance
+using Plots
+````
+
+### Check version info. of `HierarchicalEOM.jl`
+
+````@example CUDA_Ext_example
+HierarchicalEOM.versioninfo()
+````
+
+### Check version info. of `CUDA.jl`
+
+````@example CUDA_Ext_example
+CUDA.versioninfo()
+````
+
+### Setup
+
+Here, we demonstrate this extension using the example of [the single-impurity Anderson model](@ref exp-SIAM).
+
+````@example CUDA_Ext_example
+ϵ  = -5
+U  = 10
+Γ  = 2
+μ  = 0
+W  = 10
+kT = 0.5
+N  = 5
+tier  = 3
+
+tlist = 0f0:1f-1:10f0  # same as 0:0.1:10 but in the type of `Float32`
+ωlist = -10f0:1f0:10f0 # same as -10:1:10 but in the type of `Float32`
+
+σm = [0 1; 0  0]
+σz = [1 0; 0 -1]
+II = [1 0; 0  1]
+d_up = kron(     σm, II)
+d_dn = kron(-1 * σz, σm)
+ρ0   = kron([1 0; 0 0], [1 0; 0 0])
+Hsys = ϵ * (d_up' * d_up + d_dn' * d_dn) + U * (d_up' * d_up * d_dn' * d_dn)
+
+bath_up = Fermion_Lorentz_Pade(d_up, Γ, μ, W, kT, N)
+bath_dn = Fermion_Lorentz_Pade(d_dn, Γ, μ, W, kT, N)
+bath_list = [bath_up, bath_dn]
+
+# even HEOMLS matrix
+M_even_cpu = M_Fermion(Hsys, tier, bath_list; verbose=false)
+M_even_gpu = cu(M_even_cpu)
+
+# odd HEOMLS matrix
+M_odd_cpu  = M_Fermion(Hsys, tier, bath_list, ODD; verbose=false)
+M_odd_gpu  = cu(M_odd_cpu)
+
+# solve steady state with CPU
+ados_ss = SteadyState(M_even_cpu);
+````
+
+!!! note "Note"
+    This extension does not support solving [`SteadyState`](@ref doc-Stationary-State) on the GPU, since doing so is inefficient and might give wrong solutions. If you really want to obtain the stationary state with the GPU, you can repeatedly solve the [`evolution`](@ref doc-Time-Evolution) until it converges to the stationary state.
+
+### Solving time evolution with CPU
+
+````@example CUDA_Ext_example
+@benchmark ados_list_cpu = evolution(M_even_cpu, ρ0, tlist; verbose=false)
+````
+
+### Solving time evolution with GPU
+
+````@example CUDA_Ext_example
+@benchmark ados_list_gpu = evolution(M_even_gpu, ρ0, tlist; verbose=false)
+````
+
+### Solving Spectrum with CPU
+
+````@example CUDA_Ext_example
+@benchmark dos_cpu = spectrum(M_odd_cpu, ados_ss, d_up, ωlist; verbose=false)
+````
+
+### Solving Spectrum with GPU
+
+````@example CUDA_Ext_example
+@benchmark dos_gpu = spectrum(M_odd_gpu, ados_ss, d_up, ωlist; solver=KrylovJL_BICGSTAB(rtol=1f-10, atol=1f-12), verbose=false)
+````
diff --git a/docs/src/spectrum.md b/docs/src/spectrum.md
@@ -13,6 +13,9 @@ The function [`spectrum`](@ref) will automatically detect the [parity](@ref doc-
 
 `HierarchicalEOM.jl` wraps some of the functions in [LinearSolve.jl](http://linearsolve.sciml.ai/stable/), which is a very rich numerical library for solving the linear problems and provides many solvers. It offers quite a few options for the user to tailor the solver to their specific needs. The default solver (and its corresponding settings) are chosen to suit commonly encountered problems and should work fine for most of the cases. If you require more specialized methods, such as the choice of algorithm, please refer to [benchmark for LinearSolve solvers](@ref benchmark-LS-solvers) and also the documentation of [LinearSolve.jl](http://linearsolve.sciml.ai/stable/).
 
+!!! compat "Extension for CUDA.jl"
+    `HierarchicalEOM.jl` provides an extension to support GPU ([`CUDA.jl`](https://github.com/JuliaGPU/CUDA.jl)) acceleration for solving the spectrum, but this feature requires `Julia 1.9+` and `HierarchicalEOM 1.1+`. See [here](@ref doc-ext-CUDA) for more details.
+
 ## [Power Spectral Density](@id doc-PSD)
 Start from the spectrum for bosonic systems (power spectral density) in the time-domain. We write the system two-time correlation function in terms of the propagator ``\hat{\mathcal{G}}(t)=\exp(\hat{\mathcal{M}} t)`` for ``t>0``. The power spectral density ``S(\omega)`` can be obtained as
 ```math

diff --git a/docs/src/time_evolution.md b/docs/src/time_evolution.md
@@ -53,6 +53,10 @@ end
 ## Ordinary Differential Equation Method
 The first method is implemented by solving the ordinary differential equation (ODE) as shown above. `HierarchicalEOM.jl` wraps some of the functions in [`DifferentialEquations.jl`](https://diffeq.sciml.ai/stable/), which is a very rich numerical library for solving the differential equations and provides many ODE solvers. It offers quite a few options for the user to tailor the solver to their specific needs. The default solver (and its corresponding settings) are chosen to suit commonly encountered problems and should work fine for most of the cases. If you require more specialized methods, such as the choice of algorithm, please refer to [benchmarks for DifferentialEquations solvers](@ref benchmark-ODE-solvers) and also the documentation of [`DifferentialEquations.jl`](https://diffeq.sciml.ai/stable/).
 
+!!! compat "Extension for CUDA.jl"
+    `HierarchicalEOM.jl` provides an extension to support GPU ([`CUDA.jl`](https://github.com/JuliaGPU/CUDA.jl)) acceleration for solving the time evolution (only for ODE method with time-independent system Hamiltonian), but this feature requires `Julia 1.9+` and `HierarchicalEOM 1.1+`. See [here](@ref doc-ext-CUDA) for more details.
+
+
 ### Given the initial state as Density Operator (`AbstractMatrix` type)
 
 See the docstring of this method: