I do not know how to tune the amgcl to make solver converged #272
Comments
Hello,
I could not improve on the convergence results you show. You could try to use grid coordinates to generate near-nullspace vectors (see https://amgcl.readthedocs.io/en/latest/tutorial/Nullspace.html).
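For a 3D elasticity problem, a common choice of near-nullspace vectors is the six rigid body modes (three translations and three rotations) evaluated at the node coordinates. The sketch below is a standalone illustration of that construction, not AMGCL's own routine. The function name `rigid_body_modes_3d` and the row-major layout are assumptions made for this example. It also assumes three unknowns per node ordered as (ux, uy, uz), and one coordinate triple for every node that is still present in the matrix, so that the result has as many rows as the matrix.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of AMGCL): builds the six rigid body modes
// for a 3D problem with three unknowns (ux, uy, uz) per node.
// coo holds x0,y0,z0,x1,y1,z1,...; the result B is row-major with
// 3*nodes rows and 6 columns.
std::vector<double> rigid_body_modes_3d(const std::vector<double> &coo) {
    const std::size_t nodes = coo.size() / 3;
    std::vector<double> B(nodes * 3 * 6, 0.0);
    for (std::size_t i = 0; i < nodes; ++i) {
        const double x = coo[3 * i], y = coo[3 * i + 1], z = coo[3 * i + 2];
        double *rx = &B[(3 * i + 0) * 6];  // row of the ux unknown
        double *ry = &B[(3 * i + 1) * 6];  // row of the uy unknown
        double *rz = &B[(3 * i + 2) * 6];  // row of the uz unknown
        // Columns 0..2: unit translations along x, y, z.
        rx[0] = 1.0; ry[1] = 1.0; rz[2] = 1.0;
        // Column 3: rotation about z, displacement (-y, x, 0).
        rx[3] = -y; ry[3] = x;
        // Column 4: rotation about x, displacement (0, -z, y).
        ry[4] = -z; rz[4] = y;
        // Column 5: rotation about y, displacement (z, 0, -x).
        rx[5] = z; rz[5] = -x;
    }
    return B;
}
```

Each rotation column is the cross product of the rotation axis with the node position, so every column describes a displacement field that causes no strain in the unconstrained body.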
The CUDA backend does not support block-valued solutions, but you could use the VexCL backend with CUDA as the underlying VexCL backend. There is an example of using the VexCL backend with block-valued systems here: https://amgcl.readthedocs.io/en/latest/tutorial/Serena.html
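To make "block-valued" concrete: with 3x3 blocks, a matrix such as A_Shoe with 48033 scalar rows is treated as 16011 rows of small dense blocks. The sketch below converts a scalar CSR matrix into a block CSR layout in plain C++. It only illustrates the data layout and is not how AMGCL or VexCL implement it. The `Csr` and `BlockCsr` structs and the `to_block_csr` function are hypothetical names, and the sketch assumes the matrix size is divisible by the block size.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical containers used for this illustration only.
struct Csr {
    std::size_t n;                 // number of scalar rows
    std::vector<std::size_t> ptr;  // row pointers, size n + 1
    std::vector<std::size_t> col;  // scalar column indices
    std::vector<double> val;       // scalar values
};

struct BlockCsr {
    std::size_t nb;                // number of block rows
    std::size_t bs;                // block size
    std::vector<std::size_t> ptr;  // block-row pointers, size nb + 1
    std::vector<std::size_t> col;  // block column indices
    std::vector<double> val;       // bs*bs values per block, row-major
};

// Groups scalar entries into bs x bs blocks; missing entries become zeros.
BlockCsr to_block_csr(const Csr &A, std::size_t bs) {
    BlockCsr B;
    B.nb = A.n / bs;
    B.bs = bs;
    B.ptr.assign(1, 0);
    for (std::size_t ib = 0; ib < B.nb; ++ib) {
        // Collect the blocks of this block row, keyed by block column.
        std::map<std::size_t, std::vector<double>> blocks;
        for (std::size_t r = 0; r < bs; ++r) {
            const std::size_t row = ib * bs + r;
            for (std::size_t j = A.ptr[row]; j < A.ptr[row + 1]; ++j) {
                std::vector<double> &blk = blocks[A.col[j] / bs];
                if (blk.empty()) blk.assign(bs * bs, 0.0);
                blk[r * bs + A.col[j] % bs] += A.val[j];
            }
        }
        // std::map iterates in increasing key order, so columns are sorted.
        for (const auto &kv : blocks) {
            B.col.push_back(kv.first);
            B.val.insert(B.val.end(), kv.second.begin(), kv.second.end());
        }
        B.ptr.push_back(B.col.size());
    }
    return B;
}
```

The block form stores one column index per block instead of one per scalar entry, which is part of why block-valued backends can be cheaper per iteration for problems with several unknowns per node.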
You should put it into the solution vector. Vector x is treated as both input (the initial approximation) and output (the result).
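The in/out convention for x can be illustrated without AMGCL. The sketch below is a plain, unpreconditioned conjugate gradient loop on a small dense symmetric positive definite matrix, where `x` is read as the initial approximation on entry and overwritten with the result on exit. The names `cg_solve`, `matvec`, and `dot` are made up for this example, and the solver is deliberately simplistic.

```cpp
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Dense matrix-vector product y = A * x, with A stored row-major (n x n).
static Vec matvec(const Vec &A, const Vec &x) {
    const std::size_t n = x.size();
    Vec y(n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            y[i] += A[i * n + j] * x[j];
    return y;
}

static double dot(const Vec &a, const Vec &b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Hypothetical helper: unpreconditioned CG for an SPD matrix.
// x is the initial approximation on entry and the solution on exit.
// Returns the number of iterations performed.
std::size_t cg_solve(const Vec &A, const Vec &f, Vec &x,
                     double tol = 1e-10, std::size_t maxiter = 1000) {
    const std::size_t n = f.size();
    Vec Ax = matvec(A, x);
    Vec r(n);
    for (std::size_t i = 0; i < n; ++i) r[i] = f[i] - Ax[i];
    Vec p = r;
    double rr = dot(r, r);
    // Stop when ||r|| <= tol * ||f|| (compared in squared form).
    const double stop = tol * tol * dot(f, f);
    std::size_t iter = 0;
    while (iter < maxiter && rr > stop) {
        Vec Ap = matvec(A, p);
        const double alpha = rr / dot(p, Ap);
        for (std::size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        const double rr_new = dot(r, r);
        for (std::size_t i = 0; i < n; ++i)
            p[i] = r[i] + (rr_new / rr) * p[i];
        rr = rr_new;
        ++iter;
    }
    return iter;
}
```

Calling `cg_solve` a second time with the `x` left over from a previous solve should need far fewer iterations than a start from zero. With AMGCL, the analogous step would be to leave the previous solution in x rather than zeroing it before the solve.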
Hello Sir, I have been studying and trying many options in AMGCL for the past few days, but there has still been no progress. After using the nullspace, I found that the improvement in convergence is not significant. I'm not sure if this is because the C matrix I constructed is incorrect. My stiffness matrix is a general tetrahedral stiffness matrix, and I have removed the rows and columns corresponding to the fixed points. This means that I have also removed the coordinates of the fixed points from the C matrix (this matrix is also uploaded at the previous link), and I am not sure if this approach is correct. Additionally, I tried using ns_search to find an approximate nullspace, but the results are still not satisfactory.
Here are some simple results from my tests: currently, using the CG solver with the block option gives the best result, but it takes about 1.6 seconds. I have also implemented the VexCL backend with block-valued systems with the CUDA option, but it also takes about 0.62 seconds to compute. However, when I tested with Intel MKL's Pardiso library, the solve took only about 0.608 seconds. Do you have any further suggestions?
./solver -A A_Shoe.mtx -f f_Shoe.mtx solver.maxiter=1000 -b3
Solver
Type: BiCGStab
Preconditioner
Number of levels: 3
level unknowns nonzeros memory
Iterations: 290
[Profile: 3.483 s] (100.00%)
./solver -A A_Shoe.mtx -f f_Shoe.mtx solver.type=cg solver.maxiter=1000 -b3
Solver
Type: CG
Preconditioner
Number of levels: 3
level unknowns nonzeros memory
Iterations: 513
[Profile: 3.301 s] (100.00%)
./solver -A A_Shoe.mtx -f f_Shoe.mtx solver.type=cg solver.maxiter=1000 -b3 precond.coarsening.aggr.eps_strong=0 -C C_Shoe.mtx
Iterations: 890
[Profile: 5.619 s] (100.00%)
./solver -A A_Shoe.mtx -f f_Shoe.mtx solver.type=cg solver.maxiter=1000 precond.coarsening.aggr.eps_strong=0 -C C_Shoe.mtx
Iterations: 471
[Profile: 5.453 s] (100.00%)
./ns_search -A A_Shoe.mtx -f f_Shoe.mtx solver.type=cg solver.maxiter=1000 precond.coarsening.aggr.eps_strong=0 -n6 -o N6.mtx
./solver -A A_Shoe.mtx -f f_Shoe.mtx solver.type=cg solver.maxiter=1000 precond.coarsening.aggr.eps_strong=0 -N N6.mtx
Iterations: 235
[Profile: 3.683 s] (100.00%)
./solver -A A_Shoe.mtx -f f_Shoe.mtx solver.type=cg solver.maxiter=1000 -b3 precond.coarsening.aggr.eps_strong=0 -N N6.mtx
[Profile: 3.762 s] (100.00%)
VexCL backend with block-valued systems with the CUDA option:
Matrix A_0 has 48033 rows and 48033 columns.
It looks like the problem is just too difficult to solve with a simple iterative approach. Intel Pardiso is a direct solver, so it does not have this problem. Direct solvers usually do not scale as well as iterative ones, but if you do not need to solve larger systems and are satisfied with the Pardiso performance, then maybe the direct solver is your best approach here. Otherwise, you probably need to research methods for solving your particular problem. I am sorry I could not be of more help.
Hello Sir, I want to express my heartfelt gratitude for your outstanding work in creating AMGCL!
I have encountered slow convergence when solving the linear system with matrix A_hand. A_hand, along with vector f_hand, is computed from a 3D tetrahedral mesh used for Finite Element Method (FEM) object simulation. A_hand is a 176328 x 176328 matrix. I have also followed the recommendations mentioned in issue #175. Specifically, I have used the provided main_2.cpp, which employs a block structure with 3x3 blocks and sets precond.coarsening.aggr.block_size=3. I have also experimented with another mesh, A_Shoe, which has only 48033 unknowns. However, it still requires an excessive number of iterations to achieve a relatively better result, as indicated below.
I have three questions here:
1. Is there a more effective approach to expedite the convergence of the linear system? I am uncertain why the convergence for such a matrix is slow and how to fine-tune the AMG solver. To provide more insight, I have included a matrix portrait below.
2. How can I implement the block structure in the CUDA-compatible AMG solver? The file with a block structure I previously used was designed for Eigen, and I am unsure how to adapt it to the CUDA version (block structure for CUDA?).
3. If I could get a good initial solution for the result vector x, how do I specify it in the code? (I am working on a project that designs the material distribution in an object, updating the material in each iteration, so I think the solution computed in the previous iteration would help convergence in the next one.)
Attached, you will find the main.cpp and the results. The matrix can be downloaded from the provided link.
I would be immensely grateful for any suggestions you may offer regarding optimal parameters and the combination of options to expedite the solution.
Thank you very much for your time, Sir.
result of A_hand
Matrix A has 176328 rows and 176328 columns.
Matrix A is rowmajor
Solver
Type: BiCGStab
Unknowns: 58776
Memory footprint: 9.42 M
Preconditioner
Number of levels: 3
Operator complexity: 1.35
Grid complexity: 1.08
Memory footprint: 113.82 M
level unknowns nonzeros memory
500 4.39234e-06
[Profile: 20.294 s] (100.00%)
[ self: 7.117 s] ( 35.07%)
[ setup: 0.149 s] ( 0.73%)
[ solve: 13.028 s] ( 64.20%)
result of A_Shoe
Matrix A has 48033 rows and 48033 columns.
Matrix A is rowmajor
Solver
Type: BiCGStab
Unknowns: 16011
Memory footprint: 2.57 M
Preconditioner
Number of levels: 3
Operator complexity: 1.13
Grid complexity: 1.08
Memory footprint: 25.40 M
level unknowns nonzeros memory
399 7.81433e-09
[Profile: 3.833 s] (100.00%)
[ self: 1.882 s] ( 49.10%)
[ setup: 0.022 s] ( 0.58%)
[ solve: 1.929 s] ( 50.32%)
Matrix portrait of A_hand
Matrix portrait of A_shoe