GMRES Fails Silently From Stagnation #729

packquickly · 2023-05-03T20:25:32Z

GMRES often fails to converge when restart is less than the problem dimension, but will return without indicating failure.

This failure (likely due to stagnation) seems to depend on the absolute difference between the problem dimension and the Krylov subspace dimension (restart.) ie. roughly 70% of the time GMRES will fail silently on a problem of dimension 25 if restart = 24, and will still fail 70% of the time with a problem of dimension 100 if restart = 99. For a difference in dimension of 5, the test below returns an incorrect value greater than 99% of the time.

Reproducing code:

import Krylov: gmres
import LinearAlgebra: cond, norm

for problem_dim in [25, 100]
    for excess_dim in [0, 1, 5]
        succeeded = 0
        restart = problem_dim - excess_dim
        for i = 1:100 
            condition_number = Inf
            matrix = nothing
            while condition_number > 1000
                matrix = randn(Float64, (problem_dim, problem_dim))
                condition_number = cond(matrix)
            end

            true_vec = randn(Float64, (problem_dim,))
            b = matrix * true_vec

            (julia_soln, state) = gmres(matrix, b, memory=restart, restart=true)

            residual_norm = norm(julia_soln - true_vec)

            succeeded += (residual_norm < 1)  # very loose tolerance!
        end
        println("problem dim: $problem_dim. restart dim: $restart. succeeded: $succeeded")
    end
end

amontoison · 2023-05-03T20:41:48Z

Hi @RaderJason!

Did you look at the stats structure returned by GMRES?
You just need to check stats.solved and stats.status.

packquickly · 2023-05-03T21:12:00Z

Ah! Indeed these are caught by stats. However, why does this not return an error outright?

amontoison · 2023-05-03T21:43:01Z

Because it's not an error if GMRES didn't converged. :)
The options that you provide (tolerances, restart, preconditioners, etc...) probably need to be adjusted.
If GMRES doesn't return an error, you are also able to use the approximate solution in another Krylov method or switch to a direct sparse solver.

You could also use atol=0.0 and rtol=0.0 and returned the best solution after a fixed number of iterations (or amount of time).

packquickly · 2023-05-03T22:08:14Z

Okay, just note that downstream applications are not taking this into account, and are failing silently (how I found this.)

The options are somewhat of a mixed bag from what I've gathered. Tolerance isn't applicable in the test I gave above. This fails by a very wide margin. restarts is also not a key option. 70% of the time when restart is 1 dimension less than the dimension of the linear problem the test above fails. ie. you'd need to set restarts >= problem_dim, recovering gmres and not gmres + restarts.

I am abusing this algorithm somewhat, as I'm using a dense matrix rather than a sparse one. This does certainly play a role, but the failure rate of gmres + restarts is surprisingly high even in the sparse setting for a small difference between the problem dimension and the Krylov subspace dimension.

Note that everything I mentioned (except perhaps disagreeing about whether this should raise an error :) ) is an issue with gmres + restarts as an algorithm and not the implementation here. Feel free to close the issue on this basis.

amontoison · 2023-05-03T22:50:50Z

Okay, just note that downstream applications are not taking this into account, and are failing silently (how I found this.)

Except LinearSolve.jl, do you know other packages that are not taking this into account?
If you want to use Krylov methods, don't use LinearSolve.jl directly.
KrylovKit.jl doesn't support preconditioners and IterativeSolvers.jl in unmaintained with many bugs.

The options are somewhat of a mixed bag from what I've gathered. Tolerance isn't applicable in the test I gave above. This fails by a very wide margin. restarts is also not a key option. 70% of the time when restart is 1 dimension less than the dimension of the linear problem the test above fails. ie. you'd need to set restarts >= problem_dim, recovering gmres and not gmres + restarts.

It depends of your linear system but with a dense random matrix, it's not relevant to test the restart option.
The eigenvalues are not clustered and you have this kind of behaviour.

I am abusing this algorithm somewhat, as I'm using a dense matrix rather than a sparse one. This does certainly play a role, but the failure rate of gmres + restarts is surprisingly high even in the sparse setting for a small difference between the problem dimension and the Krylov subspace dimension.

It's normal because the convergence rate of Krylov methods depends of the eigenvalues / singular values.

Note that everything I mentioned (except perhaps disagreeing about whether this should raise an error :) ) is an issue with gmres + restarts as an algorithm and not the implementation here. Feel free to close the issue on this basis.

I should probably add a comment about the restart option in the docstring of GMRES and GPMR.
For information DQGMRES is a good alternative to restarted GMRES.

This was referenced May 3, 2023

Every GMRES Algorithm Fails Silently From Stagnation SciML/LinearSolve.jl#297

Closed

GMRES Fails Silently From Stagnation Jutho/KrylovKit.jl#70

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GMRES Fails Silently From Stagnation #729

GMRES Fails Silently From Stagnation #729

packquickly commented May 3, 2023 •

edited

Loading

amontoison commented May 3, 2023 •

edited

Loading

packquickly commented May 3, 2023

amontoison commented May 3, 2023 •

edited

Loading

packquickly commented May 3, 2023

amontoison commented May 3, 2023

GMRES Fails Silently From Stagnation #729

GMRES Fails Silently From Stagnation #729

Comments

packquickly commented May 3, 2023 • edited Loading

amontoison commented May 3, 2023 • edited Loading

packquickly commented May 3, 2023

amontoison commented May 3, 2023 • edited Loading

packquickly commented May 3, 2023

amontoison commented May 3, 2023

packquickly commented May 3, 2023 •

edited

Loading

amontoison commented May 3, 2023 •

edited

Loading

amontoison commented May 3, 2023 •

edited

Loading