Modify damping and add lambda_max #65

Open · wants to merge 4 commits into base: master
28 changes: 13 additions & 15 deletions src/simulate.jl
@@ -75,16 +75,16 @@ Run a simulation for the given model, simulation plan and exogenous data.
* `model` - the [`Model`](@ref ModelBaseEcon.Model) instance to simulate.
* `plan` - the [`Plan`](@ref) for the simulation.
* `data` - a 2D `Array` containing the exogenous data. This includes the
initial and final conditions.

### Options as keyword arguments
* `fctype::`[`FinalCondition`](@ref) - set the desired final condition type
for the simulation. The default value is [`fcgiven`](@ref). Other possible
values include [`fclevel`](@ref), [`fcslope`](@ref) and
[`fcnatural`](@ref).
* `initial_guess::AbstractMatrix{Float64}` - a 2D `Array` containing the
initial guess for the solution. This is used to start the Newton-Raphson
algorithm. The default value is an empty array (`zeros(0,0)`), in which case
  we use the exogenous data for the initial condition. You can supply the steady
  state solution using [`steadystatearray`](@ref).
* `deviation::Bool` - set to `true` if the `data` is given in deviations from
@@ -97,10 +97,10 @@ Run a simulation for the given model, simulation plan and exogenous data.
* `verbose::Bool` - control whether or not to print progress information.
Default value is taken from `model.options`.
* `tol::Float64` - set the desired accuracy. Default value is taken from
`model.options`.
* `maxiter::Int` - algorithm fails if the desired accuracy is not reached
within this maximum number of iterations. Default value is taken from
`model.options`.
The following options are specific to the `:stackedtime` solver
* `sim_solver` - specify the non-linear solver to use. Available options are
- `:sim_nr` : (default) Newton-Raphson, with possible damping, see below.
@@ -111,7 +111,7 @@ The following options are specific to the `:stackedtime` solver
will do this at each iteration. Default is `false`. (Superseded by the
`damping` option described below)
* `damping` - Specifies the style of damping that can be applied to the
Newton non-linear solver. Available options are:
  - if not given, the default behaviour is no damping, i.e. the damping
coefficient is set to 1.0 in each iteration.
- number: the damping coefficient will be set to the given number (rather than 1)
@@ -122,13 +122,11 @@ The following options are specific to the `:stackedtime` solver
- `:linesearch` or `:armijo` : same as setting `linesearch=true`. The
Armijo rule is taken from "C.T.Kelly, Iterative Methods for Linear and
Nonlinear Equations, ch.8.1, p.137"
- `(:armijo, :sigma => 0.5, :alpha => 1e-4)` - override the default
parameters of the Armijo rule.
- `:br81` : (experimental) implements the damping algorithm in "Bank, R.E.,
Rose, D.J. Global approximate Newton methods. Numer. Math. 37, 279–295
(1981)."
- `(:br81, :rateK => 10, :delta => 0.1)` : override the default parameters
of the Bank & Rose (1981) algorithm.
- `(:armijo, :sigma => 0.5, :alpha => 1e-4, :lambda_min => 1e-5, :lambda_max => 1.0, :lambda_growth => 1.05)` -
extends the Armijo rule with additional controls for minimum and maximum damping factors and a lambda growth parameter.
- `:br81` - (experimental) implements the damping algorithm from "Bank, R.E., Rose, D.J. Global approximate Newton methods. Numer. Math. 37, 279–295 (1981)."
  - `(:br81, :rateK => 10, :delta => 0.1, :lambda_min => 1e-5, :lambda_max => 1.0, :lambda_growth => 1.05)` -
    extends `:br81` with additional controls for minimum and maximum damping factors and a lambda growth parameter.

"""
function simulate end
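
For orientation, here is a hedged usage sketch showing how the `damping` options documented above might be passed to `simulate`. The model, plan and exogenous data objects (`m`, `p`, `exog`) are hypothetical placeholders and are not part of this pull request.

```julia
# Minimal usage sketch (assumes a model `m`, plan `p` and exogenous data `exog`
# have already been constructed; those names are placeholders).

# Plain Newton-Raphson with a fixed damping coefficient of 0.8:
res1 = simulate(m, p, exog; fctype=fcnatural, damping=0.8)

# Armijo linesearch with the new lambda_min / lambda_max / lambda_growth controls:
res2 = simulate(m, p, exog; fctype=fcnatural,
    damping=(:armijo, :sigma => 0.5, :alpha => 1e-4,
             :lambda_min => 1e-5, :lambda_max => 1.0, :lambda_growth => 1.05))

# Experimental Bank & Rose (1981) damping:
res3 = simulate(m, p, exog; fctype=fcnatural,
    damping=(:br81, :rateK => 10, :delta => 0.1, :lambda_max => 1.0))
```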
83 changes: 53 additions & 30 deletions src/stackedtime/sim_nr.jl
@@ -24,21 +24,21 @@ Solve the simulation problem using a Newton iteration with damping.
- `damping_schedule(vector)` returns `λ=vector[it]` on iteration `it`. If
the vector is shorter than the number of iterations, it keeps returning
  `vector[end]`.
- `damping_amijo(α=1e-4, σ=0.5)` implements a standard linesearch algorithm
- `damping_armijo(sigma = 0.5, alpha = 1e-4, lambda_min = 1e-5, lambda_max = 1.0, lambda_growth = 1.05)` implements a standard linesearch algorithm
based on the Armijo rule
- `damping_bank_rose(delta=0.1, rateK=10.0)` implements a the damping
algorithm of Bank and Rose 1980

- `damping_br81(delta = 0.1, rateK = 10.0, lambda_min = 1e-5, lambda_max = 1.0, lambda_growth = 1.05)` implements the damping
algorithm of Bank and Rose (1981)
##### Conventions for a custom damping function.
The `damping` callback function is expected to have the following signature:

function custom_damping(k::Int, λ::Float64, nR::Float64, R::AbstractVector{Float64},
J::Union{Nothing,Factorization,AbstractMatrix{Float64}}=nothing,
Δx::Union{Nothing,AbstractVector{Float64}}=nothing
)::Tuple{Bool, Float64}
# <your code goes here>
end

The first call will be with `k=0`, before the solver enters the Newton
iterations loop. This should allow any initialization and defaults to be set up.
In this call, the values of `R` and `nR` will equal the residual and its norm at
@@ -49,7 +49,7 @@ Each subsequent call, with `k` between 1 and `maxiter` (possibly multiple
calls with the same `k`), will have the current `λ` (which equals the one returned by the
previous call), the current `R` (and its Inf-norm `nR`), the Jacobian `J` and
the Newton direction `Δx`.

The damping function must return a tuple `(accept, λ)`. The same Newton
iteration `k` will continue until the damping function returns `accept=true`,
after which the next Newton iteration (`k = k+1`) begins.
@@ -69,7 +69,7 @@ to accept this step, by returning `(true, λ)`, or reject it and propose a new
to try, by returning `(false, new_λ)`. Don't return `(false, λ)` because this
will make it an infinite loop. Good luck!

"""
"""
function sim_nr!(x::AbstractArray{Float64}, sd::StackedTimeSolverData,
maxiter::Int64, tol::Float64, verbose::Bool, damping::Function
)
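
To make the callback convention described above concrete, here is a minimal sketch of a custom damping function. The name `half_step_damping` and its constant λ = 0.5 policy are illustrative assumptions, not part of this pull request.

```julia
using LinearAlgebra  # for Factorization in the signature

# A minimal custom damping callback following the documented signature.
# It always accepts the proposed step with a constant λ of 0.5; the k=0
# initialization call simply sets the starting λ.
function half_step_damping(k::Int, λ::Float64, nR::Float64, R::AbstractVector{Float64},
    J::Union{Nothing,Factorization,AbstractMatrix{Float64}}=nothing,
    Δx::Union{Nothing,AbstractVector{Float64}}=nothing
)::Tuple{Bool,Float64}
    k < 1 && return (true, 0.5)   # initialization call before the Newton loop
    return (true, 0.5)            # accept every step with λ = 0.5
end

# It could then be passed to the solver in place of the built-in dampers, e.g.
# sim_nr!(x, sd, maxiter, tol, verbose, half_step_damping)
```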
@@ -141,44 +141,63 @@ function damping_schedule(lambda::AbstractVector{<:Real}; verbose::Bool=false)
end

# the Armijo rule: C.T.Kelly, Iterative Methods for Linear and Nonlinear Equations, ch.8.1, p.137
function damping_armijo(; alpha::Real=1e-4, sigma::Real=0.5, lambda_min::Real=0.00001, verbose::Bool=false)
function damping_armijo(; alpha::Real=1e-4, sigma::Real=0.5, lambda_min::Real=1e-5, lambda_max::Real=1.0, lambda_growth::Real=1.1, verbose::Bool=false)
α = convert(Float64, alpha)
σ = convert(Float64, sigma)
λ_min = convert(Float64, lambda_min)
λ_max = convert(Float64, lambda_max)
λ_growth = convert(Float64, lambda_growth)
nF2_it = 0 # iteration number at which nF2 is valid
nF2 = NaN # the norm of the residual at the beginning of iteration nF2_it
return function (it::Int, λ::Float64, nF::Float64, F::AbstractVector{Float64},
::Union{Nothing,Factorization,AbstractMatrix{Float64}}=nothing,
::Union{Nothing,AbstractVector{Float64}}=nothing
)
# @printf " it=%d, λ=%g, nF=%g\n" it λ nF
it < 1 && return true, 1.0
it < 1 && return true, min(1.0, λ_max)

if nF2_it != it
# first time we're called this iteration
nF2 = norm(F, 2) # store the residual
# First call in this iteration: Store the residual norm
nF2 = norm(F, 2)
nF2_it = it
return false, 1.0 # try λ=1.0, a full Newton step, first
return false, min(1.0, λ_max)
end

if λ < λ_min
# λ too small
verbose && @warn "Linesearch failed."
verbose && @warn "Linesearch failed: λ fell below λ_min."
return true, λ
end

if norm(F, 2) < (1.0 - α * λ) * nF2
# Armijo test pass => accept the given λ
return true, λ
# Armijo test passed => accept the given λ
new_λ = min(λ * λ_growth, λ_max) # Gradually increase λ but cap at λ_max

Reviewer comment (Contributor): We should call this solver something else if we are not in fact implementing the Armijo rule.

if abs(norm(F, 2) - nF2) < 1e-12 # Convergence check to break loops
verbose && @info "Solver converged: residual change too small."

Reviewer comment (Contributor): Check for verbose should be before the check on the norm to save processing.

return true, new_λ
end

return true, new_λ
else
# reject and try a smaller λ
return false, σ * λ
# Reject and try a smaller λ
new_λ = max(σ * λ, λ_min)

if λ == new_λ # Prevent infinite shrinking loops
verbose && @warn "Stuck in shrinking loop, forcing exit."
return true, λ
end

return false, new_λ
end
end
end
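
As an aside to the review, here is a small standalone illustration of the λ update arithmetic used in the modified rule above (shrink by `sigma` down to `lambda_min` on rejection, grow by `lambda_growth` up to `lambda_max` on acceptance). The numbers are illustrative only and do not run the solver itself.

```julia
# Standalone illustration of the λ update rules in the modified Armijo damper.
let σ = 0.5, λ_growth = 1.05, λ_min = 1e-5, λ_max = 1.0
    λ = 1.0
    for _ in 1:4                      # four rejected trials within one Newton iteration
        λ = max(σ * λ, λ_min)         # shrink, but never below λ_min
    end
    println("after rejections: λ = ", λ)   # 0.0625

    for _ in 1:3                      # three accepted Newton iterations
        λ = min(λ * λ_growth, λ_max)  # grow gradually, but never above λ_max
    end
    println("after acceptances: λ ≈ ", λ)  # ≈ 0.0724
end
```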

# Bank, R.E., Rose, D.J. Global approximate Newton methods. Numer. Math. 37, 279–295 (1981).
# https://doi.org/10.1007/BF01398257
function damping_br81(; delta::Real=0.1, rateK::Real=10.0, lambda_min::Real=1e-5, verbose::Bool=false)
function damping_br81(; delta::Real=0.1, rateK::Real=10.0, lambda_min::Real=1e-5, lambda_max::Real=1.0, lambda_growth::Real=1.05, verbose::Bool=false)
δ = convert(Float64, delta)
λ_min = convert(Float64, lambda_min)
λ_max = convert(Float64, lambda_max)
λ_growth = convert(Float64, lambda_growth)
bigK = 0.0 # Initialize with 0.0 (effectively the full Newton step)
nF2_it = 0 # iteration number at which nF2 is valid
nF2 = NaN # the norm of the residual at the beginning of iteration nF2_it
@@ -187,33 +187,37 @@ function damping_br81(; delta::Real=0.1, rateK::Real=10.0, lambda_min::Real=1e-5
::Union{Nothing,Factorization,AbstractMatrix{Float64}}=nothing,
::Union{Nothing,AbstractVector{Float64}}=nothing
)
# @printf " it=%d, λ=%g, nF=%g\n" it λ nF
it < 1 && (bigK = 0.0; return true, 1.0)
# Initialization step
it < 1 && (bigK = 0.0; return true, λ_max)

if nF2_it != it
# first time we're called this iteration
# First time we're called in this iteration
nF2 = norm(F, 2) # store the residual
nF2_it = it
return false, calc_λ()

Reviewer comment (Contributor): calc_lambda should respect/enforce lambda_max, or is that option only in the signature to make all the signatures align?

end

if (1 - δ * λ) * nF2 < norm(F, 2)
# test failed => reject and try smaller λ
# If test failed, decrease step size
if bigK == 0.0
bigK = 1.0
else
bigK = rateK * bigK
bigK *= rateK # Increase `bigK` slower to prevent excessive reductions in λ
end
λ = calc_λ()
if λ > λ_min
return false, λ
else
# λ too small
verbose && @warn "Linesearch failed."
return true, λ_min
end
else
# lower bigK for next iteration ...
bigK = bigK / rateK
# ... and accept given λ
# Lower `bigK` more aggressively when convergence is happening
bigK /= sqrt(rateK)

Reviewer comment (Contributor): I'm not sure about this change; why not have a different rateK? Using the square root is actually less aggressive than the previous implementation since rateK > 1. I don't have access to the Bank & Rose paper, but we should also write a different dampener here if we are changing the algorithm. Same thing with the lambda growth line below.

# If λ is near the lower bound for many steps, slowly increase it
λ = min(λ * λ_growth, λ_max)

Reviewer comment (Contributor): It's not quite clear to me why we need to grow lambda here? The br algorithm already increases lambda when it lowers bigK.

return true, λ
end
end
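
The closure `calc_λ` used above is not shown in this diff. For context on the reviewer question about `lambda_max`, here is a hypothetical standalone sketch assuming the standard Bank & Rose (1981) damping formula λ = 1 / (1 + K·‖F‖₂); the function name, the clamp, and the exact formula are assumptions, not the package's actual implementation.

```julia
# Hypothetical standalone version of calc_λ (the real one is a closure inside
# damping_br81 and is not shown in this diff). The clamp to [λ_min, λ_max]
# illustrates one way it could enforce the new lambda_max bound asked about above.
calc_lambda_sketch(bigK::Float64, nF2::Float64; λ_min::Float64=1e-5, λ_max::Float64=1.0) =
    clamp(1.0 / (1.0 + bigK * nF2), λ_min, λ_max)

calc_lambda_sketch(0.0, 3.2)    # bigK == 0 gives the full Newton step, λ == λ_max
calc_lambda_sketch(10.0, 3.2)   # larger bigK shrinks the step, λ ≈ 0.03
```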