Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TrustRegion bugs #210

Merged
merged 23 commits into from
Sep 29, 2023
Merged
Changes from 1 commit
Commits
Show all changes
23 commits
Select commit Hold shift + click to select a range
081f3b1
fix incorrect gradient and Gauss-Newton Hessian proxy
FHoltorf Sep 14, 2023
f20e897
fix Cauchy point calculation in dogleg step
FHoltorf Sep 14, 2023
6f3556e
expose quadratic form structure explicitly
FHoltorf Sep 14, 2023
146dec9
add NLsolve trust region updating scheme and change GN step to -J\fu …
FHoltorf Sep 14, 2023
884aafc
add Nocedal and Wright trust region updating scheme
FHoltorf Sep 14, 2023
094eb34
add meaningful description for NLsolve and NW trust region updating s…
FHoltorf Sep 25, 2023
0e99655
cache memory for cauchy step to enable non-allocating code
FHoltorf Sep 25, 2023
439415b
parameter types should not be converted to eltype(u). For now, defaul…
FHoltorf Sep 26, 2023
0ba652b
finish rebase to master
FHoltorf Sep 26, 2023
6f7ef29
introduce better variable names (and also ones that are more consiste…
FHoltorf Sep 26, 2023
79ab992
fix consistency test
FHoltorf Sep 26, 2023
b749501
fix oop perform step and Fan scheme initialization
FHoltorf Sep 26, 2023
21d246d
improve comment
FHoltorf Sep 26, 2023
177def2
use less conservative step acceptance policy
FHoltorf Sep 26, 2023
0190764
choose Float64 as default type for trust region adaptation parameters…
FHoltorf Sep 26, 2023
5f298e3
hardcode NLsolve parameters
FHoltorf Sep 26, 2023
b2b5d89
add NLsolve-like trust region initialization
FHoltorf Sep 27, 2023
f67ced5
avoid recomputation of GN step if TR step was rejected. Faster and av…
FHoltorf Sep 27, 2023
34821e5
run SciML formatter
FHoltorf Sep 27, 2023
609f67c
rename NW -> NocedalWright
FHoltorf Sep 27, 2023
32cf735
convergence check for NocedalWright
FHoltorf Sep 27, 2023
e77fa76
test new trust region schemes
FHoltorf Sep 27, 2023
f5c66c4
run formatter
FHoltorf Sep 27, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
47 changes: 25 additions & 22 deletions src/trustRegion.jl
Original file line number Diff line number Diff line change
Expand Up @@ -203,6 +203,7 @@
shrink_counter::Int
du
u_tmp
u_gauss_newton
u_cauchy
fu_new
make_new_J::Bool
Expand All @@ -229,6 +230,7 @@
linsolve_kwargs)
u_tmp = zero(u)
u_cauchy = zero(u)
u_gauss_newton = zero(u)

loss_new = loss
H = zero(J)
Expand All @@ -246,8 +248,8 @@
# set default type for all trust region parameters
trustType = floatType
if radius_update_scheme == RadiusUpdateSchemes.NLsolve
max_trust_radius = convert(trustType, Inf)
initial_trust_radius = norm(u0) > 0 ? convert(trustType, norm(u0)) : one(trustType)

Check warning on line 252 in src/trustRegion.jl

View check run for this annotation

Codecov / codecov/patch

src/trustRegion.jl#L251-L252

Added lines #L251 - L252 were not covered by tests
else
max_trust_radius = convert(trustType, alg.max_trust_radius)
if iszero(max_trust_radius)
Expand All @@ -265,14 +267,13 @@
expand_factor = convert(trustType, alg.expand_factor)

# Parameters for the Schemes
floatType = typeof(r)
p1 = convert(floatType, 0.0)
p2 = convert(floatType, 0.0)
p3 = convert(floatType, 0.0)
p4 = convert(floatType, 0.0)
ϵ = convert(floatType, 1.0e-8)
if radius_update_scheme === RadiusUpdateSchemes.NLsolve
p1 = convert(floatType, 0.5)

Check warning on line 276 in src/trustRegion.jl

View check run for this annotation

Codecov / codecov/patch

src/trustRegion.jl#L276

Added line #L276 was not covered by tests
elseif radius_update_scheme === RadiusUpdateSchemes.Hei
step_threshold = convert(trustType, 0.0)
shrink_threshold = convert(trustType, 0.25)
Expand Down Expand Up @@ -321,28 +322,30 @@
jac_cache, false, maxiters, internalnorm, ReturnCode.Default, abstol, prob,
radius_update_scheme, initial_trust_radius, max_trust_radius, step_threshold,
shrink_threshold, expand_threshold, shrink_factor, expand_factor, loss, loss_new,
H, g, shrink_counter, du, u_tmp, u_cauchy, fu_new, make_new_J, r, p1, p2, p3, p4, ϵ,
H, g, shrink_counter, du, u_tmp, u_gauss_newton, u_cauchy, fu_new, make_new_J, r, p1, p2, p3, p4, ϵ,
NLStats(1, 0, 0, 0, 0))
end

isinplace(::TrustRegionCache{iip}) where {iip} = iip

function perform_step!(cache::TrustRegionCache{true})
@unpack make_new_J, J, fu, f, u, p, u_tmp, alg, linsolve = cache
@unpack make_new_J, J, fu, f, u, p, u_gauss_newton, alg, linsolve = cache
if cache.make_new_J
jacobian!!(J, cache)
mul!(cache.H, J', J)
mul!(cache.g, J', fu)
cache.stats.njacs += 1
end

# do not use A = cache.H, b = _vec(cache.g) since it is equivalent
# to A = cache.J, b = _vec(fu) as long as the Jacobian is non-singular
linres = dolinsolve(alg.precs, linsolve, A = J, b = _vec(fu),
linu = _vec(u_tmp),
# do not use A = cache.H, b = _vec(cache.g) since it is equivalent
# to A = cache.J, b = _vec(fu) as long as the Jacobian is non-singular
linres = dolinsolve(alg.precs, linsolve, A = J, b = _vec(fu),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems to be mutating J (resp. A) by default. I assume that is intended but requires care when the trust region step is being rejected. For now I simply avoid recomputing the Gauss-Newton step in that case since that is just unnecessary work (although we used to do it for some reason). That said, this won't work for Jacobian reuse trust region methods which may be on the horizon. Just leaving this note here since this is quite subtle and can lead to hard-to-detect problems.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@frankschae @avik-pal make note.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recently saw this as well while debugging a Gauss Newton bug, do we know why this is happening? Given that alias_A = true

linu = _vec(u_gauss_newton),
p = p, reltol = cache.abstol)
cache.linsolve = linres.cache
cache.u_tmp .= -1 .* u_tmp
cache.linsolve = linres.cache
@. cache.u_gauss_newton = -1 * u_gauss_newton
end

# Compute dogleg step
dogleg!(cache)

# Compute the potentially new u
Expand All @@ -363,11 +366,10 @@
cache.H = J' * J
cache.g = J' * fu
cache.stats.njacs += 1
cache.u_gauss_newton = -1 .* (cache.H \ cache.g)
end

@unpack g, H = cache
# Compute the Newton step.
cache.u_tmp = -1 .* (H \ g)
dogleg!(cache)

# Compute the potentially new u
Expand Down Expand Up @@ -435,42 +437,42 @@

elseif radius_update_scheme === RadiusUpdateSchemes.NLsolve
# accept/reject decision
if r > cache.step_threshold # accept
take_step!(cache)
cache.loss = cache.loss_new
cache.make_new_J = true

Check warning on line 443 in src/trustRegion.jl

View check run for this annotation

Codecov / codecov/patch

src/trustRegion.jl#L440-L443

Added lines #L440 - L443 were not covered by tests
else # reject
cache.make_new_J = false

Check warning on line 445 in src/trustRegion.jl

View check run for this annotation

Codecov / codecov/patch

src/trustRegion.jl#L445

Added line #L445 was not covered by tests
end

# trust region update
if r < 1//10 # cache.shrink_threshold
cache.trust_r *= 1//2 # cache.shrink_factor
elseif r >= 9//10 # cache.expand_threshold
cache.trust_r = 2 * norm(cache.du) # cache.expand_factor * norm(cache.du)
elseif r >= 1//2 # cache.p1
cache.trust_r = max(cache.trust_r, 2*norm(cache.du)) # cache.expand_factor * norm(cache.du))

Check warning on line 454 in src/trustRegion.jl

View check run for this annotation

Codecov / codecov/patch

src/trustRegion.jl#L449-L454

Added lines #L449 - L454 were not covered by tests
end

# convergence test
if iszero(cache.fu) || cache.internalnorm(cache.fu) < cache.abstol
cache.force_stop = true

Check warning on line 459 in src/trustRegion.jl

View check run for this annotation

Codecov / codecov/patch

src/trustRegion.jl#L458-L459

Added lines #L458 - L459 were not covered by tests
end

elseif radius_update_scheme === RadiusUpdateSchemes.NW
# accept/reject decision
if r > cache.step_threshold # accept
take_step!(cache)
cache.loss = cache.loss_new
cache.make_new_J = true

Check warning on line 467 in src/trustRegion.jl

View check run for this annotation

Codecov / codecov/patch

src/trustRegion.jl#L464-L467

Added lines #L464 - L467 were not covered by tests
else # reject
cache.make_new_J = false

Check warning on line 469 in src/trustRegion.jl

View check run for this annotation

Codecov / codecov/patch

src/trustRegion.jl#L469

Added line #L469 was not covered by tests
end

if r < 1 // 4
cache.trust_r = (1 // 4) * norm(cache.du)
elseif (r > (3 // 4)) && abs(norm(cache.du) - cache.trust_r)/cache.trust_r < 1e-6
cache.trust_r = min(2*cache.trust_r, cache.max_trust_r)

Check warning on line 475 in src/trustRegion.jl

View check run for this annotation

Codecov / codecov/patch

src/trustRegion.jl#L472-L475

Added lines #L472 - L475 were not covered by tests
end

elseif radius_update_scheme === RadiusUpdateSchemes.Hei
Expand Down Expand Up @@ -566,41 +568,42 @@
end

function dogleg!(cache::TrustRegionCache{true})
@unpack u_tmp, u_cauchy, trust_r = cache
@unpack u_tmp, u_gauss_newton, u_cauchy, trust_r = cache

# Take the full Gauss-Newton step if lies within the trust region.
if norm(u_tmp) ≤ trust_r
cache.du .= u_tmp
if norm(u_gauss_newton) ≤ trust_r
cache.du .= u_gauss_newton
return
end

# Take intersection of steepest descent direction and trust region if Cauchy point lies outside of trust region
l_grad = norm(cache.g) # length of the gradient
d_cauchy = l_grad^3 / dot(cache.g, cache.H, cache.g) # distance of the cauchy point from the current iterate
if d_cauchy > trust_r
if d_cauchy >= trust_r
@. cache.du = - (trust_r/l_grad) * cache.g # step to the end of the trust region
return
end

# Take the intersection of dogled with trust region if Cauchy point lies inside the trust region
@. u_cauchy = - (d_cauchy/l_grad) * cache.g # compute Cauchy point
@. u_tmp -= u_cauchy # calf of the dogleg -- use u_tmp to avoid allocation
@. u_tmp = u_gauss_newton - u_cauchy # calf of the dogleg -- use u_tmp to avoid allocation

Check warning on line 589 in src/trustRegion.jl

View check run for this annotation

Codecov / codecov/patch

src/trustRegion.jl#L588-L589

Added lines #L588 - L589 were not covered by tests

a = dot(u_tmp, u_tmp)
b = 2*dot(u_cauchy, u_tmp)
c = d_cauchy^2 - trust_r^2
aux = max(b^2 - 4*a*c, 0.0) # technically guaranteed to be non-negative but hedging against floating point issues
τ = (-b + sqrt(aux)) / (2*a) # stepsize along dogleg to trust region boundary

Check warning on line 595 in src/trustRegion.jl

View check run for this annotation

Codecov / codecov/patch

src/trustRegion.jl#L591-L595

Added lines #L591 - L595 were not covered by tests

@. cache.du = u_cauchy + τ * u_tmp

Check warning on line 597 in src/trustRegion.jl

View check run for this annotation

Codecov / codecov/patch

src/trustRegion.jl#L597

Added line #L597 was not covered by tests
end


function dogleg!(cache::TrustRegionCache{false})
@unpack u_tmp, u_cauchy, trust_r = cache
@unpack u_tmp, u_gauss_newton, u_cauchy, trust_r = cache

# Take the full Gauss-Newton step if lies within the trust region.
if norm(u_tmp) ≤ trust_r
cache.du = deepcopy(u_tmp)
if norm(u_gauss_newton) ≤ trust_r
cache.du = deepcopy(u_gauss_newton)
return
end

Expand All @@ -614,7 +617,7 @@

# Take the intersection of dogled with trust region if Cauchy point lies inside the trust region
u_cauchy = - (d_cauchy/l_grad) * cache.g # compute Cauchy point
u_tmp -= u_cauchy # calf of the dogleg -- use u_tmp to avoid allocation
u_tmp = u_gauss_newton - u_cauchy # calf of the dogleg
a = dot(u_tmp, u_tmp)
b = 2*dot(u_cauchy, u_tmp)
c = d_cauchy^2 - trust_r^2
Expand Down
Loading