Change the stopping criterion#281
Change the stopping criterion#281MaxenceGollier wants to merge 3 commits intoJuliaSmoothOptimizers:masterfrom
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #281 +/- ##
===========================================
+ Coverage 61.53% 84.46% +22.93%
===========================================
Files 11 13 +2
Lines 1292 1590 +298
===========================================
+ Hits 795 1343 +548
+ Misses 497 247 -250 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
At least on the CIs, it looks like this does not really change the number of iterations, but we will definitely win in terms of robustness. |
|
In TRDH, if we assume that |
| - `sub_kwargs::NamedTuple = NamedTuple()`: a named tuple containing the keyword arguments to be sent to the subsolver. The solver will fail if invalid keyword arguments are provided to the subsolver. For example, if the subsolver is `R2Solver`, you can pass `sub_kwargs = (max_iter = 100, σmin = 1e-6,)`. | ||
|
|
||
| The algorithm stops either when `√(ξₖ/νₖ) < atol + rtol*√(ξ₀/ν₀) ` or `ξₖ < 0` and `√(-ξₖ/νₖ) < neg_tol` where ξₖ := f(xₖ) + h(xₖ) - φ(sₖ; xₖ) - ψ(sₖ; xₖ), and √(ξₖ/νₖ) is a stationarity measure. | ||
| The algorithm stops when `‖sᶜᵖ‖/ν < atol + rtol*‖s₀ᶜᵖ‖/ν ` where sᶜᵖ ∈ argminₛ f(xₖ) + ∇f(xₖ)ᵀs + ψ(s; xₖ) ½ ν⁻¹ ‖s‖². |
There was a problem hiding this comment.
I would keep both stopping conditions. It's not always true that one measure is smaller than the other.
There was a problem hiding this comment.
Having two stopping conditions in one solver is odd in my opinion.
I could add a keyword argument to switch between these if we really want to keep it.
Numerically, I strongly believe that taking the square root of something that can suffer from catastrophic cancellation and/or be very small is a bad idea that makes the solvers not robust. Please take a look at my comment below where I see a 25% increase in problems I can solve with this PR.
I think neg_tol is also a weird argument to have and that it should be removed from all solvers.
Same argument applies on the implementation of R2DH, we have So I suggest to use rather |
I don't see how |
|
Ok let me correct myself, we have |
To prove my point, see the benchmark results in my penalty solver: In particular the line The other lines do not see such improvements because for Can you please take a look @dpo ? |
To keep things similar, can't we use the last inequality from this comment:
to have a measure based on |
Switch from$$√\xi/\nu$$ to $$‖s_{cp}‖/\nu$$ , following multiple discussions, I think that this will really improve the robustness of the solvers.
I am not sure what I should do for R2DH and TRDH;
We compute
So if the$$‖s‖/‖dkσk‖_{\infty}$$ ?
spectral_testisfalse, what should be the measure ?I will also make the changements for LM and LMTR. We can compare in terms of number of iteration we the other CI runs.