From 4d579877fe410f81336ad9be1fe69657a0b13c3b Mon Sep 17 00:00:00 2001
From: MohamedLaghdafHABIBOULLAH
Date: Wed, 4 Feb 2026 14:35:23 -0500
Subject: [PATCH 1/3] Address editor's comments

---
 paper/paper.bib | 11 +++++++++++
 paper/paper.md  | 16 ++++++++--------
 2 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/paper/paper.bib b/paper/paper.bib
index d997f8aa..83068dd8 100644
--- a/paper/paper.bib
+++ b/paper/paper.bib
@@ -159,3 +159,14 @@ @article{ eckstein-bertsekas-1992
   publisher = {Springer},
   doi = {10.1007/BF01581204}
 }
+
+@techreport{allaire-le-digabel-orban-2025,
+  title = {An inexact modified quasi-Newton method for nonsmooth regularized optimization},
+  author = {Nathan Allaire and S{\'e}bastien Le Digabel and Dominique Orban},
+  institution = {GERAD},
+  type = {Cahier},
+  number = {G-2025-73},
+  year = {2025},
+  address = {Montr{\'e}al, Canada},
+  url = {10.13140/RG.2.2.32728.97288}
+}

diff --git a/paper/paper.md b/paper/paper.md
index 1ebb6067..8b673121 100644
--- a/paper/paper.md
+++ b/paper/paper.md
@@ -50,11 +50,11 @@ Currently, the following solvers are implemented:
 All solvers rely on first derivatives of $f$ and $c$, and optionally on their second derivatives in the form of Hessian-vector products.
 If second derivatives are not available, quasi-Newton approximations can be used.
 In addition, the proximal mapping of the nonsmooth part $h$, or adequate models thereof, must be evaluated.
-At each iteration, a step is computed by solving a subproblem of the form \eqref{eq:nlp} inexactly, in which $f$, $h$, and $c$ are replaced with appropriate models about the current iterate.
+At each iteration, a step is computed by solving a subproblem of the form \eqref{eq:nlp} inexactly, in which $f$, $h$, and $c$ are replaced with appropriate models around the current iterate.
 The solvers R2, R2DH and TRDH are particularly well suited to solve the subproblems, though they are general enough to solve \eqref{eq:nlp}.
-All solvers are implemented in place, so re-solves incur no allocations.
+All solvers are allocation-free, so re-solves incur no additional allocations.
 To illustrate our claim of extensibility, a first version of the AL solver was implemented by an external contributor.
-Furthermore, a nonsmooth penalty approach, described in [@diouane-gollier-orban-2024] is currently being developed, that relies on the library to efficiently solve the subproblems.
+Furthermore, a nonsmooth penalty approach, described in [@diouane-gollier-orban-2024], is currently being developed; it relies on the library to solve the subproblems efficiently.
@@ -85,7 +85,7 @@ Given $f$ and $h$, the companion package [RegularizedProblems.jl](https://github
 reg_nlp = RegularizedNLPModel(f, h)
 ```
 
-They can also be paired into a *Regularized Nonlinear Least-Squares Model* if $f(x) = \tfrac{1}{2} \|F(x)\|^2$ for some residual $F: \mathbb{R}^n \to \mathbb{R}^m$, in the case of the **LM** and **LMTR** solvers.
+They can also be paired into a *Regularized Nonlinear Least-Squares Model*, used by the **LM** and **LMTR** solvers, if $f(x) = \tfrac{1}{2} \|F(x)\|^2$ for some residual $F: \mathbb{R}^n \to \mathbb{R}^m$.
 ```julia
 reg_nls = RegularizedNLSModel(F, h)
 ```
@@ -96,7 +96,7 @@ This design makes for a convenient source of problem instances for benchmarking
 
 ## Support for both exact and approximate Hessian
 
-In contrast with [ProximalAlgorithms.jl](https://github.com/JuliaFirstOrder/ProximalAlgorithms.jl), [RegularizedOptimization.jl](https://github.com/JuliaSmoothOptimizers/RegularizedOptimization.jl), methods such as **R2N** and **TR** methods support exact Hessians as well as several Hessian approximations of $f$.
+In contrast to [ProximalAlgorithms.jl](https://github.com/JuliaFirstOrder/ProximalAlgorithms.jl), [RegularizedOptimization.jl](https://github.com/JuliaSmoothOptimizers/RegularizedOptimization.jl), methods such as **R2N** and **TR** support exact Hessians as well as several Hessian approximations of $f$.
 Hessian–vector products $v \mapsto Hv$ can be obtained via automatic differentiation through [ADNLPModels.jl](https://github.com/JuliaSmoothOptimizers/ADNLPModels.jl) or implemented manually.
 Limited-memory and diagonal quasi-Newton approximations can be selected from [LinearOperators.jl](https://github.com/JuliaSmoothOptimizers/LinearOperators.jl).
 This design allows solvers to exploit second-order information without explicitly forming dense or sparse Hessians, which is often expensive in time and memory, particularly at large scale.
@@ -105,7 +105,7 @@ This design allows solvers to exploit second-order informat
 # Numerical experiment
 
 We illustrate the capabilities of [RegularizedOptimization.jl](https://github.com/JuliaSmoothOptimizers/RegularizedOptimization.jl) on a Support Vector Machine (SVM) model with a $\ell_{1/2}^{1/2}$ penalty for image classification [@aravkin-baraldi-orban-2024].
-Below is a condensed example showing how to define and solve the problem, and perform a solve followed by a re-solve:
+Below is a condensed example showing how to define the problem and perform a solve followed by a re-solve:
 
 ```julia
 using LinearAlgebra, Random, ProximalOperators
@@ -129,7 +129,7 @@ solve!(solver, reg_nlp, stats; atol=1e-5, rtol=1e-5, verbose=1, sub_kwargs=(max_
 We compare **TR**, **R2N**, **LM** and **LMTR** from our library on the SVM problem.
 Experiments were performed on macOS (arm64) on an Apple M2 (8-core) machine, using Julia 1.11.7.
-The table reports the convergence status of each solver, the number of evaluations of $f$, the number of evaluations of $\nabla f$, the number of proximal operator evaluations, the elapsed time and the final objective value.
+The table reports the convergence status of each solver, the number of evaluations of $f$, the number of evaluations of $\nabla f$, the number of proximal operator evaluations, the elapsed time, and the final objective value.
 For TR and R2N, we use limited-memory SR1 Hessian approximations.
 The subproblem solver is **R2**.
@@ -144,7 +144,7 @@ Note that the final objective values differ due to the nonconvexity of the probl
 However, it requires more proximal evaluations, but these are inexpensive.
 **LMTR** and **LM** require the fewest function evaluations, but incur many Jacobian–vector products, and are the slowest in terms of time.
 
-Ongoing research aims to reduce the number of proximal evaluations.
+Ongoing research aims to reduce the number of proximal evaluations, for instance by allowing inexact proximal computations [@allaire-le-digabel-orban-2025].
 
 # Acknowledgements

From 61189afa136bbbb5eea284d25ccc788ad89c4d95 Mon Sep 17 00:00:00 2001
From: Mohamed Laghdaf <81633807+MohamedLaghdafHABIBOULLAH@users.noreply.github.com>
Date: Wed, 4 Feb 2026 16:55:28 -0500
Subject: [PATCH 2/3] Update paper/paper.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---
 paper/paper.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/paper/paper.md b/paper/paper.md
index 8b673121..43d04a87 100644
--- a/paper/paper.md
+++ b/paper/paper.md
@@ -96,7 +96,7 @@ This design makes for a convenient source of problem instances for benchmarking
 
 ## Support for both exact and approximate Hessian
 
-In contrast to [ProximalAlgorithms.jl](https://github.com/JuliaFirstOrder/ProximalAlgorithms.jl), [RegularizedOptimization.jl](https://github.com/JuliaSmoothOptimizers/RegularizedOptimization.jl), methods such as **R2N** and **TR** support exact Hessians as well as several Hessian approximations of $f$.
+In contrast to [ProximalAlgorithms.jl](https://github.com/JuliaFirstOrder/ProximalAlgorithms.jl), [RegularizedOptimization.jl](https://github.com/JuliaSmoothOptimizers/RegularizedOptimization.jl) methods such as **R2N** and **TR** support exact Hessians as well as several Hessian approximations of $f$.
 Hessian–vector products $v \mapsto Hv$ can be obtained via automatic differentiation through [ADNLPModels.jl](https://github.com/JuliaSmoothOptimizers/ADNLPModels.jl) or implemented manually.
 Limited-memory and diagonal quasi-Newton approximations can be selected from [LinearOperators.jl](https://github.com/JuliaSmoothOptimizers/LinearOperators.jl).
 This design allows solvers to exploit second-order information without explicitly forming dense or sparse Hessians, which is often expensive in time and memory, particularly at large scale.

From ffcfaea5e7bd0b9a0b902f3d77e93114c95f5228 Mon Sep 17 00:00:00 2001
From: Mohamed Laghdaf <81633807+MohamedLaghdafHABIBOULLAH@users.noreply.github.com>
Date: Wed, 4 Feb 2026 16:55:34 -0500
Subject: [PATCH 3/3] Update paper/paper.bib

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---
 paper/paper.bib | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/paper/paper.bib b/paper/paper.bib
index 83068dd8..70230d87 100644
--- a/paper/paper.bib
+++ b/paper/paper.bib
@@ -168,5 +168,5 @@ @techreport{allaire-le-digabel-orban-2025
   number = {G-2025-73},
   year = {2025},
   address = {Montr{\'e}al, Canada},
-  url = {10.13140/RG.2.2.32728.97288}
+  doi = {10.13140/RG.2.2.32728.97288}
 }
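Editorial note, outside the patches above: the condensed example referenced in the modified paper.md omits how `solver` and `stats` are constructed before `solve!` is called. The sketch below is a minimal illustration of how the pieces named in the patches (a smooth NLPModel, a nonsmooth term from ProximalOperators.jl, a `RegularizedNLPModel`, and an in-place solve followed by a re-solve) could fit together on a toy problem. Only `RegularizedNLPModel` and the `solve!` keyword arguments are taken from the patched text; the constructor names `R2Solver` and `GenericExecutionStats`, and the exact `using` list, are assumptions based on the usual JuliaSmoothOptimizers/SolverCore conventions and are not verified against the library.

```julia
# Hedged sketch only: RegularizedNLPModel and the solve! keyword arguments appear
# in the patched paper.md; R2Solver and GenericExecutionStats are assumed names
# following the SolverCore.jl convention and may differ in the actual package.
using ADNLPModels, ProximalOperators, RegularizedProblems, RegularizedOptimization
using SolverCore

# Smooth part f: a small quadratic modeled with ADNLPModels.jl.
f = ADNLPModel(x -> 0.5 * sum((x .- 1.0) .^ 2), zeros(5))

# Nonsmooth part h: an l1 penalty from ProximalOperators.jl.
h = NormL1(0.1)

# Pair them, as in the paper: reg_nlp = RegularizedNLPModel(f, h).
reg_nlp = RegularizedNLPModel(f, h)

# Assumed constructors: allocate the solver and stats objects once up front...
solver = R2Solver(reg_nlp)
stats = GenericExecutionStats(f)

# ...then solve and re-solve in place, mirroring the allocation-free re-solve
# claim made in patch 1 and the solve!(solver, reg_nlp, stats; ...) call shown
# in the condensed SVM example.
solve!(solver, reg_nlp, stats; atol = 1e-5, rtol = 1e-5, verbose = 1)
solve!(solver, reg_nlp, stats; atol = 1e-7, rtol = 1e-7, verbose = 0)
```

If the actual constructor names differ, only those two lines should need to change; the point being illustrated is the solve/re-solve pattern on a preallocated solver and stats pair.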