
Globally convergent coderivative-based generalized Newton methods in nonsmooth optimization

  • Full Length Paper
  • Series A
  • Mathematical Programming

Abstract

This paper proposes and justifies two globally convergent Newton-type methods for solving unconstrained and constrained problems of nonsmooth optimization by using tools of variational analysis and generalized differentiation. Both methods are coderivative-based and employ generalized Hessians (coderivatives of subgradient mappings) associated with objective functions that are either of class \({{\mathcal {C}}}^{1,1}\) or are represented in the form of convex composite optimization, where one of the terms may be extended-real-valued. The proposed globally convergent algorithms are of two types. The first extends the damped Newton method and requires positive-definiteness of the generalized Hessians for its well-posedness and efficient performance, while the second is of regularized Newton type and remains well-defined when the generalized Hessians are merely positive-semidefinite. The obtained convergence rates for both methods are at least linear and become superlinear under the semismooth\(^*\) property of subgradient mappings. Problems of convex composite optimization are investigated with and without the strong convexity assumption on the smooth parts of objective functions by implementing the machinery of forward–backward envelopes. Numerical experiments are conducted for Lasso problems and for box-constrained quadratic programs, with performance comparisons between the new algorithms and some other first-order and second-order methods that are highly recognized in nonsmooth optimization.
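
The following Python sketch illustrates the generic structure of a regularized Newton step of the kind described above. It is a minimal illustration under standard smoothness assumptions, not the paper's coderivative-based algorithm, and the names `regularized_newton_step`, `grad`, `hess`, and `lam` are ours.

```python
import numpy as np

def regularized_newton_step(grad, hess, x, lam=1e-6):
    """One generic regularized Newton step: solve (H + lam*I) d = -g.

    Illustrative sketch only; the paper's methods replace H by a
    coderivative-based generalized Hessian of a nonsmooth objective and
    couple the step with a backtracking line search (cf. Lemma 3 below).
    """
    g = grad(x)
    H = hess(x)
    # lam > 0 keeps the system solvable when H is only positive-semidefinite;
    # lam = 0 corresponds to a damped Newton step for positive-definite H.
    return np.linalg.solve(H + lam * np.eye(x.size), -g)

# Toy usage on a strongly convex quadratic f(x) = 0.5*x'Ax - b'x:
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
grad = lambda x: A @ x - b
hess = lambda x: A
x = np.zeros(2)
for _ in range(20):
    x = x + regularized_newton_step(grad, hess, x)
print(np.allclose(A @ x, b))  # True: the iterates reach the stationary point
```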


Notes

  1. https://web.stanford.edu/~boyd/papers/admm/lasso/lasso.html.

  2. https://github.com/he9180/FISTA-lasso.

  3. https://www.polyu.edu.hk/ama/profile/dfsun/.

References

  1. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd edn. Springer, New York (2017)


  2. Beck, A.: Introduction to Nonlinear Optimization: Theory, Algorithms, and Applications with MATLAB. SIAM, Philadelphia, PA (2014)


  3. Beck, A.: First-Order Methods in Optimization. SIAM, Philadelphia, PA (2017)


  4. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2, 183–202 (2009)


  5. Becker, S., Fadili, M.J.: A quasi-Newton proximal splitting method. Adv. Neural Inform. Process. Syst. 25, 2618–2626 (2012)


  6. Bertsekas, D.P.: Nonlinear Programming, 3rd edn. Athena Scientific, Belmont, MA (2016)


  7. Bonnans, J.F.: Local analysis of Newton-type methods for variational inequalities and nonlinear programming. Appl. Math. Optim. 29, 161–186 (1994)


  8. Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learning 3, 1–122 (2010)


  9. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)


  10. Chieu, N.H., Chuong, T.D., Yao, J.-C., Yen, N.D.: Characterizing convexity of a function by its Fréchet and limiting second-order subdifferentials. Set-Valued Var. Anal. 19, 75–96 (2011)


  11. Chieu, N.H., Lee, G.M., Yen, N.D.: Second-order subdifferentials and optimality conditions for \({{\cal{C} }}^1\)-smooth optimization problems. Appl. Anal. Optim. 1, 461–476 (2017)


  12. Chieu, N.H., Hien, L.V., Nghia, T.T.A.: Characterization of tilt stability via subgradient graphical derivative with applications to nonlinear programming. SIAM J. Optim. 28, 2246–2273 (2018)


  13. Colombo, G., Henrion, R., Hoang, N.D., Mordukhovich, B.S.: Optimal control of the sweeping process over polyhedral controlled sets. J. Diff. Eqs. 260, 3397–3447 (2016)


  14. Combettes, P.L., Pesquet, J.-C.: Proximal splitting methods in signal processing. In: Bauschke, H.H., et al. (eds.) Fixed-Point Algorithms for Inverse Problems in Science and Engineering, pp. 185–212. Springer, New York (2011)


  15. Dan, H., Yamashita, N., Fukushima, M.: Convergence properties of the inexact Levenberg–Marquardt method under local error bound conditions. Optim. Meth. Softw. 17, 605–626 (2002)


  16. Dennis, J.E., Moré, J.J.: Quasi-Newton methods, motivation and theory. SIAM Rev. 19, 46–89 (1977)


  17. Ding, C., Sun, D., Ye, J.J.: First-order optimality conditions for mathematical programs with semidefinite cone complementarity constraints. Math. Program. 147, 539–579 (2014)


  18. Dias, S., Smirnov, G.: On the Newton method for set-valued maps. Nonlinear Anal. TMA 75, 1219–1230 (2012)


  19. Dontchev, A.L., Rockafellar, R.T.: Characterizations of strong regularity for variational inequalities over polyhedral convex sets. SIAM J. Optim. 6, 1087–1105 (1996)


  20. Dontchev, A.L., Rockafellar, R.T.: Implicit Functions and Solution Mappings: A View from Variational Analysis, 2nd edn. Springer, New York (2014)


  21. Drusvyatskiy, D., Lewis, A.S.: Tilt stability, uniform quadratic growth, and strong metric regularity of the subdifferential. SIAM J. Optim. 23, 256–267 (2013)


  22. Drusvyatskiy, D., Mordukhovich, B.S., Nghia, T.T.A.: Second-order growth, tilt stability, and metric regularity of the subdifferential. J. Convex Anal. 21, 1165–1192 (2014)


  23. Emich, K., Henrion, R.: A simple formula for the second-order subdifferential of maximum functions. Vietnam J. Math. 42, 467–478 (2014)


  24. Efron, B., Hastie, T., Johnstone, I., Tibshirani, R.: Least angle regression. Ann. Statist. 32, 407–499 (2004)


  25. Facchinei, F.: Minimization of SC1 functions and the Maratos effect. Oper. Res. Lett. 17, 131–137 (1995)


  26. Facchinei, F., Pang, J.-S.: Finite-Dimensional Variational Inequalities and Complementarity Problems, vol. II. Springer, New York (2003)


  27. Friedlander, M.P., Goodwin, A., Hoheisel, T.: From perspective maps to epigraphical projections. Math. Oper. Res. (2022). https://doi.org/10.1287/moor.2022.1317


  28. Gabay, D., Mercier, B.: A dual algorithm for the solution of nonlinear variational problems via finite element approximations. Comput. Math. Appl. 2, 17–40 (1976)


  29. Glowinski, R., Marroco, A.: Sur l’approximation, par éléments finis d’ordre un, et la résolution, par pénalisation-dualité, d’une classe de problèmes de Dirichlet non linéaires. Revue Française d’Automatique, Informatique et Recherche Opérationnelle 9, 41–76 (1975)


  30. Gfrerer, H.: On directional metric regularity, subregularity and optimality conditions for nonsmooth mathematical programs. Set-Valued Var. Anal. 21, 151–176 (2013)


  31. Gfrerer, H., Mordukhovich, B.S.: Complete characterization of tilt stability in nonlinear programming under weakest qualification conditions. SIAM J. Optim. 25, 2081–2119 (2015)


  32. Gfrerer, H., Outrata, J.V.: On a semismooth\(^*\) Newton method for solving generalized equations. SIAM J. Optim. 31, 489–517 (2021)


  33. Ginchev, I., Mordukhovich, B.S.: On directionally dependent subdifferentials. C. R. Acad. Bulg. Sci. 64, 497–508 (2011)


  34. Hang, N.T.V., Mordukhovich, B.S., Sarabi, M.E.: Augmented Lagrangian method for second-order conic programs under second-order sufficiency. J. Glob. Optim. 82, 51–81 (2022)


  35. Henrion, R., Mordukhovich, B.S., Nam, N.M.: Second-order analysis of polyhedral systems in finite and infinite dimensions with applications to robust stability of variational inequalities. SIAM J. Optim. 20, 2199–2227 (2010)


  36. Henrion, R., Römisch, W.: On \(M\)-stationary points for a stochastic equilibrium problem under equilibrium constraints in electricity spot market modeling. Appl. Math. 52, 473–494 (2007)


  37. Henrion, R., Outrata, J., Surowiec, T.: On the co-derivative of normal cone mappings to inequality systems. Nonlinear Anal. 71, 1213–1226 (2009)


  38. Hestenes, M.R.: Multiplier and gradient methods. J. Optim. Theory Appl. 4, 303–320 (1969)


  39. Hintermüller, M., Ito, K., Kunisch, K.: The primal-dual active set strategy as a semismooth Newton method. SIAM J. Optim. 13, 865–888 (2002)


  40. Hiriart-Urruty, J.-B., Strodiot, J.-J., Nguyen, V.H.: Generalized Hessian matrix and second-order optimality conditions for problems with \(\cal{C} ^{1,1}\) data. Appl. Math. Optim. 11, 43–56 (1984)


  41. Ho, C.H., Lin, C.J.: Large-scale linear support vector regression. J. Mach. Learn. Res. 13, 3323–3348 (2012)


  42. Hoheisel, T., Kanzow, C., Mordukhovich, B.S., Phan, H.M.: Generalized Newton’s methods for nonsmooth equations based on graphical derivatives. Nonlinear Anal. 75, 1324–1340 (2012); Erratum in Nonlinear Anal. 86, 157–158 (2013)

  43. Hsieh, C.J., Chang, K.W., Lin, C.J.: A dual coordinate descent method for large-scale linear SVM. Proceedings 25th International Conference on Machine Learning, pp. 408–415. Helsinki, Finland (2008)

  44. Izmailov, A.F., Solodov, M.V.: Newton-Type Methods for Optimization and Variational Problems. Springer, New York (2014)


  45. Izmailov, A.F., Solodov, M.V., Uskov, E.T.: Globalizing stabilized sequential quadratic programming method by smooth primal-dual exact penalty function. J. Optim. Theor. Appl. 169, 1–31 (2016)


  46. Khanh, P.D., Mordukhovich, B.S., Phat, V.T.: A generalized Newton method for subgradient systems. Math. Oper. Res. (2022). https://doi.org/10.1287/moor.2022.1320


  47. Klatte, D., Kummer, B.: Nonsmooth Equations in Optimization. Regularity, Calculus, Methods and Applications. Kluwer Academic Publishers, Dordrecht (2002)


  48. Kummer, B.: Newton’s method for non-differentiable functions. In: Guddat et al. (eds) Advances in Mathematical Optimization, pp. 114–124. Akademie-Verlag, Berlin (1988)

  49. Lee, J.D., Sun, Y., Saunders, M.A.: Proximal Newton-type methods for minimizing composite functions. SIAM J. Optim. 24, 1420–1443 (2014)


  50. Li, D.H., Fukushima, M., Qi, L., Yamashita, N.: Regularized Newton methods for convex minimization problems with singular solutions. Comput. Optim. Appl. 28, 131–147 (2004)


  51. Li, X., Sun, D., Toh, K.-C.: A highly efficient semismooth Newton augmented Lagrangian method for solving Lasso problems. SIAM J. Optim. 28, 433–458 (2018)


  52. Lions, P.-L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16, 964–979 (1979)


  53. Meng, F., Sun, D., Zhao, Z.: Semismoothness of solutions to generalized equations and the Moreau–Yosida regularization. Math. Program. 104, 561–581 (2005)


  54. Mohammadi, A., Mordukhovich, B.S., Sarabi, M.E.: Variational analysis of composite models with applications to continuous optimization. Math. Oper. Res. 47, 397–426 (2022)


  55. Mohammadi, A., Mordukhovich, B.S., Sarabi, M.E.: Parabolic regularity in geometric variational analysis. Trans. Amer. Math. Soc. 374, 1711–1763 (2021)


  56. Mohammadi, A., Sarabi, M.E.: Twice epi-differentiability of extended-real-valued functions with applications in composite optimization. SIAM J. Optim. 30, 2379–2409 (2020)


  57. Mordukhovich, B.S.: Sensitivity analysis in nonsmooth optimization. In: Field, D.A., Komkov, V.(eds) Theoretical Aspects of Industrial Design, pp. 32–46. SIAM Proc. Appl. Math. 58. Philadelphia, PA (1992)

  58. Mordukhovich, B.S.: Complete characterizations of openness, metric regularity, and Lipschitzian properties of multifunctions. Trans. Am. Math. Soc. 340, 1–35 (1993)


  59. Mordukhovich, B.S.: Variational Analysis and Generalized Differentiation, I: Basic Theory, II: Applications. Springer, Berlin (2006)


  60. Mordukhovich, B.S.: Variational Analysis and Applications. Springer, Cham (2018)


  61. Mordukhovich, B.S., Nghia, T.T.A.: Second-order characterizations of tilt stability with applications to nonlinear programming. Math. Program. 149, 83–104 (2015)


  62. Mordukhovich, B.S., Nghia, T.T.A.: Local monotonicity and full stability of parametric variational systems. SIAM J. Optim. 26, 1032–1059 (2016)


  63. Mordukhovich, B.S., Outrata, J.V.: On second-order subdifferentials and their applications. SIAM J. Optim. 12, 139–169 (2001)


  64. Mordukhovich, B.S., Outrata, J.V., Sarabi, M.E.: Full stability of local optimal solutions in second-order cone programming. SIAM J. Optim. 14, 1581–1613 (2014)


  65. Mordukhovich, B.S., Rockafellar, R.T.: Second-order subdifferential calculus with applications to tilt stability in optimization. SIAM J. Optim. 22, 953–986 (2012)


  66. Mordukhovich, B.S., Rockafellar, R.T., Sarabi, M.E.: Characterizations of full stability in constrained optimization. SIAM J. Optim. 23, 1810–1849 (2013)


  67. Mordukhovich, B.S., Sarabi, M.E.: Generalized differentiation of piecewise linear functions in second-order variational analysis. Nonlinear Anal. 132, 240–273 (2016)


  68. Mordukhovich, B.S., Sarabi, M.E.: Generalized Newton algorithms for tilt-stable minimizers in nonsmooth optimization. SIAM J. Optim. 31, 1184–1214 (2021)


  69. Mordukhovich, B.S., Yuan, X., Zheng, S., Zhang, J.: A globally convergent proximal Newton-type method in nonsmooth convex optimization. Math. Program. 198, 899–936 (2023)


  70. Nam, N.M.: Coderivatives of normal cone mappings and Lipschitzian stability of parametric variational inequalities. Nonlinear Anal. 73, 2271–2282 (2010)


  71. Nesterov, Yu.: Lectures on Convex Optimization, 2nd edn. Springer, Cham (2018)


  72. Nocedal, J., Wright, S.: Numerical Optimization. Springer, New York (2006)


  73. Outrata, J.V., Sun, D.: On the coderivative of the projection operator onto the second-order cone. Set-Valued Anal. 16, 999–1014 (2008)


  74. Pang, J.S.: Newton’s method for B-differentiable equations. Math. Oper. Res. 15, 311–341 (1990)


  75. Pang, J.S., Qi, L.: A globally convergent Newton method for convex SC1 minimization problems. J. Optim. Theory Appl. 85, 633–648 (1995)


  76. Patrinos, P., Bemporad, A.: Proximal Newton methods for convex composite optimization. In: IEEE Conference on Decision and Control, pp. 2358–2363 (2013)

  77. Patrinos, P., Stella, L., Bemporad, A.: Forward-backward truncated Newton methods for convex composite optimization. arXiv:1402.6655 (2014)

  78. Pelckmans, K., De Brabanter, J., De Moor, B., Suykens, J.A.K.: Convex clustering shrinkage. In: PASCAL Workshop on Statistics and Optimization of Clustering, pp. 1–6. London (2005)

  79. Poliquin, R.A., Rockafellar, R.T.: Tilt stability of a local minimum. SIAM J. Optim. 8, 287–299 (1998)


  80. Polyak, B.T.: Introduction to Optimization. Optimization Software, New York (1987)


  81. Powell, M.J.D.: A method for nonlinear constraints in minimization problems. In: Fletcher, R. (ed.) Optimization, pp. 283–298. Academic Press, New York (1969)


  82. Qi, L.: Convergence analysis of some algorithms for solving nonsmooth equations. Math. Oper. Res. 18, 227–244 (1993)


  83. Qi, L., Sun, J.: A nonsmooth version of Newton’s method. Math. Program. 58, 353–367 (1993)


  84. Qui, N.T.: Generalized differentiation of a class of normal cone operators. J. Optim. Theory Appl. 161, 398–429 (2014)


  85. Robinson, S.M.: Newton’s method for a class of nonsmooth functions. Set-Valued Anal. 2, 291–305 (1994)


  86. Rockafellar, R.T.: Augmented Lagrangian multiplier functions and duality in nonconvex programming. SIAM J. Control 12, 268–285 (1974)


  87. Rockafellar, R.T.: Augmented Lagrangians and hidden convexity in sufficient conditions for local optimality. Math. Program. 198, 159–194 (2023)


  88. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis. Springer, Berlin (1998)


  89. She, Y.: Sparse regression with exact clustering. Elect. J. Stat. 4, 1055–1096 (2010)


  90. Stella, L., Themelis, A., Patrinos, P.: Forward-backward quasi-Newton methods for nonsmooth optimization problems. Comput. Optim. Appl. 67, 443–487 (2017)


  91. Stella, L., Themelis, A., Patrinos, P.: Forward-backward envelope for the sum of two nonconvex functions: further properties and nonmonotone linesearch algorithms. SIAM J. Optim. 28, 2274–2303 (2018)


  92. Tibshirani, R.: Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. 58, 267–288 (1996)


  93. Ulbrich, M.: Semismooth Newton Methods for Variational Inequalities and Constrained Optimization Problems in Function Spaces. SIAM, Philadelphia, PA (2011)


  94. Yamashita, N., Fukushima, M.: On the rate of convergence of the Levenberg–Marquardt method. In: Alefeld, G., Chen, X. (eds.) Topics in Numerical Analysis, vol. 15, pp. 239–249. Springer, Vienna (2001)


  95. Yao, J.-C., Yen, N.D.: Coderivative calculation related to a parametric affine variational inequality. Part 1: Basic calculation. Acta Math. Vietnam. 34, 157–172 (2009)



Acknowledgements

The authors are very grateful to three anonymous referees for their helpful remarks and suggestions, which allowed us to significantly improve the original presentation.

Author information


Correspondence to Pham Duy Khanh or Boris S. Mordukhovich.

Additional information


Pham Duy Khanh: Research of this author is funded by the Ministry of Education and Training Research Funding under grant B2023-SPS-02. Boris S. Mordukhovich: Research of this author was partly supported by the US National Science Foundation under Grants DMS-1808978 and DMS-2204519, by the US Air Force Office of Scientific Research under Grant #15RT0462, and by the Australian Research Council under Discovery Project DP-190100555. Vo Thanh Phat: Research of this author was partly supported by the US National Science Foundation under Grants DMS-1808978 and DMS-2204519, and by the US Air Force Office of Scientific Research under Grant #15RT0462. Dat Ba Tran: Research of this author was partly supported by the US National Science Foundation under Grants DMS-1808978 and DMS-2204519.

Appendix: Some technical lemmas

This section contains four technical lemmas used in the text.

The first lemma is a local version of [44, Lemma 2.20]. The proof of this result is similar to the original one, and thus it is omitted.

Lemma 3

Let \(\Omega \subset \mathrm{I\!R}^n\) be an open set, and let \(\varphi :\Omega \rightarrow \mathrm{I\!R}\) be a continuously differentiable function such that \(\nabla \varphi \) is Lipschitz continuous with modulus \(L >0\). Then for any \(\sigma \in (0,1)\), \(x \in \Omega \), and \(d \in \mathrm{I\!R}^n\) satisfying \(\langle \nabla \varphi (x),d\rangle <0\), the following inequality

$$\begin{aligned} \varphi (x+\tau d)\le \varphi (x) +\sigma \tau \langle \nabla \varphi (x),d\rangle \end{aligned}$$
(7.1)

holds whenever \(\tau \in (0,\overline{\tau }]\) and \(x+\tau d \in \Omega \), where

$$\begin{aligned} \overline{\tau }:= \frac{2(\sigma -1)\langle \nabla \varphi (x),d\rangle }{L\Vert d\Vert ^2}>0. \end{aligned}$$
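
As an illustration of how Lemma 3 is used, the following Python sketch implements a standard Armijo backtracking scheme built on inequality (7.1); it is a generic sketch under the lemma's assumptions rather than the paper's specific procedure, and the names `backtrack`, `beta`, and `tau0` are ours.

```python
import numpy as np

def backtrack(phi, grad_phi, x, d, sigma=0.25, beta=0.5, tau0=1.0, max_iter=50):
    """Armijo backtracking: shrink tau until inequality (7.1) holds.

    By Lemma 3, (7.1) is accepted once tau drops below
    tau_bar = 2*(sigma - 1)*<grad phi(x), d> / (L*||d||^2) > 0,
    so the loop terminates after finitely many reductions whenever d is a
    descent direction and grad phi is Lipschitz around x.
    """
    slope = float(np.dot(grad_phi(x), d))
    assert slope < 0.0, "d must be a descent direction"
    tau = tau0
    for _ in range(max_iter):
        if phi(x + tau * d) <= phi(x) + sigma * tau * slope:  # inequality (7.1)
            return tau
        tau *= beta
    return tau

# Toy usage: phi(x) = ||x||^2, whose gradient is Lipschitz with L = 2.
phi = lambda x: float(np.dot(x, x))
grad_phi = lambda x: 2.0 * x
x0 = np.array([1.0, -2.0])
tau = backtrack(phi, grad_phi, x0, d=-grad_phi(x0))  # accepted step size
```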

The next lemma provides conditions for the R-linear and Q-linear convergence of sequences.

Lemma 4

(estimates for convergence rates) Let \(\{\alpha _k\}, \{\beta _k \}\), and \(\{\gamma _k \}\) be sequences of positive numbers. Assume that there exist numbers \(c_i>0\), \(i=1,2,3\), and \(k_0\in \mathrm{I\!N}\) such that for all \(k \ge k_0\) we have the estimates:

  1. (i)

    \(\alpha _k - \alpha _{k+1} \ge c_1 \beta _k^2\).

  2. (ii)

    \(\beta _k \ge c_2 \gamma _k\).

  3. (iii)

    \(c_3 \gamma _k^2 \ge \alpha _k\).

Then the sequence \(\{\alpha _k \}\) Q-linearly converges to zero, and the sequences \(\{\beta _k \}\) and \(\{\gamma _k \}\) R-linearly converge to zero as \(k\rightarrow \infty \).

Proof

Combining (i), (ii), and (iii) yields the inequalities

$$\begin{aligned} \alpha _{k} - \alpha _{k+1} \ge c_1 \beta _k^2 \ge c_1 c_2^2 \gamma _k^2 \ge \frac{c_1c_2^2}{c_3}\alpha _{k} \quad \text {for all }\; k \ge k_0, \end{aligned}$$

which imply that \(\alpha _{k+1}\le q \alpha _k\), where \(q:= 1- (c_1c_2^2)/c_3\). Observe that \(q\in (0,1)\): indeed, \(c_1c_2^2/c_3>0\), while \(\alpha _{k+1}>0\) together with the displayed inequalities gives us \((c_1c_2^2/c_3)\alpha _k\le \alpha _k-\alpha _{k+1}<\alpha _k\), and hence \(c_1c_2^2/c_3<1\). This verifies that the sequence \(\{\alpha _{k}\}\) Q-linearly converges to zero. Using the latter and the assumed condition (i) ensures that

$$\begin{aligned} \beta _k^2 \le \frac{1}{c_1}(\alpha _k - \alpha _{k+1}) \le \frac{1}{c_1}\alpha _k \le \frac{1}{c_1}q\alpha _{k-1} \le \ldots \le \frac{1}{c_1} q^k \alpha _0 \quad \text {for all }\; k\ge k_0, \end{aligned}$$

which tells us that \(\beta _k \le c\mu ^k\), where \(c:= \sqrt{\alpha _0/c_1}\) and \(\mu :=\sqrt{q}\). This justifies the R-linear convergence of the sequence \(\{\beta _k \}\) to zero. Furthermore, it easily follows from (ii) that the sequence \(\{\gamma _k \}\) R-linearly converges to zero, and thus we are done with the proof. \(\square \)
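
The estimates of Lemma 4 are easy to check numerically. The following Python snippet constructs hypothetical sequences satisfying (i)–(iii) (the choice \(\gamma _k=\rho ^k\), \(\beta _k=\gamma _k\), \(\alpha _k=\gamma _k^2\) and all constants are assumptions made purely for this check) and verifies both the Q-linear factor \(q=1-c_1c_2^2/c_3\) for \(\{\alpha _k\}\) and the R-linear bound \(\beta _k\le \sqrt{\alpha _0/c_1}\,(\sqrt{q})^k\) obtained in the proof.

```python
import numpy as np

# Hypothetical sequences satisfying (i)-(iii) of Lemma 4 with k_0 = 0:
# gamma_k = rho^k, beta_k = gamma_k (so c2 = 1), alpha_k = gamma_k^2 (so c3 = 1),
# and alpha_k - alpha_{k+1} = (1 - rho^2) * beta_k^2 (so c1 = 1 - rho^2).
rho = 0.7
k = np.arange(30)
gamma = rho ** k
beta = gamma.copy()
alpha = gamma ** 2

c1, c2, c3 = 1.0 - rho ** 2, 1.0, 1.0
q = 1.0 - c1 * c2 ** 2 / c3          # Q-linear factor predicted by the proof

assert np.allclose(alpha[:-1] - alpha[1:], c1 * beta[:-1] ** 2)   # condition (i)
assert np.all(alpha[1:] <= q * alpha[:-1] + 1e-15)                # Q-linear decay
assert np.all(beta <= np.sqrt(alpha[0] / c1) * np.sqrt(q) ** k)   # R-linear bound
print(q)  # ~0.49 = rho**2, the exact ratio alpha_{k+1}/alpha_k in this example
```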

Now we obtain a useful result on the semismooth\(^*\) property of compositions.

Lemma 5

(semismooth\(^*\) property of composition mappings) Let \(A \in \mathrm{I\!R}^{n\times n}\) be a symmetric nonsingular matrix, \(b \in \mathrm{I\!R}^n\), \({\bar{x}}\in \mathrm{I\!R}^n\), and \(f:\mathrm{I\!R}^n \rightarrow \mathrm{I\!R}^n\) be continuous and semismooth\(^*\) at \({\bar{y}}:=A{\bar{x}}+b\). Then the mapping \(g: \mathrm{I\!R}^n \rightarrow \mathrm{I\!R}^n\) defined by \(g(x):= f(Ax+b)\) is semismooth\(^*\) at \({\bar{x}}\).

Proof

Using the coderivative chain rule from [59, Theorem 1.66], we get

$$\begin{aligned} D^*g(x)(w)= A^* D^*f(Ax+b)(w) \quad \text {for all }\; w \in \mathrm{I\!R}^n. \end{aligned}$$
(7.2)

Denote \(\mu := \sqrt{\textrm{max}\{1,\Vert A\Vert ^2 \}\cdot \textrm{max}\{1,\Vert A^{-1}\Vert ^2\}}>0\). Picking any \(\varepsilon >0\) and employing the semismooth\(^*\) property of f at \({\bar{y}}\), we find \(\delta >0\) such that

$$\begin{aligned} |\langle x^*, y-{\bar{y}}\rangle - \langle y^*, f(y) - f({\bar{y}})\rangle |\le \frac{\varepsilon }{\mu } \big \Vert (y-{\bar{y}},f(y)-f({\bar{y}}))\big \Vert \cdot \Vert (x^*,y^*)\Vert \end{aligned}$$
(7.3)

for all \(y \in {\mathbb {B}}_\delta ({\bar{y}})\) and all \((x^*,y^*) \in \textrm{gph}\,D^*f(y)\). Denoting \(r:= \delta / \Vert A\Vert >0\) gives us \(y:=Ax+b \in {\mathbb {B}}_\delta ({\bar{y}})\) whenever \(x \in {\mathbb {B}}_r({\bar{x}})\). Picking now \(x \in {\mathbb {B}}_r({\bar{x}})\) and \((z^*,w^*) \in \textrm{gph}\,D^*g(x)\), we get \((A^{-1}z^*,w^*)\in \textrm{gph}\,D^*f(Ax+b)\) due to (7.2). It follows from (7.3) that

$$\begin{aligned} |\langle z^*, x-{\bar{x}}\rangle - \langle w^*,g(x) - g({\bar{x}})\rangle |&= |\langle A^{-1}z^*, y-{\bar{y}}\rangle - \langle w^*, f(y) - f({\bar{y}})\rangle | \\&\le \frac{\varepsilon }{\mu }\big \Vert (y-{\bar{y}},f(y)-f({\bar{y}}))\big \Vert \cdot \big \Vert (A^{-1}z^*,w^*)\big \Vert \\&= \frac{\varepsilon }{\mu }\big \Vert (Ax-A{\bar{x}},g(x)-g({\bar{x}}))\big \Vert \cdot \big \Vert (A^{-1}z^*,w^*)\big \Vert \\&\le \varepsilon \Vert (x-{\bar{x}},g(x)-g({\bar{x}}))\Vert \cdot \Vert (z^*,w^*)\Vert , \end{aligned}$$

which verifies the semismooth\(^*\) property of g at \({\bar{x}}\). \(\square \)
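
For the reader's convenience, here are the elementary norm estimates behind the choice of the factor \(\mu \) in the last step of the proof (a remark added for illustration): for any \(u,v,z^*,w^*\in \mathrm{I\!R}^n\) one has

$$\begin{aligned} \Vert (Au,v)\Vert ^2=\Vert Au\Vert ^2+\Vert v\Vert ^2\le \textrm{max}\{1,\Vert A\Vert ^2\}\,\Vert (u,v)\Vert ^2 \quad \text {and}\quad \Vert (A^{-1}z^*,w^*)\Vert ^2\le \textrm{max}\{1,\Vert A^{-1}\Vert ^2\}\,\Vert (z^*,w^*)\Vert ^2, \end{aligned}$$

and multiplying the square roots of these estimates yields \(\Vert (Au,v)\Vert \cdot \Vert (A^{-1}z^*,w^*)\Vert \le \mu \,\Vert (u,v)\Vert \cdot \Vert (z^*,w^*)\Vert \), which is applied above with \(u:=x-{\bar{x}}\) and \(v:=g(x)-g({\bar{x}})\).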

The final lemma establishes tilt stability of strongly convex functions at stationary points.

Lemma 6

(strong convexity and tilt stability) Let \(\varphi :\mathrm{I\!R}^n\rightarrow \overline{\mathrm{I\!R}}\) be an l.s.c. and strongly convex function with modulus \(\kappa >0\), and let \({\bar{x}}\in \textrm{dom}\,\varphi \) be such that \(0\in \partial \varphi ({\bar{x}})\). Then \({\bar{x}}\) is a tilt-stable local minimizer of \(\varphi \) with modulus \(\kappa ^{-1}\).

Proof

By the second-order characterization of strongly convex functions [10, Theorem 5.1], we have

$$\begin{aligned}\langle z, w\rangle \ge \kappa \Vert w\Vert ^2 \quad \text {for all }\; z \in \partial ^2\varphi (x,y)(w),\; (x,y) \in \textrm{gph}\,\partial \varphi ,\;\text{ and } \;w \in \mathrm{I\!R}^n.\end{aligned}$$

This implies in turn that \({\bar{x}}\) is a tilt-stable local minimizer of \(\varphi \) with modulus \(\kappa ^{-1}\) by the second-order characterization of tilt stability taken from [61, Theorem 3.5]. \(\square \)
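
To illustrate the modulus in Lemma 6, consider the simplest example (added here for illustration): for \(\varphi (x):=\frac{\kappa }{2}\Vert x\Vert ^2\) and \({\bar{x}}=0\), the tilted problems

$$\begin{aligned} \min _{x\in \mathrm{I\!R}^n}\big \{\varphi (x)-\langle v,x\rangle \big \} \end{aligned}$$

have the unique solutions \(x(v)=v/\kappa \), so that \(\Vert x(v)-x(w)\Vert =\kappa ^{-1}\Vert v-w\Vert \) for all tilt perturbations \(v,w\). This shows that the modulus \(\kappa ^{-1}\) in the conclusion of the lemma is sharp.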


Cite this article

Khanh, P.D., Mordukhovich, B.S., Phat, V.T. et al. Globally convergent coderivative-based generalized Newton methods in nonsmooth optimization. Math. Program. 205, 373–429 (2024). https://doi.org/10.1007/s10107-023-01980-2

