A family of inexact SQA methods for non-smooth convex minimization with provable convergence guarantees based on the Luo–Tseng error bound property

Yue, Man-Chung; Zhou, Zirui; So, Anthony Man-Cho

doi:10.1007/s10107-018-1280-6

Full Length Paper
Series B
Published: 30 April 2018

Volume 174, pages 327–358, (2019)
Mathematical Programming

Abstract

We propose a new family of inexact sequential quadratic approximation (SQA) methods, which we call the inexact regularized proximal Newton (IRPN) method, for minimizing the sum of two closed proper convex functions, one of which is smooth and the other is possibly non-smooth. Our proposed method features strong convergence guarantees even when applied to problems with degenerate solutions while allowing the inner minimization to be solved inexactly. Specifically, we prove that when the problem possesses the so-called Luo–Tseng error bound (EB) property, IRPN converges globally to an optimal solution, and the local convergence rate of the sequence of iterates generated by IRPN is linear, superlinear, or even quadratic, depending on the choice of parameters of the algorithm. Prior to this work, such EB property has been extensively used to establish the linear convergence of various first-order methods. However, to the best of our knowledge, this work is the first to use the Luo–Tseng EB property to establish the superlinear convergence of SQA-type methods for non-smooth convex minimization. As a consequence of our result, IRPN is capable of solving regularized regression or classification problems under the high-dimensional setting with provable convergence guarantees. We compare our proposed IRPN with several empirically efficient algorithms by applying them to the \(\ell _1\)-regularized logistic regression problem. Experiment results show the competitiveness of our proposed method.
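
To make the objects in the abstract concrete, the sketch below evaluates, for the \(\ell _1\)-regularized logistic regression problem, the norm of the proximal-gradient residual \(x - \mathrm{prox}_g(x - \nabla f(x))\). This is the quantity in terms of which the Luo–Tseng EB property bounds the distance to the optimal set. It is an illustrative sketch, not the authors' IRPN code; the function names, the 1/m scaling of the loss, and the unit step size are assumptions made here.

```python
import math

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1, applied componentwise."""
    return [math.copysign(max(abs(vi) - tau, 0.0), vi) for vi in v]

def logistic_grad(x, A, b):
    """Gradient of f(x) = (1/m) * sum_i log(1 + exp(-b_i * <a_i, x>)).

    A is a list of rows a_i, b is a list of labels in {-1, +1}.
    The 1/m scaling is an assumption of this sketch.
    """
    m, n = len(A), len(x)
    g = [0.0] * n
    for a_i, b_i in zip(A, b):
        z = b_i * sum(a_ij * x_j for a_ij, x_j in zip(a_i, x))
        # s = 1 / (1 + exp(z)), evaluated without overflow for large |z|
        if z < 0:
            s = 1.0 / (1.0 + math.exp(z))
        else:
            s = math.exp(-z) / (1.0 + math.exp(-z))
        for j in range(n):
            g[j] -= b_i * a_i[j] * s / m
    return g

def residual_norm(x, A, b, lam):
    """Norm of x - prox_g(x - grad f(x)) with g = lam * ||.||_1."""
    grad = logistic_grad(x, A, b)
    shifted = [xj - gj for xj, gj in zip(x, grad)]
    p = soft_threshold(shifted, lam)
    return math.sqrt(sum((xj - pj) ** 2 for xj, pj in zip(x, p)))
```

The residual vanishes exactly at minimizers. For example, with A the 2-by-2 identity and b = (1, -1), the gradient at x = 0 has entries of magnitude 0.25, so x = 0 is optimal whenever the regularization weight is at least 0.25.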

Notes

Some authors refer to this as a convex composite minimization problem.
For instance, the exact Hessian \(H_k=\nabla ^2 f(x^k)\) is used in [17]. If f is not strongly convex, then neither is the quadratic model (2). As such, the inner problem can have multiple minimizers and the next iterate \(x^{k+1}\) is not well defined.
In [16] the authors considered global versions of the Luo–Tseng EB and KL properties and showed that they are equivalent. However, none of the scenarios listed in Fact 3 except (S1) are known to possess the global Luo–Tseng EB property stated in [16].
Note that Assumption 1(a) is not required for Corollary 2 to hold; cf. Proposition 1.
A similar EB property has been studied by Pang [30] for linearly constrained variational inequalities.
The code can be downloaded from https://github.com/ZiruiZhou/IRPN.
\(\Vert A\Vert ^2\) is computed via the MATLAB code lambda = eigs(A*A',1,'LM').
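
For readers without MATLAB, the quantity in the last note, \(\Vert A\Vert ^2\) as the largest eigenvalue of \(AA^T\), can be approximated by plain power iteration. The sketch below is a swapped-in technique, not what eigs does internally, and the function name, iteration cap, and tolerance are assumptions chosen for illustration.

```python
import math

def spectral_norm_sq(A, iters=500, tol=1e-12):
    """Estimate ||A||^2, the largest eigenvalue of A A^T, by power iteration.

    A is a dense matrix given as a list of rows. The uniform starting
    vector is assumed not to be orthogonal to the leading eigenvector.
    """
    m, n = len(A), len(A[0])
    u = [1.0 / math.sqrt(m)] * m  # unit-norm starting vector
    lam = 0.0
    for _ in range(iters):
        # w = A^T u, so the Rayleigh quotient u^T (A A^T) u equals ||w||^2
        w = [sum(A[i][j] * u[i] for i in range(m)) for j in range(n)]
        new_lam = sum(wj * wj for wj in w)
        # v = A w = (A A^T) u
        v = [sum(A[i][j] * w[j] for j in range(n)) for i in range(m)]
        norm_v = math.sqrt(sum(vi * vi for vi in v))
        if norm_v == 0.0:
            return new_lam
        u = [vi / norm_v for vi in v]
        if abs(new_lam - lam) <= tol * max(1.0, new_lam):
            return new_lam
        lam = new_lam
    return lam
```

Forming \(AA^T\) explicitly is avoided: each iteration costs two matrix-vector products, which matters when A has many rows.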

References

Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)
Becker, S., Fadili, J.: A quasi-Newton proximal splitting method. In Pereira, F.C.N., Burges, C.J.C., Bottou, L., Weinberger, K. Q. (eds), Advances in Neural Information Processing Systems 25: Proceedings of the 2012 Conference, pp. 2618–2626 (2012)
Bhatia, R.: Matrix Analysis, Volume 169 of Graduate Texts in Mathematics. Springer, New York (1997)
Byrd, R.H., Nocedal, J., Oztoprak, F.: An inexact successive quadratic approximation method for L-1 regularized optimization. Math. Program. Ser. B 157(2), 375–396 (2016)
Combettes, P.L., Wajs, V.R.: Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul. 4(4), 1168–1200 (2005)
Dontchev, A.L., Rockafellar, R.T.: Implicit Functions and Solution Mappings. Springer Monographs in Mathematics. Springer, New York (2009)
Facchinei, F., Fischer, A., Herrich, M.: A family of Newton methods for nonsmooth constrained systems with nonisolated solutions. Math. Methods Oper. Res. 77(3), 433–443 (2013)
Facchinei, F., Fischer, A., Herrich, M.: An LP-Newton method: nonsmooth equations, KKT systems, and nonisolated solutions. Math. Program. Ser. A 146(1–2), 1–36 (2014)
Facchinei, F., Pang, J.-S.: Finite–Dimensional Variational Inequalities and Complementarity Problems, vol. 1. Springer, New York (2003)
Fan, R.-E., Chang, K.-W., Hsieh, C.-J., Wang, X.-R., Lin, C.-J.: LIBLINEAR: a library for large linear classification. J. Mach. Learn. Res. 9(Aug), 1871–1874 (2008)
Fischer, A.: Local behavior of an iterative framework for generalized equations with nonisolated solutions. Math. Program. Ser. A 94(1), 91–124 (2002)
Fischer, A., Herrich, M., Izmailov, A.F., Solodov, M.V.: A globally convergent LP-Newton method. SIAM J. Optim. 26(4), 2012–2033 (2016)
Friedman, J., Hastie, T., Höfling, H., Tibshirani, R.: Pathwise coordinate optimization. Ann. Appl. Stat. 1(2), 302–332 (2007)
Hou, K., Zhou, Z., So, A.M.-C., Luo, Z.-Q.: On the linear convergence of the proximal gradient method for trace norm regularization. In Burges, C.J.C., Bottou, L., Welling, M., Ghahramani, Z., Weinberger, K. Q., (eds), Advances in Neural Information Processing Systems 26: Proceedings of the 2013 Conference, pp. 710–718 (2013)
Hsieh, C.-J., Dhillon, I. S., Ravikumar, P. K., Sustik, M. A.: Sparse inverse covariance matrix estimation using quadratic approximation. In: Shawe-Taylor, J., Zemel, R. S., Bartlett, P., Pereira, F.C.N., Weinberger, K.Q. (eds), Advances in Neural Information Processing Systems 24: Proceedings of the 2011 Conference, pp. 2330–2338 (2011)
Karimi, H., Nutini, J., Schmidt, M.: Linear convergence of gradient and proximal-gradient Methods under the Polyak-Łojasiewicz Condition. In: Frasconi, P., Landwehr, N., Manco, G., Vreeken, J. (eds) Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2016), Part I, Vol. 9851 of Lecture Notes in Artificial Intelligence, pp. 795–811. Springer International Publishing AG, Cham, Switzerland (2016)
Lee, J.D., Sun, Y., Saunders, M.A.: Proximal Newton-type methods for minimizing composite functions. SIAM J. Optim. 24(3), 1420–1443 (2014)
Li, D.-H., Fukushima, M., Qi, L., Yamashita, N.: Regularized Newton methods for convex minimization problems with singular solutions. Comput. Optim. Appl. 28(2), 131–147 (2004)
Li, G., Pong, T.K.: Calculus of the exponent of Kurdyka–Łojasiewicz inequality and its applications to linear convergence of first-order methods. Found. Comput. Math. (2017). https://doi.org/10.1007/s10208-017-9366-8
LIBSVM Data: Classification, Regression, and Multi-label. https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/
Liu, H., So, A. M.-C., Wu, W.: Quadratic optimization with orthogonality constraint: explicit Łojasiewicz exponent and linear convergence of retraction-based line-search and stochastic variance-reduced gradient methods. Preprint (2017)
Luo, Z.-Q., Tseng, P.: Error bound and convergence analysis of matrix splitting algorithms for the affine variational inequality problem. SIAM J. Optim. 2(1), 43–54 (1992)
Luo, Z.-Q., Tseng, P.: On the linear convergence of descent methods for convex essentially smooth minimization. SIAM J. Control Optim. 30(2), 408–425 (1992)
Luo, Z.-Q., Tseng, P.: Error bounds and convergence analysis of feasible descent methods: a general approach. Ann. Oper. Res. 46(1), 157–178 (1993)
Moré, J.J.: The Levenberg–Marquardt algorithm: implementation and theory. In: Watson, G.A. (ed.) Numerical Analysis, Volume 630 of Lecture Notes in Mathematics, pp. 105–116. Springer, Berlin (1978)
Nesterov, Yu.: Introductory Lectures on Convex Optimization: A Basic Course. Kluwer, Boston (2004)
Nocedal, J., Wright, S.J.: Numerical Optimization. Springer Series in Operations Research and Financial Engineering, second edn. Springer, New York (2006)
O’Donoghue, B., Candès, E.: Adaptive restart for accelerated gradient schemes. Found. Comput. Math. 15(3), 715–732 (2015)
Olsen, P.A., Oztoprak, F., Nocedal, J., Rennie, S.: Newton-like methods for sparse inverse covariance estimation. In: Pereira, F.C.N., Burges, C.J.C., Bottou, L., Weinberger, K.Q. (eds), Advances in Neural Information Processing Systems 25: Proceedings of the 2012 Conference, pp. 755–763 (2012)
Pang, J.-S.: A posteriori error bounds for the linearly-constrained variational inequality problem. Math. Oper. Res. 12(3), 474–484 (1987)
Pang, J.-S.: Error bounds in mathematical programming. Math. Program. 79(1–3), 299–332 (1997)
Parikh, N., Boyd, S.: Proximal algorithms. Foundations and Trends® in Optimization 1(3), 127–239 (2014)
Qi, H., Sun, D.: A quadratically convergent Newton method for computing the nearest correlation matrix. SIAM J. Matrix Anal. Appl. 28(2), 360–385 (2006)
Sardy, S., Antoniadis, A., Tseng, P.: Automatic smoothing with wavelets for a wide class of distributions. J. Comput. Gr. Stat. 13(2), 399–421 (2004)
Scheinberg, K., Tang, X.: Practical inexact proximal quasi-Newton method with global complexity analysis. Math. Program. Ser. A 160(1–2), 495–529 (2016)
Schmidt, M., van den Berg, E., Friedlander, M.P., Murphy, K.: Optimizing costly functions with simple constraints: a limited-memory projected quasi-Newton algorithm. In: Proceedings of the 12th International Conference on Artificial Intelligence and Statistics (AISTATS 2009), pp. 456–463 (2009)
Tseng, P.: Error bounds and superlinear convergence analysis of some Newton-type methods in optimization. In: Nonlinear Optimization and Related Topics, vol. 36 of Applied Optimization, pp. 445–462. Springer, Dordrecht (2000)
Tseng, P.: Approximation accuracy, gradient methods, and error bound for structured convex optimization. Math. Program. Ser. B 125(2), 263–295 (2010)
Tseng, P., Yun, S.: A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. Ser. B 117(1–2), 387–423 (2009)
Wen, B., Chen, X., Pong, T.K.: Linear convergence of proximal gradient algorithm with extrapolation for a class of nonconvex nonsmooth minimization problems. SIAM J. Optim. 27(1), 124–145 (2017)
Wright, S.J., Nowak, R.D., Figueiredo, M.A.T.: Sparse reconstruction by separable approximation. IEEE Trans. Signal Process. 57(7), 2479–2493 (2009)
Yamashita, N., Fukushima, M.: On the rate of convergence of the Levenberg–Marquardt method. In: Alefeld, G., Chen, X. (eds.) Topics in Numerical Analysis, Volume 15 of Computing Supplement, pp. 239–249. Springer, Wien (2001)
Yen, I. E.-H., Hsieh, C.-J., Ravikumar, P. K., Dhillon, I. S.: Constant nullspace strong convexity and fast convergence of proximal methods under high-dimensional settings. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger, K.Q. (eds), Advances in Neural Information Processing Systems 27: Proceedings of the 2014 Conference, pp. 1008–1016 (2014)
Yuan, G.-X., Chang, K.-W., Hsieh, C.-J., Lin, C.-J.: A comparison of optimization methods and software for large-scale L1-regularized linear classification. J. Mach. Learn. Res. 11(Nov), 3183–3234 (2010)
Yuan, G.-X., Ho, C.-H., Lin, C.-J.: An improved GLMNET for L1-regularized logistic regression. J. Mach. Learn. Res. 13(1), 1999–2030 (2012)
Yun, S., Toh, K.-C.: A coordinate gradient descent method for \(\ell _1\)-regularized convex minimization. Comput. Optim. Appl. 48(2), 273–307 (2011)
Zhang, H., Jiang, J., Luo, Z.-Q.: On the linear convergence of a proximal gradient method for a class of nonsmooth convex minimization problems. J. Oper. Res. Soc. China 1(2), 163–186 (2013)
Zhong, K., Yen, I.E.-H., Dhillon, I.S., Ravikumar, P.: Proximal quasi–Newton for computationally intensive \(\ell _1\)–regularized \(M\)-estimators. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger, K.Q. (eds) Advances in Neural Information Processing Systems 27: Proceedings of the 2014 Conference, pp. 2375–2383 (2014)
Zhou, Z., So, A.M.-C.: A unified approach to error bounds for structured convex optimization problems. Math. Program. Ser. A 165(2), 689–728 (2017)
Zhou, Z., Zhang, Q., So, A.M.-C.: \(\ell _{1,p}\)-norm regularization: error bounds and convergence rate analysis of first-order methods. In: Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), pp. 1501–1510 (2015)

Acknowledgements

We thank the anonymous reviewers for their detailed and helpful comments. Most of the work of the first and second authors was done when they were Ph.D. students at the Department of Systems Engineering and Engineering Management of The Chinese University of Hong Kong.

Author information

Authors and Affiliations

Imperial College Business School, Imperial College London, London, UK
Man-Chung Yue
Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Kowloon, Hong Kong
Zirui Zhou
Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Anthony Man-Cho So

Corresponding author

Correspondence to Zirui Zhou.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This research is supported in part by the Hong Kong Research Grants Council (RGC) General Research Fund (GRF) Projects CUHK 14206814 and CUHK 14208117 and in part by a gift grant from Microsoft Research Asia.

About this article

Cite this article

Yue, MC., Zhou, Z. & So, A.MC. A family of inexact SQA methods for non-smooth convex minimization with provable convergence guarantees based on the Luo–Tseng error bound property. Math. Program. 174, 327–358 (2019). https://doi.org/10.1007/s10107-018-1280-6

Received: 01 February 2017
Accepted: 13 April 2018
Published: 30 April 2018
Issue Date: 01 March 2019
DOI: https://doi.org/10.1007/s10107-018-1280-6
