Abstract
Subgradient methods converge linearly on a convex function that grows sharply away from its solution set. In this work, we show that the same is true for sharp functions that are only weakly convex, provided that the subgradient methods are initialized within a fixed tube around the solution set. A variety of statistical and signal processing tasks come equipped with good initialization and provably lead to formulations that are both weakly convex and sharp. Therefore, in such settings, subgradient methods can serve as inexpensive local search procedures. We illustrate the proposed techniques on phase retrieval and covariance estimation problems.
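As a concrete illustration of the regime described in the abstract, the following is a minimal numerical sketch (our own, with illustrative constants, not the paper's exact algorithm or parameters) of a Polyak-type subgradient method applied to the robust phase retrieval objective \(f(x)=\frac{1}{m}\sum _{i}\left| (a_i^Tx)^2-b_i\right| \), assuming noiseless Gaussian measurements and an initialization inside a small tube around the signal:

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 50, 400                       # signal dimension, number of measurements
    x_star = rng.normal(size=n)          # ground-truth signal
    A = rng.normal(size=(m, n))          # rows are the Gaussian measurement vectors a_i
    b = (A @ x_star) ** 2                # noiseless measurements b_i = (a_i^T x*)^2

    def f(x):
        # Robust (ell_1) phase retrieval objective: sharp and weakly convex.
        return np.mean(np.abs((A @ x) ** 2 - b))

    def subgrad(x):
        # One subgradient: (2/m) * sum_i sign((a_i^T x)^2 - b_i) (a_i^T x) a_i.
        Ax = A @ x
        return (2.0 / m) * (A.T @ (np.sign(Ax ** 2 - b) * Ax))

    x = x_star + 0.1 * rng.normal(size=n)    # initialize within a tube around {+x*, -x*}

    for _ in range(200):
        g, fx = subgrad(x), f(x)
        gn2 = np.dot(g, g)
        if fx == 0.0 or gn2 == 0.0:
            break
        x -= (fx / gn2) * g                  # Polyak step: uses min f = 0 for exact data

    print(min(np.linalg.norm(x - x_star), np.linalg.norm(x + x_star)))

With exact measurements the minimal value is zero, so the Polyak step length \(f(x_k)/\Vert g_k\Vert ^2\) is computable without tuning; under sharpness and weak convexity, steps of this type contract the distance to the solution set at a linear rate from such an initialization.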
Notes
To the best of our knowledge, the class of weakly convex functions was introduced in 1972 in a local publication of the Institute of Cybernetics in Kiev and then used in [19]. These are the functions that are lower bounded by a linear function up to a little-o term that is uniform in the base point. The term \(\rho \)-weak convexity, as we use it here, corresponds to the setting when the error term is a quadratic. A number of other names are used in the literature instead of \(\rho \)-weak convexity, such as semi-convex, prox-regular, and lower-\(C^2\).
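In the notation used here, \(\rho \)-weak convexity of \(f\) amounts to the standard quadratically perturbed subgradient inequality, or equivalently to convexity of \(f+\frac{\rho }{2}\Vert \cdot \Vert ^2\):

\[ f(y)\;\ge \; f(x)+\langle g,\,y-x\rangle -\frac{\rho }{2}\Vert y-x\Vert ^2 \qquad \text{for all } x,\,y \text{ and } g\in \partial f(x). \]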
Note that this potential function is different from the direct generalization of the one used in phase retrieval, \(\min _{X}\sum _i \left| \Vert X^Ta_i\Vert ^2-b_i\right| \). The reason for the more exotic formulation is that the differences \(a_{2i}a_{2i}^T-a_{2i-1}a_{2i-1}^T\) in (5) allow the authors of [38] to prove sharpness of the objective function by appealing to an \(\ell _2/\ell _1\) restricted isometry property of the resulting linear map. See [38, Section III.B] for details. It is not clear whether the naive objective function is also sharp with high probability under reasonable statistical assumptions.
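For comparison, the differenced formulation referred to above plausibly reads (our reconstruction from the differences named in this note, since (5) itself is not reproduced here)

\[ \min _{X}\;\sum _i \left| \left\langle a_{2i}a_{2i}^T-a_{2i-1}a_{2i-1}^T,\; XX^T\right\rangle -(b_{2i}-b_{2i-1})\right| , \]

which relates to differences of the naive terms via the identity \(\Vert X^Ta\Vert ^2=\langle aa^T,XX^T\rangle \).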
References
Jain, P., Netrapalli, P.: Fast exact matrix completion with finite samples. In: Grünwald, P., Hazan, E., Kale, S. (eds.) Proceedings of the 28th Conference on Learning Theory, Proceedings of Machine Learning Research, vol. 40, pp. 1007–1034. PMLR, Paris, France (2015). http://proceedings.mlr.press/v40/Jain15.html
Tu, S., Boczar, R., Simchowitz, M., Soltanolkotabi, M., Recht, B.: Low-rank solutions of linear matrix equations via Procrustes flow. In: Proceedings of the 33rd International Conference on International Conference on Machine Learning—Volume 48, ICML'16, pp. 964–973. JMLR.org (2016). http://dl.acm.org/citation.cfm?id=3045390.3045493
Candès, E., Li, X., Soltanolkotabi, M.: Phase retrieval via Wirtinger flow: theory and algorithms. IEEE Trans. Inf. Theory 61(4), 1985–2007 (2015). https://doi.org/10.1109/TIT.2015.2399924
Jain, P., Jin, C., Kakade, S., Netrapalli, P.: Global convergence of non-convex gradient descent for computing matrix square root. In: Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 479–488 (2017). http://proceedings.mlr.press/v54/jain17a.html
Chen, Y., Wainwright, M.: Fast low-rank estimation by projected gradient descent: general statistical and algorithmic guarantees (2015). arXiv:1509.03025
Meka, R., Jain, P., Dhillon, I.: Guaranteed rank minimization via singular value projection (2009). arXiv:0909.5457
Boumal, N.: Nonconvex phase synchronization. SIAM J. Optim. 26(4), 2355–2377 (2016)
Brutzkus, A., Globerson, A.: Globally optimal gradient descent for a ConvNet with Gaussian inputs (2017). arXiv:1702.07966
Eremin, I.: The relaxation method of solving systems of inequalities with convex functions on the left-hand side. Dokl. Akad. Nauk SSSR 160, 994–996 (1965)
Poljak, B.: Minimization of unsmooth functionals. USSR Comput. Math. Math. Phys. 9, 14–29 (1969)
Shor, N.: The rate of convergence of the method of the generalized gradient descent with expansion of space. Kibernetika (Kiev) 2, 80–85 (1970)
Goffin, J.: On convergence rates of subgradient optimization methods. Math. Program. 13(3), 329–347 (1977). https://doi.org/10.1007/BF01584346
Supittayapornpong, S., Neely, M.: Staggered time average algorithm for stochastic non-smooth optimization with \(O(1/t)\) convergence (2016). arXiv:1607.02842
Yang, T., Lin, Q.: RSG: beating subgradient method without smoothness and strong convexity (2016). arXiv:1512.03107
Johnstone, P., Moulin, P.: Faster subgradient methods for functions with Hölderian growth (2017). arXiv:1704.00196
Polyak, B.: Sharp minima. Institute of Control Sciences Lecture Notes, Moscow, USSR; Presented at the IIASA Workshop on Generalized Lagrangians and Their Applications, IIASA, Laxenburg, Austria, vol. 3, pp. 369–380 (1979)
Burke, J., Ferris, M.: Weak sharp minima in mathematical programming. SIAM J. Control Optim. 31(5), 1340–1359 (1993). https://doi.org/10.1137/0331063
Poljak, B.: Subgradient methods: a survey of Soviet research. In: Nonsmooth Optimization (Proc. IIASA Workshop, Laxenburg, 1977), IIASA Proc. Ser., vol. 3, pp. 5–29. Pergamon, Oxford (1978)
Nurminskii, E.A.: The quasigradient method for the solving of the nonlinear programming problems. Cybernetics 9(1), 145–150 (1973). https://doi.org/10.1007/BF01068677
Nurminskii, E.A.: Minimization of nondifferentiable functions in the presence of noise. Cybernetics 10(4), 619–621 (1974). https://doi.org/10.1007/BF01071541
Davis, D., Drusvyatskiy, D.: Stochastic subgradient method converges at the rate \(O(k^{-1/4})\) on weakly convex functions (2018). arXiv:1802.02988
Drusvyatskiy, D., Lewis, A.: Error bounds, quadratic growth, and linear convergence of proximal methods. To appear in Mathematics of Operations Research (2016). arXiv:1602.06661
Duchi, J., Ruan, F.: Stochastic methods for composite optimization problems (2017). arXiv:1703.08570
Duchi, J., Ruan, F.: Solving (most) of a set of quadratic equalities: composite optimization for robust phase retrieval (2017). arXiv:1705.02356
Drusvyatskiy, D., Paquette, C.: Efficiency of minimizing compositions of convex functions and smooth maps (2016). arXiv:1605.00125
Davis, D., Grimmer, B.: Proximally guided stochastic method for nonsmooth, nonconvex problems (2017). arXiv:1707.03505
Drusvyatskiy, D.: The proximal point method revisited. To appear in SIAG/OPT Views and News (2018). arXiv:1712.06038
Rockafellar, R., Wets, R.J.-B.: Variational Analysis. Grundlehren der mathematischen Wissenschaften, vol. 317. Springer, Berlin (1998)
Mordukhovich, B.: Variational Analysis and Generalized Differentiation I: Basic Theory. Grundlehren der mathematischen Wissenschaften, vol. 330. Springer, Berlin (2006)
Penot, J.P.: Calculus Without Derivatives, Graduate Texts in Mathematics, vol. 266. Springer, New York (2013). https://doi.org/10.1007/978-1-4614-4538-8
Ioffe, A.: Variational Analysis of Regular Mappings: Theory and Applications. Springer Monographs in Mathematics. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-64277-2
Poliquin, R., Rockafellar, R.: Prox-regular functions in variational analysis. Trans. Am. Math. Soc. 348, 1805–1838 (1996)
Eldar, Y., Mendelson, S.: Phase retrieval: stability and recovery guarantees. Appl. Comput. Harmon. Anal. 36(3), 473–494 (2014). https://doi.org/10.1016/j.acha.2013.08.003
Davis, D., Drusvyatskiy, D., Paquette, C.: The nonsmooth landscape of phase retrieval (2017). arXiv:1711.03247
Chen, Y., Candès, E.: Solving random quadratic systems of equations is nearly as easy as solving linear systems. Commun. Pure Appl. Math. 70(5), 822–883 (2017). https://doi.org/10.1002/cpa.21638
Sun, J., Qu, Q., Wright, J.: A geometric analysis of phase retrieval. To appear in Foundations of Computational Mathematics (2017). arXiv:1602.06664
Tan, Y., Vershynin, R.: Phase retrieval via randomized Kaczmarz: theoretical guarantees (2017). arXiv:1605.08285
Chen, Y., Chi, Y., Goldsmith, A.J.: Exact and stable covariance estimation from quadratic sampling via convex programming. IEEE Trans. Inf. Theory 61(7), 4034–4059 (2015)
Clarke, F.: Optimization and Nonsmooth Analysis. Wiley, New York (1983)
Acknowledgements
Research of Drusvyatskiy and MacPhee was partially supported by the AFOSR YIP Award FA9550-15-1-0237 and by the NSF DMS 1651851 and CCF 1740551 Awards. Research of Paquette was supported by NSF CCF 1740796.
Additional information
Communicated by Evgeni A. Nurminski.
Cite this article
Davis, D., Drusvyatskiy, D., MacPhee, K.J. et al. Subgradient Methods for Sharp Weakly Convex Functions. J Optim Theory Appl 179, 962–982 (2018). https://doi.org/10.1007/s10957-018-1372-8