Run-and-Inspect Method for nonconvex optimization and global optimality bounds for R-local minimizers

  • Full Length Paper
  • Series B
  • Mathematical Programming

Abstract

Many optimization algorithms converge to stationary points. When the underlying problem is nonconvex, they may get trapped at local minimizers or stagnate near saddle points. We propose the Run-and-Inspect Method, which adds an “inspect” phase to existing algorithms to help them escape non-global stationary points. The inspect phase samples a set of points within a radius R of the current point; when a sample point yields a sufficient decrease in the objective, the existing algorithm is resumed from that point. If no sufficient decrease is found, the current point is called an approximate R-local minimizer. We show that an R-local minimizer is globally optimal, up to a specific error depending on R, if the objective function can be implicitly decomposed into a smooth convex function plus a restricted function that is possibly nonconvex and nonsmooth. For such nonconvex objective functions, verifying global optimality is therefore fundamentally easier. For high-dimensional problems, we introduce blockwise inspections to overcome the curse of dimensionality while still maintaining optimality bounds up to a factor equal to the number of blocks. We also present the sample complexities of these methods. When we apply our method to existing algorithms on a set of artificial and realistic nonconvex problems, we observe significantly improved chances of obtaining global minima.
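
To make the interplay between the “run” and “inspect” phases concrete, the sketch below illustrates the idea in Python under simplifying assumptions: the “run” phase is plain gradient descent, the problem is two-dimensional, and the inspection samples points on a few circles of radii up to R around the current iterate. The names run_and_inspect, inspect, and gradient_descent, the sampling grid, and the sufficient-decrease threshold are hypothetical choices made for illustration, not the authors' exact implementation.

    import numpy as np

    def gradient_descent(f, grad, x, step=0.01, iters=500):
        # "Run" phase: any off-the-shelf local method; plain gradient descent here.
        for _ in range(iters):
            x = x - step * grad(x)
        return x

    def inspect(f, x, R, n_radii=5, n_angles=16, decrease=1e-3):
        # "Inspect" phase (2-D sketch): sample points within radius R around x.
        # Return a sample giving a sufficient decrease in f, or None if x is an
        # approximate R-local minimizer.
        fx = f(x)
        for r in np.linspace(R / n_radii, R, n_radii):
            for theta in np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False):
                y = x + r * np.array([np.cos(theta), np.sin(theta)])
                if f(y) < fx - decrease:   # sufficient decrease found
                    return y
        return None

    def run_and_inspect(f, grad, x0, R, max_rounds=20):
        # Alternate "run" and "inspect" until no sampled point within radius R
        # yields a sufficient decrease.
        x = np.asarray(x0, dtype=float)
        for _ in range(max_rounds):
            x = gradient_descent(f, grad, x)   # run to (near) a stationary point
            y = inspect(f, x, R)
            if y is None:                      # approximate R-local minimizer
                return x
            x = y                              # restart the run phase from y
        return x

    # Example: a smooth convex quadratic plus a bounded oscillatory perturbation,
    # the kind of implicit decomposition covered by the R-local optimality bound.
    f = lambda x: np.sum(x**2) + 2.0 * np.sum(np.sin(3.0 * x))
    grad = lambda x: 2.0 * x + 6.0 * np.cos(3.0 * x)
    print(run_and_inspect(f, grad, x0=[2.0, -2.0], R=1.5))

In this toy problem, gradient descent started at (2.0, -2.0) stalls at a non-global local minimizer, while each inspection supplies a nearby point with a lower objective value from which the run phase continues, ending at the global minimizer near (-0.47, -0.47).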


Notes

  1. Not to be confused with the Lipschitz constant L of \(\nabla f\).

  2. https://archive.ics.uci.edu/ml/datasets/iris.

Author information

Corresponding author

Correspondence to Wotao Yin.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The work of Y. Chen is supported in part by the Tsinghua Xuetang Mathematics Program and the Top Open Program for his short-term visit to UCLA. The work of Y. Sun and W. Yin is supported in part by NSF Grant DMS-1720237 and ONR Grant 000141712162.


Cite this article

Chen, Y., Sun, Y. & Yin, W. Run-and-Inspect Method for nonconvex optimization and global optimality bounds for R-local minimizers. Math. Program. 176, 39–67 (2019). https://doi.org/10.1007/s10107-019-01397-w

