Structured nonconvex and nonsmooth optimization: algorithms and iteration complexity analysis

Abstract

Nonconvex and nonsmooth optimization problems are frequently encountered in much of statistics, business, science and engineering, but they are not yet widely recognized as a scalable technology. A reason for this relatively low degree of popularity is the lack of a well-developed system of theory and algorithms to support such applications, as exists for their convex counterparts. This paper aims to take one step in the direction of disciplined nonconvex and nonsmooth optimization. In particular, we consider some constrained nonconvex optimization models in block decision variables, with or without coupled affine constraints. In the absence of coupled constraints, we show a sublinear rate of convergence to an \(\epsilon \)-stationary solution, in the form of a variational inequality, for a generalized conditional gradient method, where the convergence rate depends on the Hölderian continuity of the gradient of the smooth part of the objective. For the model with coupled affine constraints, we introduce corresponding \(\epsilon \)-stationarity conditions and apply two proximal-type variants of the ADMM to solve such a model, assuming the proximal ADMM updates can be implemented for all the block variables except for the last block, for which either a gradient step or a majorization–minimization step is implemented instead. We show an iteration complexity bound of \(O(1/\epsilon ^2)\) to reach an \(\epsilon \)-stationary solution for both algorithms. Moreover, we show that the same iteration complexity bound holds for a proximal BCD method as an immediate consequence. Numerical results are provided to illustrate the efficacy of the proposed algorithms for tensor robust PCA and tensor sparse PCA problems.
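
To make the first algorithmic idea concrete, below is a minimal sketch of a generalized conditional gradient (GCG) iteration for a composite model min_{x in X} f(x) + r(x) with f smooth and r convex: the smooth part is linearized while the nonsmooth term is kept in the subproblem. The oracle, the open-loop step size, and the toy least-squares instance are illustrative assumptions, not the paper's exact scheme.

    # Sketch of a generalized conditional gradient iteration: linearize only the
    # smooth part f, keep the nonsmooth term r in the subproblem, and step toward
    # the subproblem solution. Assumed interfaces, for illustration only.
    import numpy as np

    def gcg(grad_f, oracle, x0, max_iter=500):
        x = x0.copy()
        for k in range(max_iter):
            y = oracle(grad_f(x))      # argmin_{y in X} <grad f(x), y> + r(y)
            alpha = 2.0 / (k + 2)      # classical open-loop step size
            x = x + alpha * (y - x)
        return x

    # Toy instance: f(x) = 0.5*||A x - b||^2, r = 0, X = unit l1-ball, whose
    # linearized subproblem is solved by a signed coordinate vector.
    rng = np.random.default_rng(0)
    A, b = rng.standard_normal((20, 10)), rng.standard_normal(20)
    grad_f = lambda x: A.T @ (A @ x - b)

    def l1_ball_oracle(g):
        e = np.zeros_like(g)
        i = np.argmax(np.abs(g))
        e[i] = -np.sign(g[i])
        return e

    x_gcg = gcg(grad_f, l1_ball_oracle, np.zeros(10))

Similarly, the following sketch illustrates the proximal-ADMM idea of handling the last block with a gradient step on the augmented Lagrangian, on a deliberately tiny convex instance min ||x||_1 + 0.5*||y - c||^2 subject to x - y = 0. The penalty beta, the step size tau, and this particular instance are assumptions made for illustration, not the paper's model.

    # Sketch of an ADMM-type splitting: the first block is updated by an exact
    # proximal minimization of the augmented Lagrangian, the last block by one
    # gradient step on it, followed by a dual multiplier update.
    import numpy as np

    def soft_threshold(v, t):
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def admm_gradient_last_block(c, beta=2.0, tau=0.2, iters=300):
        x = np.zeros_like(c)
        y = np.zeros_like(c)
        lam = np.zeros_like(c)
        for _ in range(iters):
            # x-block: exact minimizer of the augmented Lagrangian in x
            x = soft_threshold(y + lam / beta, 1.0 / beta)
            # y-block (last block): one gradient step on the augmented Lagrangian
            grad_y = (y - c) + lam - beta * (x - y)
            y = y - tau * grad_y
            # dual update for the constraint x - y = 0
            lam = lam - beta * (x - y)
        return x, y

    x_opt, y_opt = admm_gradient_last_block(np.array([3.0, -0.4, 0.1]))

Both sketches run for a fixed iteration budget; in practice one would instead monitor the \(\epsilon \)-stationarity measures defined in the paper as a stopping criterion.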

Acknowledgements

We would like to thank Professor Renato D. C. Monteiro and two anonymous referees for their insightful comments, which helped improve this paper significantly.

Author information

Corresponding author

Correspondence to Bo Jiang.

Additional information

Bo Jiang: Research of this author was supported in part by NSFC Grants 11771269 and 11831002, and by the Program for Innovative Research Team of Shanghai University of Finance and Economics. Shiqian Ma: Research of this author was supported in part by a start-up package from the Department of Mathematics at UC Davis. Shuzhong Zhang: Research of this author was supported in part by the National Science Foundation (Grant CMMI-1462408), and in part by the Shenzhen Fundamental Research Fund under Grant No. KQTD2015033114415450.


About this article

Cite this article

Jiang, B., Lin, T., Ma, S. et al. Structured nonconvex and nonsmooth optimization: algorithms and iteration complexity analysis. Comput Optim Appl 72, 115–157 (2019). https://doi.org/10.1007/s10589-018-0034-y

Keywords

  • Structured nonconvex optimization
  • \(\epsilon \)-Stationary
  • Iteration complexity
  • Conditional gradient method
  • Alternating direction method of multipliers
  • Block coordinate descent method

Mathematics Subject Classification

  • 90C26
  • 90C06
  • 90C60