
Computational Optimization and Applications, Volume 67, Issue 2, pp 225–258

Second-order orthant-based methods with enriched Hessian information for sparse \(\ell _1\)-optimization

  • J. C. De Los Reyes
  • E. Loayza
  • P. Merino

Abstract

We present a second-order algorithm, based on orthantwise directions, for solving optimization problems involving the sparsity-enhancing \(\ell _1\)-norm. The main idea of our method is to modify the orthantwise descent directions by using second-order information both from the regular term and (in a weak sense) from the \(\ell _1\)-norm. The weak second-order information for the \(\ell _1\)-term is incorporated via a partial Huber regularization. One of the main features of our algorithm is a faster identification of the active set. We also prove that, under a specific choice of the algorithm parameters, a reduced version of our method is equivalent to a semismooth Newton algorithm applied to the optimality condition. We present several computational experiments that show the efficiency of our approach compared to other state-of-the-art algorithms.
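To make these ingredients concrete, the following minimal sketch (Python/NumPy; not taken from the paper, with illustrative parameter names lam and gamma) shows the two building blocks the abstract refers to: the orthantwise pseudo-gradient of \(g(x) + \lambda \Vert x\Vert _1\), as used in OWL-QN-type methods, and the diagonal curvature contributed by a Huber smoothing of the \(\ell _1\)-term, which plays the role of the weak second-order information mentioned above.

```python
import numpy as np

def pseudo_gradient(grad_g, x, lam):
    """Orthantwise pseudo-gradient of f(x) = g(x) + lam * ||x||_1.

    For nonzero x_i the l1-term is differentiable; at x_i = 0 the
    one-sided derivative pointing toward descent is taken, and zero
    if the two one-sided derivatives bracket zero (minimum-norm
    element of the subdifferential)."""
    pg = np.where(x > 0.0, grad_g + lam,
                  np.where(x < 0.0, grad_g - lam, 0.0))
    at_zero = (x == 0.0)
    right = grad_g + lam  # right-sided derivative at x_i = 0
    left = grad_g - lam   # left-sided derivative at x_i = 0
    pg = np.where(at_zero & (right < 0.0), right, pg)
    pg = np.where(at_zero & (left > 0.0), left, pg)
    return pg

def huber_curvature(x, lam, gamma):
    """Diagonal 'weak' second-order information of the Huber-smoothed
    l1-term: the smoothed lam*|x_i| has curvature lam/gamma on the
    smoothing region |x_i| <= gamma and zero outside it."""
    return np.where(np.abs(x) <= gamma, lam / gamma, 0.0)
```

In an orthantwise method of this kind, the Huber curvature would be added to the Hessian of the regular term to build the second-order model, and the resulting step would then be projected onto the orthant indicated by the pseudo-gradient.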

Keywords

Sparse optimization · Orthantwise directions · Second-order algorithms · Semismooth Newton methods

Mathematics Subject Classification

49M15 · 65K05 · 90C53 · 49J20 · 49K20


Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. Research Center on Mathematical Modeling (MODEMAT), Escuela Politécnica Nacional, Quito, Ecuador
