Abstract
We present a second-order algorithm, based on orthantwise directions, for solving optimization problems involving the sparsity-enhancing \(\ell _1\)-norm. The main idea of our method is to modify the orthantwise descent directions by using second-order information both of the regular term and, in a weak sense, of the \(\ell _1\)-norm. The weak second-order information on the \(\ell _1\)-term is incorporated via a partial Huber regularization. A main feature of our algorithm is a faster identification of the active set. We also prove that, under a specific choice of the algorithm parameters, a reduced version of our method is equivalent to a semismooth Newton algorithm applied to the optimality condition. Several computational experiments show the efficiency of our approach compared with other state-of-the-art algorithms.
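To illustrate the kind of iteration the abstract describes, the following is a minimal sketch of an orthant-projected Newton step for \(\ell _1\)-regularized least squares, where the curvature of the \(\ell _1\)-term near zero is supplied by a Huber smoothing. This is an illustrative reconstruction under our own simplifying assumptions (dense linear algebra, least-squares smooth term, simple backtracking), not the paper's exact algorithm; all function names and parameters below are hypothetical.

```python
import numpy as np

def objective(A, b, x, beta):
    """Composite objective 0.5*||Ax - b||^2 + beta*||x||_1."""
    r = A @ x - b
    return 0.5 * r @ r + beta * np.abs(x).sum()

def pseudo_gradient(g, x, beta):
    """Minimum-norm subgradient of the composite objective (orthantwise style)."""
    pg = np.where(x > 0, g + beta, np.where(x < 0, g - beta, 0.0))
    zero = (x == 0)
    pg = np.where(zero & (g + beta < 0), g + beta, pg)
    pg = np.where(zero & (g - beta > 0), g - beta, pg)
    return pg

def orthant_huber_newton(A, b, beta, gamma=1e-3, tol=1e-8, max_iter=50):
    """Orthant-projected Newton iteration with a Huber-smoothed l1 curvature
    term (illustrative sketch only)."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]  # dense warm start
    AtA = A.T @ A
    for _ in range(max_iter):
        g = A.T @ (A @ x - b)
        pg = pseudo_gradient(g, x, beta)
        if np.linalg.norm(pg) < tol:
            break
        # Huber smoothing of |t|: curvature 1/gamma on |t| <= gamma, else 0
        huber_dd = np.where(np.abs(x) <= gamma, 1.0 / gamma, 0.0)
        H = AtA + beta * np.diag(huber_dd)
        d = np.linalg.solve(H, -pg)
        # keep only components aligned with the steepest-descent orthant
        d = np.where(d * (-pg) > 0, d, 0.0)
        orthant = np.where(x != 0, np.sign(x), np.sign(-pg))
        f_old = objective(A, b, x, beta)
        alpha, x_new = 1.0, x
        while alpha > 1e-10:
            x_new = x + alpha * d
            # orthant projection: variables crossing zero are set to zero
            x_new = np.where(x_new * orthant > 0, x_new, 0.0)
            if objective(A, b, x_new, beta) < f_old:
                break
            alpha *= 0.5
        x = x_new
    return x
```

For orthogonal data (e.g. `A = np.eye(4)`) the iteration recovers the soft-thresholding solution of the lasso in a single projected step, since zeroing the variables that cross zero is exactly the active-set identification the abstract refers to. The smoothing parameter `gamma` trades off curvature information near zero against damping of the Newton step.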
Additional information
This research has been partially supported by SENESCYT Award PIC-13-INAMHI-002 "Sistema de Pronóstico del Tiempo para todo el Territorio Ecuatoriano: Modelización Numérica y Estadística", a joint project between the Research Center on Mathematical Modelling (MODEMAT) and the Instituto Nacional de Meteorología e Hidrología (INAMHI). Moreover, we acknowledge partial support of the MATHAmSud project SOCDE "Sparse Optimal Control of Differential Equations".
Cite this article
De Los Reyes, J.C., Loayza, E. & Merino, P. Second-order orthant-based methods with enriched Hessian information for sparse \(\ell _1\)-optimization. Comput Optim Appl 67, 225–258 (2017). https://doi.org/10.1007/s10589-017-9891-z