Abstract
In this paper, we consider the problem of minimizing the sum of two convex functions subject to linear linking constraints. Classical alternating direction methods typically assume that both convex functions have relatively easy proximal mappings. However, many problems arising in statistics, image processing, and other fields have the structure that one of the two functions has an easy proximal mapping while the other is smooth and convex but does not; the classical alternating direction methods therefore cannot be applied. To deal with this difficulty, we propose in this paper an alternating direction method based on extragradients. Under the assumption that the smooth function has a Lipschitz continuous gradient, we prove that the proposed method returns an \(\epsilon \)-optimal solution within \(O(1/\epsilon )\) iterations. We apply the proposed method to solve a new statistical model called fused logistic regression, and our numerical experiments show that the method performs well on the test problems. We also test the performance of the proposed method on the lasso problem arising from statistics and compare the results with several existing efficient solvers for this problem; the results are very encouraging.
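To make the setting concrete, the following is a minimal sketch (not the authors' exact algorithm) of how an extragradient step can replace an exact proximal update inside an alternating direction scheme. It uses the lasso split \(\min \tfrac{1}{2}\|Ax-b\|^2 + \lambda\|z\|_1\) s.t. \(x=z\): the smooth least-squares block plays the role of the function without an easy proximal mapping and is updated by a predictor/corrector (extragradient) gradient step on the augmented Lagrangian, while the \(\ell_1\) block keeps its easy soft-thresholding prox. The function names, step-size choice, and stopping rule are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal mapping of t * ||.||_1 (the "easy prox" block).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def eg_admm_lasso(A, b, lam, rho=1.0, n_iter=500):
    """Hypothetical sketch: extragradient-based alternating direction scheme
    for  min 0.5||Ax-b||^2 + lam*||z||_1  s.t.  x = z.
    The smooth x-block is updated by an extragradient (predict, then correct)
    gradient step on the augmented Lagrangian instead of an exact minimization;
    the z-block uses soft-thresholding. Step sizes are illustrative only."""
    m, n = A.shape
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)  # u: scaled dual variable
    L = np.linalg.norm(A, 2) ** 2 + rho  # Lipschitz constant of the smooth part
    tau = 1.0 / L                        # extragradient step size
    for _ in range(n_iter):
        # Gradient of the smooth augmented-Lagrangian piece in x.
        grad = lambda v: A.T @ (A @ v - b) + rho * (v - z + u)
        x_bar = x - tau * grad(x)        # predictor (extragradient) step
        x = x - tau * grad(x_bar)        # corrector step, evaluated at predictor
        z = soft_threshold(x + u, lam / rho)  # easy proximal mapping
        u = u + x - z                    # dual (multiplier) update
    return z
```

On a small synthetic instance this drives the lasso objective well below its value at zero, illustrating why avoiding an exact solve of the smooth subproblem, as the paper advocates, still yields progress at every iteration.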
Acknowledgments
We thank Lingzhou Xue and Hui Zou for fruitful discussions on logistic regression and fused lasso, and Ya-Feng Liu for insightful discussions on Definition 3.1. We are also grateful to two anonymous referees for their constructive comments that have helped improve the presentation of this paper greatly.
Communicated by Michael Jeremy Todd.
Shiqian Ma: Research of this author was supported in part by a Direct Grant of The Chinese University of Hong Kong (Project ID: 4055016) and the Hong Kong Research Grants Council General Research Funds Early Career Scheme (Project ID: CUHK 439513).
Shuzhong Zhang: Research of this author was supported in part by the NSF Grant CMMI-1161242.
Lin, T., Ma, S. & Zhang, S. An Extragradient-Based Alternating Direction Method for Convex Minimization. Found Comput Math 17, 35–59 (2017). https://doi.org/10.1007/s10208-015-9282-8