Abstract
We introduce a generalization of the linearized Alternating Direction Method of Multipliers to optimize a real-valued function f of multiple arguments with potentially multiple constraints \(g_\circ\) on each of them. The function f may be nonconvex as long as it is convex in every argument, while the constraints \(g_\circ\) need to be convex but need not be smooth. If f is smooth, the proposed Block-Simultaneous Direction Method of Multipliers (bSDMM) can be interpreted as a proximal analog to inexact coordinate descent methods under constraints. Unlike alternative approaches for joint solvers of multiple-constraint problems, we do not require linear operators \({{\mathsf {L}}}\) of a constraint function \(g({{\mathsf {L}}}\ \cdot )\) to be invertible or linked to each other. bSDMM is well suited for a range of optimization problems, in particular for data analysis, where f is the likelihood function of a model and \({{\mathsf {L}}}\) could be a transformation matrix describing, e.g., finite differences or basis transforms. We apply bSDMM to the Non-negative Matrix Factorization task of a hyperspectral unmixing problem and demonstrate convergence and effectiveness of multiple constraints on both matrix factors. The algorithms are implemented in Python and released as an open-source package.
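As an illustration of the building block that the abstract refers to, the following is a minimal sketch of the standard linearized ADMM iteration for a single variable and a single constraint, i.e. minimizing \(f({\mathbf {x}}) + g({{\mathsf {L}}}{\mathbf {x}})\). It is not bSDMM itself and not the API of the released package. The function name `linearized_admm`, the step-size safety factor 0.9, the quadratic choice of f, and the toy data are all assumptions made for this example. Here \({{\mathsf {L}}}\) is a finite-difference matrix and g is the indicator of the non-negative orthant, so the constraint forces the solution to be non-decreasing; note that \({{\mathsf {L}}}\) is not invertible.

```python
import numpy as np


def linearized_admm(prox_f, prox_g, L, x0, step_g=1.0, n_iter=5000):
    """Sketch of linearized ADMM for min_x f(x) + g(L x).

    prox_f(v, step) and prox_g(v, step) are the proximal operators of f and g.
    The step size for f is set from the spectral norm of L so that
    step_f <= step_g / ||L||_s^2 (the 0.9 factor is an arbitrary margin).
    """
    step_f = 0.9 * step_g / np.linalg.norm(L, 2) ** 2
    x = np.asarray(x0, dtype=float).copy()
    z = L @ x                 # auxiliary variable standing in for L x
    u = np.zeros_like(z)      # scaled dual variable
    for _ in range(n_iter):
        # Proximal step on f after a gradient-like step on the coupling term
        x = prox_f(x - (step_f / step_g) * (L.T @ (L @ x - z + u)), step_f)
        # Proximal step on g in the transformed domain
        z = prox_g(L @ x + u, step_g)
        # Dual update accumulates the constraint residual L x - z
        u = u + L @ x - z
    return x


# Hypothetical toy problem: f(x) = 0.5 * ||x - b||_2^2, with the constraint
# that finite differences of x are non-negative (x is non-decreasing).
b = np.array([1.0, 3.0, 2.0, 4.0])
L = np.diff(np.eye(b.size), axis=0)   # (n-1) x n finite-difference matrix


def prox_f(v, step):
    # Closed-form proximal operator of 0.5 * ||x - b||^2
    return (v + step * b) / (1.0 + step)


def prox_g(v, step):
    # Projection onto the non-negative orthant (step is irrelevant here)
    return np.maximum(v, 0.0)


x_hat = linearized_admm(prox_f, prox_g, L, x0=b)
print(np.round(x_hat, 4))
```

For this toy input the iterates should approach the non-decreasing least-squares fit, in which the two out-of-order middle entries are replaced by their average. The point of the sketch is that only the proximal operators of f and g and products with \({{\mathsf {L}}}\) and its transpose are needed; no linear system involving \({{\mathsf {L}}}\) is solved.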
Notes
Throughout this work, indices denote different variables or constraints, not elements of vectors or tensors.
We use \(||\cdot ||_{\mathrm {s}}\) to denote the spectral norm and \(||\cdot ||_2\) to denote the element-wise \(\ell _2\) norm of vectors and tensors.
While it is always possible to reformulate the problem in this way, because we can set \(f({\mathbf {x}}_1) = g_l({{\mathsf {L}}}_{j 1} {\mathbf {x}}_1)\) for any l, doing so may make the minimization of f by means of a proximal operator inefficient. This is the limitation of the algorithm we derive in this section.
Data set obtained from https://engineering.purdue.edu/~biehl/MultiSpec/.
The choice of \(K=4\) is somewhat arbitrary, and we have not attempted to find the optimal number of components since that is not the focus of this work.
Acknowledgements
We would like to thank Robert Vanderbei and Jonathan Eckstein for useful discussions regarding the algorithm, and Jim Bosch and Robert Lupton for comments on its astrophysical applications.
Cite this article
Moolekamp, F., Melchior, P. Block-simultaneous direction method of multipliers: a proximal primal-dual splitting algorithm for nonconvex problems with multiple constraints. Optim Eng 19, 871–885 (2018). https://doi.org/10.1007/s11081-018-9380-y
DOI: https://doi.org/10.1007/s11081-018-9380-y