Abstract
Structure-enforced matrix factorization (SeMF) represents a large class of mathematical models appearing in various forms of principal component analysis, sparse coding, dictionary learning and other machine learning techniques useful in many applications including neuroscience and signal processing. In this paper, we present a unified algorithm framework, based on the classic alternating direction method of multipliers (ADMM), for solving a wide range of SeMF problems whose constraint sets permit low-complexity projections. We propose a strategy to adaptively adjust the penalty parameters which is the key to achieving good performance for ADMM. We conduct extensive numerical experiments to compare the proposed algorithm with a number of state-of-the-art special-purpose algorithms on test problems including dictionary learning for sparse representation and sparse nonnegative matrix factorization. Results show that our unified SeMF algorithm can solve different types of factorization problems as reliably and as efficiently as special-purpose algorithms. In particular, our SeMF algorithm provides the ability to explicitly enforce various combinatorial sparsity patterns that, to our knowledge, has not been considered in existing approaches.
Similar content being viewed by others
References
Paatero, P., Tapper, U.: Positive matrix factorization: a non-negative factor model with optimal utilization of error estimates of data values. Environmetrics 5(2), 111–126 (1994)
Lee, D.D., Seung, H.S.: Algorithms for non-negative matrix factorization. In: Conference on Neural Information Processing Systems (NIPS), pp. 556–562. MIT Press, MIT (2001)
Gonzalez, E., Zhang, Y.: Accelerating the lee-seung algorithm for nonnegative matrix factorization. Technical Report TR05-02, CAAM, Rice University (2005)
Lin, C.J.: On the convergence of multiplicative update algorithms for nonnegative matrix factorization. IEEE Trans. Neural Netw. 18(6), 1589–1596 (2007)
Albright, R., Cox, J., Duling, D., Langville, A.N., Meyer, C.D.: Algorithms, initializations, and convergence for the nonnegative matrix factorization. NCSU Technical Report, Math 81706 (2006)
Lin, C.J.: Projected gradient methods for non-negative matrix factorization. Neural Comput. 19(10), 2756–2779 (2007)
Lee, D.D., Seung, H.S.: Learning the parts of objects by non-negative matrix factorization. Nature 401, 788–791 (1999)
Hoyer, P.O.: Non-negative sparse coding. Neural Networks for Signal Processing XII (Proc. IEEE Workshop on Neural Networks for Signal Processing), pp. 557–565. Martigny, Switzerland (2002)
Feng, T., Li, S.Z., Shum, H.Y., Zhang, H.J.: Local non-negative matrix factorization as a visual representation. In: Proceedings of the 2nd International Conference on Development and Learning, pp. 178–183 (2002)
Hoyer, P.O., Dayan, P.: Non-negative matrix factorization with sparseness constraints. J. Mach. Learn. Res. 5, 1457–1469 (2004)
Montano, A.P., Carazo, J.M., Kochi, K., Lehmann, D., Pascual-Marqui, R.D.: Nonsmooth nonnegative matrix factorization (nsnmf). IEEE Trans. Pattern Anal. 28(3), 403–415 (2006)
Jenatton, R., Obozinski, G., Bach, F.: Structured sparse principal component analysis. International Conference on Artificial Intellgence and Statistics (AISTATS) (2010)
Peharz, R., Pernkopf, F.: Sparse nonnegative matrix factorization with \(\ell _0\)-constraints. Neurocomputing 80, 38–46 (2012)
Zheng, W.S., Lai, J.H., Liao, S.C., He, R.: Extracting non-negative basis images using pixel dispersion penalty. Pattern Recogn. 45(8), 2912–2926 (2012)
Aharon, M., Elad, M., Bruckstein, A.: K-svd: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans. Signal Process. 54(11), 4311–4322 (2006)
Mairal, J., Bach, F., Ponce, J., Sapiro, G.: Online dictionary learning for sparse coding. In: Proceedings of the 26th Annual International Conference on Machine Learning 8, pp. 689–696 (2009)
Jenatton, R., Audibert, J.Y., Bach, F.: Structured variable selection with sparsity-inducing norms. J. Mach. Learn. Res. 12, 2777–2824 (2011)
Szabo, Z., Poczos, B., Lorincz, A.: Online group-structured dictionary learning. In: Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 11, pp. 2865–2872, Washington, DC (2011)
Glowinski, R., Marroco, A.: Sur lapproximation, par elements finis dordre un, et la resolution, par penalisation-dualite dune classe de problemes de dirichlet non lineaires. Revue francaise dautomatique, informatique, recherche operationnelle. Analyse numerique 9(2), 41–76 (1975)
Gabay, D., Mercier, B.: A dual algorithm for the solution of nonlinear variational problems via finite element approximation. Comput. Math. Appl. 2(1), 17–40 (1976)
Yang, J., Zhang, Y.: Alternating direction algorithms for L1-problems in compressive sensing. SIAM J. Sci. Comput. 33(1), 250–278 (2011)
Goldstein, T., Osher, S.: The split Bregman method for L1 regularized problems. SIAM J. Imaging Sci. 2(2), 323–343 (2009)
Glowinski, R., Le Tallec, P.: Augmented Lagrangian and Operator Splitting Methods in Nonlinear Mechanics. SIAM, Philadelphia (1989)
He, B., Liao, L., Han, D., Yang, H.: A new inexact alternating directions method for monotone variational inequalities. Math. Program. A(92), 103–118 (2002)
Deng, W., Yin, W.: On the global and linear convergence of the generalized alternating direction method of multipliers. Technical Report TR12-14, CAAM, Rice University (2012)
Powell, M.J.D.: A Method for Nonlinear Constraints in Minimization Problems. Optimization. Academic Press, London (1969)
Hestenes, M.R.: Multiplier and gradient methods. J. Optimiz. Theory Appl. 4, 303–320 (1969)
Rockafellar, R.T.: The multiplier method of hestenes and powell applied to convex programming. J. Optimiz. Theory Appl. 12, 555–562 (1973)
Fortin, M., Glowinski, R.: Augmented Lagrangian Methods. North-Holland, Amsterdam (1983)
Zhang, Y.: An alternating direction algorithm for nonnegative matrix factorization. Technical Report TR10-03, CAAM, Rice University (2010)
He, B.S., Yang, H., Wang, S.L.: Alternating direction method with self adaptive penalty parameters for monotone variational inequalities. J. Optimiz. Theory Appl. 106(2), 337–356 (2000)
Wen, Z., Goldfarb, D., Yin, W.: Alternating direction augmented Lagrangian methods for semidefinite programming. Math. Progam. Comput. 2, 203–230 (2010)
Lin, Z., Liu, R., Su, Z.: Linearized alternating direction method with adaptive penalty for low rank representation. Advances in Neural Information Processing Systems(NIPS) 24 (2011)
Samaria, F.S., Harter, A.C.: Parameterisation of a stochastic model for human face identification. In: Proceedings of the 2nd IEEE Workshop on Applications of Computer Vision, pp. 138–142 (1994)
Donoho, D., Stodden, V.: When does non-negative matrix factorization give a correct decomposition into Parts? In: Proceedings of 17th Annual Conferance Neural Information Processing Systems NIPS (2003)
Mallat, S., Zhang, Z.F.: Matching pursuit with time-frequency dictionaries. IEEE Trans. Signal Process. 41, 3397–3415 (1993)
Chen, S., Billings, S.A., Luo, W.: Orthogonal least squares methods and their application to non-linear system identification. Int. J. Control 50, 1873–1896 (1989)
Davis, G.M., Mallat, S.G., Zhang, Z.F.: Adaptive time-frequency decompositions. Opt. Eng. 33(7), 2183–2191 (1994)
Pati, Y.C., Rezaiifar, R., Krishnaprasad, P.S.: Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition. In: 1993 Conference Record of The Twenty-Seventh Asilomar Conference on Signals, Systems and Computers 1, pp. 40–44 (1993)
Tropp, J.A.: Greed is good: algorithmic results for sparse approximation. IEEE Trans. Inf. Theory 50(10), 2231–2242 (2004)
Chen, S., Donoho, D., Saunders, M.: Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 20, 33–61 (1998)
Gorodnitsky, I.F., Rao, B.D.: Sparse signal reconstruction from limited data using focuss: A re-weighted norm minimization algorithm. IEEE Trans. Signal Process. 45, 600–616 (1997)
Rao, B.D., Kreutz-Delgado, K.: An affine scaling methodology for best basis selection. IEEE Trans. Signal Process. 47(1), 187–200 (1999)
Rao, B.D., Kreutz-Delgado, K.: Deriving algorithms for computing sparse solutions to linear inverse problems. In: IEEE Conference Record of the Thirty-First Asilomar Conference on Signals, Systems and Computers 1, pp. 955–959 (1998)
Rao, B.D., Engan, K., Cotter, S.F., Palmer, J., Kreutz-Delgado, K.: Subset selection in noise based on diversity measure minimization. IEEE Trans. Signal Process. 51(3), 760–770 (2003)
Gersho, A., Gray, R.M.: Vector Quantization and Signal Compression. Kluwer Academic Publishers, Norwell (1991)
Lewicki, M.S., Olshausen, B.A.: A probabilistic framework for the adaptation and comparison of image codes. J. Opt. Soc. Am. A Opt. Image Sci. Vis. 16(7), 1587–1601 (1999)
Olshausen, B.A., Field, D.J.: Natural image statistics and efficient coding. Netw. Comput. Neural 7(2), 333–339 (1996)
Olshausen, B.A., Field, B.J.: Sparse coding with an overcomplete basis set: a strategy employed by v1? Vis. Res. 37, 3311–3325 (1997)
Lewicki, M.S., Sejnowski, T.J.: Learning overcomplete representations. Neural Comput. 12, 337–365 (2000)
Engan, K., Aase, S.O., Husøy, J.H.: Multi-frame compression: theory and design. EURASIP Signal Process. 80(10), 2121–2140 (2000)
Engan, K., Aase, S.O., Hakon-Husoy, J.H.: Method of optimal directions for frame design. In: IEEE International Conferance on Acoustics, Speech, and Signal Processing 5, pp. 2443–2446 (1999)
Engan, K., Rao, B.D., Kreutz-Delgado, K.: Frame design using focuss with method of optimal directions (MOD). Norwegian Signal Processing Symposium, pp. 65–69 (1999)
Kreutz-Delgado, K., Murray, J.F., Rao, B.D., Engan, K., Lee, T., Sejnowski, T.J.: Dictionary learning algorithms for sparse representation. Neural Comput. 15(2), 349–396 (2003)
Murray, J.F., Kreutz-Delgado, K.: An improved focuss-based learning algorithm for solving sparse linear inverse problems. In: IEEE International Conference on Signals, Systems and Computers 1, pp. 347–351 (2001)
Kreutz-Delgado, K., Rao, B.D.: Focuss-based dictionary learning algorithms. Wavel. Appl. Signal Image Process. VIII 4119, 1–53 (2000)
Li, S.Z., Hou, X.W., Zhang, H.J., Cheng, Q.S.: Learning spatially localized, parts-based representation. In: Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR2001) 1, pp. 207–212 (2001)
Zheng, W.S., Li, S.Z., Lai, J.H., Liao, S.C.: On constrained sparse matrix factorization. In: IEEE 11th International Conference on Computer Vision (ICCV2007), pp. 1–8 (2007)
Bittorf, V., Recht, B., Ré, E., Tropp, J.A.: Factoring nonnegative matrices with linear programs. In: Advances in Neural Information Processing Systems (NIPS ’12), pp. 1223–1231 (2012)
Araújo, U.M.C., Saldanha, B.T.C., Galvao, R.K.H., Yoneyama, T., Chame, H.C., Visani, V.: The successive projections algorithm for variable selection in spectroscopic multicomponent analysis. Chemom. Intell. Lab. Syst. 57(2), 65–73 (2001)
Kumar, A., Sindhwani, V., Kambadur, P.: Fast conical hull algorithms for near-separable non-negative matrix factorization. In Interantional Conferance on Machine Learning (ICML ’13) 28, pp. 231–239 (2013)
Gillis, N., Luce, R.: Robust near-separable nonnegative matrix factorization using linear optimization. J. Mach. Learn. Res. 15, 1249–1280 (2014)
Acknowledgements
The authors would like to thank two anonymous referees for their many valuable and constructive comments and suggestions that have greatly helped improve the quality and presentation of the paper.
Author information
Authors and Affiliations
Corresponding author
Additional information
Bo Yu: Research supported in part by NSFC 11571061. Yin Zhang: Research supported in part by NSF DMS-1115950 and NSF DMS-1418724.
Rights and permissions
About this article
Cite this article
Xu, L., Yu, B. & Zhang, Y. An alternating direction and projection algorithm for structure-enforced matrix factorization. Comput Optim Appl 68, 333–362 (2017). https://doi.org/10.1007/s10589-017-9913-x
Received:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10589-017-9913-x