Nonnegative Matrix Factorization: Models, Algorithms and Applications

  • Zhong-Yuan Zhang
Part of the Intelligent Systems Reference Library book series (ISRL, volume 24)

Abstract

In recent years, Nonnegative Matrix Factorization (NMF) has become a popular model in the data mining community. NMF aims to extract hidden patterns automatically from a collection of high-dimensional vectors, and it has been applied successfully to dimensionality reduction, unsupervised learning (clustering, semi-supervised clustering, co-clustering, etc.) and prediction. This chapter surveys NMF in terms of its model formulation, variations and extensions, algorithms, and applications, as well as its relations to K-means and Probabilistic Latent Semantic Indexing (PLSI). In summary, we draw the following conclusions: 1) NMF is highly interpretable because of its nonnegativity constraints; 2) NMF is very flexible in the choice of objective function and of the algorithm used to solve it; 3) NMF has a variety of applications; 4) NMF has a solid theoretical foundation and a close relationship with existing state-of-the-art unsupervised learning models. However, as a new and developing technology, NMF still has many interesting open issues awaiting research from both theoretical and algorithmic perspectives.
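
As a rough illustration of the model summarized above, the following is a minimal sketch that approximates a nonnegative matrix X by a product W·H of two nonnegative factors. It assumes the squared Frobenius-norm objective and multiplicative updates in the style of Lee and Seung, which is only one of the many objective and algorithm choices the chapter surveys. The function names (`matmul`, `transpose`, `frobenius_error`, `nmf`), the toy matrix, and the small constant `eps` are illustrative assumptions, not taken from the chapter.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(x, w, h):
    """Frobenius norm of the residual X - W H."""
    wh = matmul(w, h)
    return sum(
        (x_ij - wh_ij) ** 2
        for x_row, wh_row in zip(x, wh)
        for x_ij, wh_ij in zip(x_row, wh_row)
    ) ** 0.5


def nmf(x, rank, n_iter=200, seed=0, eps=1e-9):
    """Sketch of NMF with multiplicative updates (assumed Frobenius objective).

    x is an n-by-m nonnegative matrix; returns W (n-by-rank) and H (rank-by-m).
    eps is an assumed safeguard against division by zero.
    """
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    # Random nonnegative initialization.
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H), elementwise
        wt = transpose(w)
        num = matmul(wt, x)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (X H^T) / (W H H^T), elementwise
        ht = transpose(h)
        num = matmul(x, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Toy data: every row is a nonnegative mix of the patterns [1, 0, 2] and [0, 3, 1].
x = [[1.0, 0.0, 2.0],
     [0.0, 3.0, 1.0],
     [1.0, 3.0, 3.0],
     [2.0, 3.0, 5.0]]
w, h = nmf(x, rank=2)
print("reconstruction error:", frobenius_error(x, w, h))
```

Because every update multiplies a nonnegative entry by a nonnegative ratio, both factors stay nonnegative throughout. That constraint is the source of the interpretability noted in conclusion 1: the rows of H can be read as additive patterns and the rows of W as their weights.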

Keywords

Nonnegative Matrix Factorization; Nonnegative Matrix; Positive Matrix Factorization; Bregman Divergence; Probabilistic Latent Semantic Indexing
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Zhong-Yuan Zhang (1)
  1. School of Statistics, Central University of Finance and Economics, P.R. China
