Regularized NNLS Algorithms for Nonnegative Matrix Factorization with Application to Text Document Clustering

  • Rafal Zdunek
Part of the Advances in Intelligent and Soft Computing book series (AINSC, volume 95)

Abstract

Nonnegative Matrix Factorization (NMF) has recently received much attention, both for its algorithmic aspects and for its applications. Text document clustering and supervised classification are important applications of NMF. Various types of numerical optimization algorithms have been proposed for NMF, including multiplicative, projected gradient descent, alternating least squares, and active-set algorithms. In this paper, we discuss selected Non-Negatively constrained Least Squares (NNLS) algorithms (a family of variants of the NNLS algorithm proposed by Lawson and Hanson) that belong to the class of active-set methods. We observed that applying the NNLS algorithm to the Tikhonov-regularized LS objective function with an exponentially decreasing regularization parameter considerably increases the accuracy of data clustering and reduces the risk of getting stuck in unfavorable local minima. Moreover, the experiments demonstrate that the regularized NNLS algorithm is superior to many well-known NMF algorithms used for text document clustering.
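
The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the factorization Y ≈ AX is fitted by alternating between the two factors, that each subproblem is a Tikhonov-regularized NNLS problem solved with SciPy's `scipy.optimize.nnls` on an augmented system, and that the regularization parameter follows the schedule `lam0 * exp(-decay * k)`. The function names and the parameters `lam0`, `decay`, and `n_iter` are hypothetical choices made for illustration only.

```python
import numpy as np
from scipy.optimize import nnls


def tikhonov_nnls(A, b, lam):
    """Solve min_x ||A x - b||^2 + lam * ||x||^2 subject to x >= 0.

    The penalty is folded into an ordinary NNLS problem by stacking
    sqrt(lam) * I under A and zeros under b.
    """
    n = A.shape[1]
    A_aug = np.vstack([A, np.sqrt(lam) * np.eye(n)])
    b_aug = np.concatenate([b, np.zeros(n)])
    x, _ = nnls(A_aug, b_aug)
    return x


def regularized_nnls_nmf(Y, rank, n_iter=30, lam0=1.0, decay=0.5, seed=0):
    """Alternating regularized NNLS for Y ~ A @ X with A, X >= 0.

    lam0 and decay define an assumed exponentially decreasing schedule;
    the values used in the paper are not given in this abstract.
    """
    rng = np.random.default_rng(seed)
    m, n = Y.shape
    A = rng.random((m, rank))
    X = rng.random((rank, n))
    for k in range(n_iter):
        lam = lam0 * np.exp(-decay * k)  # strong penalty early, weak later
        # Update X one column at a time with A fixed.
        for j in range(n):
            X[:, j] = tikhonov_nnls(A, Y[:, j], lam)
        # Update A one row at a time with X fixed (Y^T ~ X^T A^T).
        for i in range(m):
            A[i, :] = tikhonov_nnls(X.T, Y[i, :], lam)
    return A, X
```

For clustering, one common convention (again an assumption here) is to treat the columns of Y as documents and assign document j to the cluster `np.argmax(X[:, j])`.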

Keywords

Monte Carlo; Nonnegative Matrix Factorization; Nonnegative Matrix; Document Cluster; Latent Semantic Indexing
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Benthem, M.H.V., Keenan, M.R.: J. Chemometr. 18, 441–450 (2004)
  2. Berry, M., Browne, M., Langville, A., Pauca, P., Plemmons, R.: Comput. Stat. Data An. 52, 155–173 (2007)
  3. Bro, R., Jong, S.D.: J. Chemometr. 11, 393–401 (1997)
  4. Buciu, I., Pitas, I.: Application of non-negative and local nonnegative matrix factorization to facial expression recognition. In: Proc. Intl. Conf. Pattern Recognition (ICPR), pp. 288–291 (2004)
  5. Cai, D., He, X., Wu, X., Bao, H., Han, J.: Locality preserving nonnegative matrix factorization. In: Proc. IJCAI 2009, pp. 1010–1015 (2009)
  6. Cai, D., He, X., Wu, X., Han, J.: Nonnegative matrix factorization on manifold. In: Proc. 8th IEEE Intl. Conf. Data Mining (ICDM), pp. 63–72 (2008)
  7. Cichocki, A., Zdunek, R., Phan, A.H., Amari, S.I.: Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-way Data Analysis and Blind Source Separation. Wiley and Sons, Chichester (2009)
  8. Ding, C., Li, T., Jordan, M.I.: IEEE T. Pattern. Anal. 32, 45–55 (2010)
  9. Ding, C., Li, T., Peng, W.: Nonnegative matrix factorization and probabilistic latent semantic indexing: Equivalence, chi-square statistic, and a hybrid method. In: Proc. AAAI National Conf. Artificial Intelligence (AAAI 2006) (2006)
  10. Ding, C., Li, T., Peng, W., Park, H.: Orthogonal nonnegative matrix tri-factorizations for clustering. In: Proc 12th ACM SIGKDD Intl. Conf. Knowledge Discovery and Data Mining, pp. 126–135. ACM Press, New York (2006)
  11. Du, Q., Kopriva, I.: Neurocomputing 72, 2682–2692 (2009)
  12. Heiler, M., Schnoerr, C.: J. Mach. Learn. Res. 7, 1385–1407 (2006)
  13. Jain, A.K., Murty, M.N., Flynn, P.J.: ACM Comput. Surv. 31, 264–323 (1999)
  14. Jankowiak, M.: Application of nonnegative matrix factorization for text document classification. MSc thesis (supervised by Dr. R. Zdunek), Wroclaw University of Technology, Poland (2010) (in Polish)
  15. Kim, H., Park, H.: Bioinformatics 23, 1495–1502 (2007)
  16. Kim, H., Park, H.: SIAM J. Matrix Anal. A 30, 713–730 (2008)
  17. Lawson, C.L., Hanson, R.J.: Solving Least Squares Problems. Prentice-Hall, Englewood Cliffs (1974)
  18. Lee, D.D., Seung, H.S.: Nature 401, 788–791 (1999)
  19. Li, T., Ding, C.: The relationships among various nonnegative matrix factorization methods for clustering. In: Proc. 6th Intl. Conf. Data Mining (ICDM 2006), pp. 362–371. IEEE Computer Society, Washington DC, USA (2006)
  20. O’Grady, P., Pearlmutter, B.: Neurocomputing 72, 88–101 (2008)
  21. Sajda, P., Du, S., Brown, T.R., Stoyanova, R., Shungu, D.C., Mao, X., Parra, L.C.: IEEE T. Med. Imaging 23, 1453–1465 (2004)
  22. Shahnaz, F., Berry, M., Pauca, P., Plemmons, R.: Inform. Process. Manag. 42, 373–386 (2006)
  23. Sra, S., Dhillon, I.S.: Nonnegative matrix approximation: Algorithms and Applications. UTCS Technical Report TR-06-27, Austin, USA (2006), http://www.cs.utexas.edu/ftp/pub/techreports/tr06-27.pdf
  24. Xu, W., Liu, X., Gong, Y.: Document clustering based on non-negative matrix factorization. In: SIGIR 2003: Proc 26th Annual Intl ACM SIGIR Conf. Research and Development in Information Retrieval, pp. 267–273. ACM Press, New York (2003)
  25. Zdunek, R., Cichocki, A.: Comput. Intel. Neurosci. (939567) (2008)
  26. Zdunek, R., Phan, A.H., Cichocki, A.: Aust. J. Intel. Inform. Process. Syst. 12, 16–22 (2010)

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Rafal Zdunek (1)
  1. Institute of Telecommunications, Teleinformatics and Acoustics, Wroclaw University of Technology, Wroclaw, Poland
