Kullback-Leibler Divergence for Nonnegative Matrix Factorization

  • Zhirong Yang
  • He Zhang
  • Zhijian Yuan
  • Erkki Oja
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6791)


The I-divergence or unnormalized generalization of Kullback-Leibler (KL) divergence is commonly used in Nonnegative Matrix Factorization (NMF). This divergence has the drawback that its gradients with respect to the factorizing matrices depend heavily on the scales of the matrices, and learning the scales in gradient-descent optimization may require many iterations. This is often handled by explicit normalization of one of the matrices, but this step may actually increase the I-divergence and is not included in the NMF monotonicity proof. A simple remedy that we study here is to normalize the input data. Such normalization allows the replacement of the I-divergence with the original KL-divergence for NMF and its variants. We show that using KL-divergence takes the normalization structure into account in a very natural way and brings improvements for nonnegative matrix factorizations: the gradients of the normalized KL-divergence are well-scaled and thus lead to a new projected gradient method for NMF which runs faster or yields better approximation than three other widely used NMF algorithms.
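The scale dependence described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: the helper names (`i_divergence`, `kl_divergence`, `normalize`, `matmul`) and the toy matrices are assumptions chosen for illustration, and the normalized form used here (dividing both the data and the approximation by their total sums) is one plausible reading of "normalized KL-divergence". The sketch only shows that the I-divergence changes when the approximation is rescaled, whereas the normalized KL-divergence does not, and that the two coincide once both matrices sum to one.

```python
import math

def i_divergence(x, y):
    """I-divergence (unnormalized KL): sum of x*log(x/y) - x + y over all entries."""
    total = 0.0
    for row_x, row_y in zip(x, y):
        for a, b in zip(row_x, row_y):
            if a > 0:  # convention: 0 * log(0 / b) = 0
                total += a * math.log(a / b)
            total += b - a
    return total

def normalize(m):
    """Divide a nonnegative matrix by the sum of its entries (assumed > 0)."""
    s = sum(sum(row) for row in m)
    return [[v / s for v in row] for row in m]

def kl_divergence(x, y):
    """KL divergence after normalizing both matrices to sum to one (an assumption)."""
    p, q = normalize(x), normalize(y)
    return sum(
        a * math.log(a / b)
        for row_p, row_q in zip(p, q)
        for a, b in zip(row_p, row_q)
        if a > 0
    )

def matmul(w, h):
    """Plain-Python matrix product of w (m x r) and h (r x n)."""
    return [
        [sum(w[i][k] * h[k][j] for k in range(len(h))) for j in range(len(h[0]))]
        for i in range(len(w))
    ]

# Toy data: a 2 x 2 input and a rank-1 approximation W H (hypothetical values).
x = [[1.0, 2.0], [3.0, 4.0]]
w = [[0.5], [1.5]]
h = [[1.0, 1.5]]
approx = matmul(w, h)
scaled = [[10.0 * v for v in row] for row in approx]  # same shape, wrong scale

print("I-divergence, original scale:", i_divergence(x, approx))
print("I-divergence, rescaled:      ", i_divergence(x, scaled))
print("KL-divergence, original:     ", kl_divergence(x, approx))
print("KL-divergence, rescaled:     ", kl_divergence(x, scaled))
```

Because the normalized KL-divergence ignores the overall scale of the approximation, an optimizer does not have to spend iterations learning that scale, which is the intuition the abstract gives for the better-conditioned gradients.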


Nonnegative Matrix Factorization; Nonnegative Matrix; Positive Matrix Factorization; Projected Gradient Algorithm; Input Data Matrix
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Zhirong Yang (1)
  • He Zhang (1)
  • Zhijian Yuan (1)
  • Erkki Oja (1)
  1. Department of Information and Computer Science, Aalto University, Espoo, Finland
