A non-negative matrix factorization model based on the zero-inflated Tweedie distribution

  • Original Paper
Computational Statistics

Abstract

Non-negative matrix factorization (NMF) is a multivariate analysis technique that approximates a given non-negative data matrix by the product of two non-negative factor matrices, and it has been applied in a number of fields. However, when the data matrix contains many zeros, NMF has difficulty approximating it. This zero-inflated situation often occurs when the data matrix consists of count data, and it becomes more challenging as the matrix grows in size. To solve this problem, we propose a new NMF model for zero-inflated non-negative matrices, based on the zero-inflated Tweedie distribution. The Tweedie distribution is a generalization of the normal, the Poisson, and the gamma distributions, and it differs from each of these distributions in the degree of robustness of its estimated parameters. In this paper, we show through numerical examples that the proposed model approximates zero-inflated data better than the basic NMF model. Furthermore, by applying both models to real purchasing data, we show how the basis vectors estimated by the basic and the proposed NMF models differ under the \(\beta \)-divergence.

Notes

  1. This was ordered by the Joint Association Study Group of Management Science for a data analysis competition in 2014.

Acknowledgments

We wish to express our appreciation to the editor and referees for their insightful comments, which have helped us significantly improve the paper. We are also grateful to the Joint Association Study Group of Management Science for providing the data used in our application.

Author information

Corresponding author

Correspondence to Hiroyasu Abe.

Appendix: Proof of (25)

\( {\text {Case of } \beta < 1}\)

The partial derivative of the auxiliary function (17) with respect to \(f_{im}\) is given by

$$\begin{aligned} \frac{\partial Q_{\text {aux}}(\varvec{F},\varvec{A})}{\partial f_{im}} = \textstyle \sum _{j}z_{ij}^{*}\eta _{ij}^{\beta -1}a_{jm} - \sum _{j}z_{ij}^{*}y_{ij}f_{im}^{\beta -2}\lambda _{ijm}^{2-\beta }a_{jm}^{\beta -1}. \end{aligned}$$
(30)

Setting (30) to zero and solving for \(f_{im}\) gives the update equation:

$$\begin{aligned}&\textstyle \sum _{j}z_{ij}^{*}\eta _{ij}^{\beta -1}a_{jm} - \sum _{j}z_{ij}^{*}y_{ij}f_{im}^{\beta -2}\lambda _{ijm}^{2-\beta }a_{jm}^{\beta -1} = 0 \nonumber \\ \Longleftrightarrow&f_{im}^{\beta -2}\textstyle \sum _{j}z_{ij}^{*}y_{ij}\lambda _{ijm}^{2-\beta }a_{jm}^{\beta -1} = \sum _{j}z_{ij}^{*}\eta _{ij}^{\beta -1}a_{jm} \nonumber \\ \Longleftrightarrow&f_{im}^{2-\beta } = \frac{\sum _{j}z_{ij}^{*}y_{ij}\lambda _{ijm}^{2-\beta }a_{jm}^{\beta -1}}{\sum _{j}z_{ij}^{*}\eta _{ij}^{\beta -1}a_{jm}} \nonumber \\ \Longleftrightarrow&f_{im} = \left\{ \frac{\sum _{j}z_{ij}^{*}y_{ij}\lambda _{ijm}^{2-\beta }a_{jm}^{\beta -1}}{\sum _{j}z_{ij}^{*}\eta _{ij}^{\beta -1}a_{jm}} \right\} ^{\frac{1}{2-\beta }} \end{aligned}$$
(31)

\( {\text {Case of } 1 \le \beta \le 2}\)

The partial derivative of the auxiliary function (17) with respect to \(f_{im}\) is given by

$$\begin{aligned} \frac{\partial Q_{\text {aux}}(\varvec{F},\varvec{A})}{\partial f_{im}} = \textstyle \sum _{j}z_{ij}^{*}\lambda _{ijm}^{1-\beta }f_{im}^{\beta -1}a_{jm}^{\beta } - \sum _{j}z_{ij}^{*}y_{ij}\lambda _{ijm}^{2-\beta }f_{im}^{\beta -2}a_{jm}^{\beta -1}. \end{aligned}$$
(32)

Setting (32) to zero and solving for \(f_{im}\) gives the update equation:

$$\begin{aligned}&\textstyle \sum _{j}z_{ij}^{*}\lambda _{ijm}^{1-\beta }f_{im}^{\beta -1}a_{jm}^{\beta } - \sum _{j}z_{ij}^{*}y_{ij}\lambda _{ijm}^{2-\beta }f_{im}^{\beta -2}a_{jm}^{\beta -1} = 0 \nonumber \\ \Longleftrightarrow&f_{im}^{\beta -1}\textstyle \sum _{j}z_{ij}^{*}\lambda _{ijm}^{1-\beta }a_{jm}^{\beta } = f_{im}^{\beta -2}\sum _{j}z_{ij}^{*}y_{ij}\lambda _{ijm}^{2-\beta }a_{jm}^{\beta -1} \nonumber \\ \Longleftrightarrow&f_{im}\textstyle \sum _{j}z_{ij}^{*}\lambda _{ijm}^{1-\beta }a_{jm}^{\beta } = \sum _{j}z_{ij}^{*}y_{ij}\lambda _{ijm}^{2-\beta }a_{jm}^{\beta -1} \nonumber \\ \Longleftrightarrow&f_{im} = \frac{\sum _{j}z_{ij}^{*}y_{ij}\lambda _{ijm}^{2-\beta }a_{jm}^{\beta -1}}{\textstyle \sum _{j}z_{ij}^{*}\lambda _{ijm}^{1-\beta }a_{jm}^{\beta }} \end{aligned}$$
(33)

\( {\text {Case of }\beta > 2}\)

The partial derivative of the auxiliary function (17) with respect to \(f_{im}\) is given by

$$\begin{aligned} \frac{\partial Q_{\text {aux}}(\varvec{F},\varvec{A})}{\partial f_{im}} = \textstyle \sum _{j}z_{ij}^{*}\lambda _{ijm}^{1-\beta }f_{im}^{\beta -1}a_{jm}^{\beta } - \sum _{j}z_{ij}^{*}y_{ij}\eta _{ij}^{\beta -2}a_{jm}. \end{aligned}$$
(34)

Setting (34) to zero and solving for \(f_{im}\) gives the update equation:

$$\begin{aligned}&\textstyle \sum _{j}z_{ij}^{*}\lambda _{ijm}^{1-\beta }f_{im}^{\beta -1}a_{jm}^{\beta } - \sum _{j}z_{ij}^{*}y_{ij}\eta _{ij}^{\beta -2}a_{jm} = 0 \nonumber \\ \Longleftrightarrow&f_{im}^{\beta -1}\textstyle \sum _{j}z_{ij}^{*}\lambda _{ijm}^{1-\beta }a_{jm}^{\beta } = \sum _{j}z_{ij}^{*}y_{ij}\eta _{ij}^{\beta -2}a_{jm} \nonumber \\ \Longleftrightarrow&f_{im}^{\beta -1} = \frac{\sum _{j}z_{ij}^{*}y_{ij}\eta _{ij}^{\beta -2}a_{jm}}{\textstyle \sum _{j}z_{ij}^{*}\lambda _{ijm}^{1-\beta }a_{jm}^{\beta }} \nonumber \\ \Longleftrightarrow&f_{im} = \left\{ \frac{\sum _{j}z_{ij}^{*}y_{ij}\eta _{ij}^{\beta -2}a_{jm}}{\textstyle \sum _{j}z_{ij}^{*}\lambda _{ijm}^{1-\beta }a_{jm}^{\beta }}\right\} ^{\frac{1}{\beta -1}} \end{aligned}$$
(35)

We obtain (25) by replacing \(f_{im}\), \(a_{jm}\), \(\lambda _{ijm}\), and \(\eta _{ij}\) with \(f_{im}^{(t)}\), \(a_{jm}^{(t-1)}\), \(\lambda _{ijm}^{(f)}\), and \(\eta _{ij}^{(f)}\), respectively, in the update equations (31), (33), and (35). \(\blacksquare \)
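As an illustration only, the three cases can be collected into a single multiplicative update. The sketch below assumes the usual auxiliary-variable choice \(\eta _{ij} = \sum _{m} f'_{im}a_{jm}\) and \(\lambda _{ijm} = f'_{im}a_{jm}/\eta _{ij}\), where \(f'_{im}\) is the previous iterate; these definitions are not shown in this appendix, so they are an assumption here. Under that assumption, \(\lambda _{ijm}^{2-\beta }a_{jm}^{\beta -1} = f_{im}'^{\,2-\beta }\eta _{ij}^{\beta -2}a_{jm}\) and \(\lambda _{ijm}^{1-\beta }a_{jm}^{\beta } = f_{im}'^{\,1-\beta }\eta _{ij}^{\beta -1}a_{jm}\), so each update reduces to \(f'_{im}\) times a common ratio raised to the exponent \(1/(2-\beta )\), \(1\), or \(1/(\beta -1)\). The function name `update_F`, the guard `eps`, and the weight matrix `Z` (a placeholder for \(z_{ij}^{*}\), whose computation is not covered here) are hypothetical.

```python
import numpy as np

def update_F(Y, Z, F_prev, A, beta, eps=1e-12):
    """One weighted multiplicative update of F for Y ~ F @ A.T (a sketch).

    Y: (I, J) non-negative data; Z: (I, J) weights standing in for z*_ij;
    F_prev: (I, M) and A: (J, M) are the current non-negative factors.
    """
    eta = F_prev @ A.T + eps                   # eta_ij = sum_m f_im a_jm (eps avoids 0 ** negative)
    num = (Z * Y * eta ** (beta - 2.0)) @ A    # sum_j z_ij y_ij eta_ij^(beta-2) a_jm
    den = (Z * eta ** (beta - 1.0)) @ A + eps  # sum_j z_ij eta_ij^(beta-1) a_jm
    if beta < 1.0:
        gamma = 1.0 / (2.0 - beta)             # exponent from the case beta < 1
    elif beta <= 2.0:
        gamma = 1.0                            # exponent from the case 1 <= beta <= 2
    else:
        gamma = 1.0 / (beta - 1.0)             # exponent from the case beta > 2
    return F_prev * (num / den) ** gamma

# Toy zero-inflated count matrix (synthetic; not the paper's data).
rng = np.random.default_rng(0)
F_true = rng.gamma(2.0, 1.0, size=(30, 3))
A_true = rng.gamma(2.0, 1.0, size=(20, 3))
Y = rng.poisson(F_true @ A_true.T) * (rng.random((30, 20)) > 0.5)
Z = np.where(Y > 0, 1.0, 0.3)                  # hypothetical placeholder weights
F = rng.random((30, 3)) + 0.1
F_new = update_F(Y, Z, F, A_true, beta=1.0)
```

Because the update multiplies the previous iterate by a non-negative factor, non-negativity of \(\varvec{F}\) is preserved without any projection step. The update for \(\varvec{A}\) would follow by symmetry, although it is not derived in this appendix.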

Cite this article

Abe, H., Yadohisa, H. A non-negative matrix factorization model based on the zero-inflated Tweedie distribution. Comput Stat 32, 475–499 (2017). https://doi.org/10.1007/s00180-016-0689-8

  • DOI: https://doi.org/10.1007/s00180-016-0689-8
