
Joint European Conference on Machine Learning and Knowledge Discovery in Databases

ECML PKDD 2015: Machine Learning and Knowledge Discovery in Databases, pp 53–70

Scalable Bayesian Non-negative Tensor Factorization for Massive Count Data

  • Changwei Hu,
  • Piyush Rai,
  • Changyou Chen,
  • Matthew Harding &
  • Lawrence Carin
  • Conference paper
  • First Online: 01 January 2015

Part of the Lecture Notes in Computer Science book series (LNAI, volume 9285)

Abstract

We present a Bayesian non-negative tensor factorization model for count-valued tensor data, and develop scalable inference algorithms (both batch and online) for dealing with massive tensors. Our generative model can handle overdispersed counts as well as infer the rank of the decomposition. Moreover, reparameterizing the Poisson distribution as a multinomial yields conjugacy in the model and enables simple and efficient Gibbs sampling and variational Bayes (VB) inference updates, with a computational cost that depends only on the number of nonzeros in the tensor. The factors are also readily interpretable: in our model, each factor corresponds to a “topic”. We develop a set of online inference algorithms that scale the model further to massive tensors, for which batch inference methods may be infeasible. We apply our framework to diverse real-world applications, such as multiway topic modeling on a scientific publications database, analyzing a political science data set, and analyzing a massive household transactions data set.
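
The Poisson-to-multinomial step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a CP-style rate, where the rate of an entry is a sum over R components of a weight times one factor value per mode, and it relies on the standard fact that independent Poisson counts, conditioned on their sum, follow a multinomial. The function name `allocate_latent_counts` and all variable names are hypothetical. Only the nonzero entries are visited, which is where the cost statement in the abstract comes from.

```python
import numpy as np

def allocate_latent_counts(indices, counts, factors, weights, rng):
    """Split each nonzero count across R components with a multinomial draw.

    indices: (N, K) int array, one row of mode indices per nonzero entry
    counts:  (N,) int array holding the nonzero counts
    factors: list of K arrays; factors[k] has shape (I_k, R), non-negative
    weights: (R,) non-negative per-component weights
    """
    # Unnormalized rate of component r for each nonzero entry:
    # weights[r] * prod_k factors[k][i_k, r]  (assumed CP-style form)
    rates = np.tile(np.asarray(weights, dtype=float), (len(counts), 1))
    for k, U in enumerate(factors):
        rates *= U[indices[:, k], :]
    probs = rates / rates.sum(axis=1, keepdims=True)
    # One multinomial draw per nonzero; zero entries need no allocation
    return np.stack([rng.multinomial(n, p) for n, p in zip(counts, probs)])

# Toy 3-way tensor with three nonzero entries (illustrative values only)
rng = np.random.default_rng(0)
R = 3
factors = [rng.gamma(1.0, 1.0, size=(dim, R)) for dim in (4, 5, 6)]
weights = rng.gamma(1.0, 1.0, size=R)
indices = np.array([[0, 1, 2], [3, 4, 5], [1, 0, 0]])
counts = np.array([7, 2, 11])
latent = allocate_latent_counts(indices, counts, factors, weights, rng)
```

Each row of `latent` sums to the corresponding observed count. In a conjugate model, per-component counts of this kind are what the Gibbs or VB updates for the factors would consume.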

Keywords

  • Tensor factorization
  • Bayesian learning
  • Latent factor models
  • Count data
  • Online Bayesian inference



Author information

Authors and Affiliations

  1. Department of Electrical and Computer Engineering, Duke University, Durham, USA

    Changwei Hu, Piyush Rai, Changyou Chen & Lawrence Carin

  2. Sanford School of Public Policy and Department of Economics, Duke University, Durham, USA

    Matthew Harding

Corresponding author

Correspondence to Piyush Rai.

Editor information

Editors and Affiliations

  1. University of Bari Aldo Moro, Bari, Italy

    Annalisa Appice

  2. University of Porto, Porto, Portugal

    Pedro Pereira Rodrigues

  3. Universidade do Porto, Porto, Portugal

    Vítor Santos Costa

  4. University of Porto - INESC TEC, Porto, Portugal

    João Gama

  5. University of Porto - INESC TEC, Porto, Portugal

    Alípio Jorge

  6. University of Porto - INESC TEC, Porto, Portugal

    Carlos Soares

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Hu, C., Rai, P., Chen, C., Harding, M., Carin, L. (2015). Scalable Bayesian Non-negative Tensor Factorization for Massive Count Data. In: Appice, A., Rodrigues, P., Santos Costa, V., Gama, J., Jorge, A., Soares, C. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2015. Lecture Notes in Computer Science, vol 9285. Springer, Cham. https://doi.org/10.1007/978-3-319-23525-7_4

  • DOI: https://doi.org/10.1007/978-3-319-23525-7_4

  • Published: 29 August 2015

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23524-0

  • Online ISBN: 978-3-319-23525-7

  • eBook Packages: Computer Science, Computer Science (R0)
