Steering Time-Dependent Estimation of Posteriors with Hyperparameter Indexing in Bayesian Topic Models

  • Tomonari Masada
  • Atsuhiro Takasu
  • Yuichiro Shibata
  • Kiyoshi Oguri
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6634)

Abstract

This paper provides a new approach to topical trend analysis. Our aim is to improve the generalization power of latent Dirichlet allocation (LDA) by using document timestamps. Many previous works model topical trends by making latent topic distributions time-dependent. We instead take a more direct approach and prepare a separate word multinomial distribution for each time point. Since this approach increases the number of parameters, overfitting becomes a critical issue. Our contribution to this issue is two-fold. First, we propose an effective way of defining Dirichlet priors over the word multinomials. Second, we propose a special scheduling of variational Bayesian (VB) inference. Experiments on six datasets show that, in the framework of VB inference, our approach improves on both LDA and Topics over Time, a well-known variant of LDA, in terms of test data perplexity.
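
To make the two central quantities in the abstract concrete, the following is a minimal Python sketch of time-indexed word multinomials and of test data perplexity. It is not the authors' method: the abstract does not specify how the Dirichlet priors are defined or how VB inference is scheduled, so the sketch assumes, purely for illustration, one Dirichlet hyperparameter vector per topic shared across all time points and a simple posterior-mean estimate. The function names and the data layout (`counts`, `prior`, `docs`, `theta`) are hypothetical.

```python
import math


def posterior_mean_word_dists(counts, prior):
    """Estimate one word distribution per (time point, topic) pair.

    counts[t][k][v]: number of times word v was assigned to topic k
                     in documents with timestamp t.
    prior[k][v]:     Dirichlet hyperparameters for topic k. Sharing them
                     across time points is an assumption of this sketch.
    Returns phi[t][k][v], the Dirichlet posterior mean.
    """
    phi = []
    for counts_t in counts:
        phi_t = []
        for counts_tk, prior_k in zip(counts_t, prior):
            total = sum(counts_tk) + sum(prior_k)
            phi_t.append([(c + b) / total for c, b in zip(counts_tk, prior_k)])
        phi.append(phi_t)
    return phi


def perplexity(docs, theta, phi):
    """Compute exp(-(sum of token log-likelihoods) / (number of tokens)).

    docs:        list of (time_index, list_of_word_ids) pairs.
    theta[d][k]: topic proportions of document d.
    Each token is scored with the word distributions of its document's
    time point.
    """
    log_lik = 0.0
    n_tokens = 0
    for (t, words), theta_d in zip(docs, theta):
        for w in words:
            p = sum(th * phi[t][k][w] for k, th in enumerate(theta_d))
            log_lik += math.log(p)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)


# Toy usage: 2 time points, 2 topics, 4 vocabulary words, no observed counts.
# With a symmetric prior every word distribution is uniform, so the
# perplexity equals the vocabulary size (about 4).
toy_counts = [[[0] * 4 for _ in range(2)] for _ in range(2)]
toy_prior = [[1.0] * 4 for _ in range(2)]
toy_phi = posterior_mean_word_dists(toy_counts, toy_prior)
toy_docs = [(0, [0, 1, 2]), (1, [3, 3])]
toy_theta = [[0.5, 0.5], [0.2, 0.8]]
toy_perplexity = perplexity(toy_docs, toy_theta, toy_phi)
```

Lower perplexity on held-out documents indicates better generalization. Because each time point has its own set of word distributions, the parameter count grows with the number of time points, which is why the choice of prior matters for controlling overfitting.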

Keywords

Bayesian methods, topic models, trend analysis, variational inference, parallelization

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Tomonari Masada (1)
  • Atsuhiro Takasu (2)
  • Yuichiro Shibata (1)
  • Kiyoshi Oguri (1)

  1. Nagasaki University, Nagasaki-shi, Japan
  2. National Institute of Informatics, Tokyo, Japan
