Accelerating Collapsed Variational Bayesian Inference for Latent Dirichlet Allocation with Nvidia CUDA Compatible Devices

  • Tomonari Masada
  • Tsuyoshi Hamada
  • Yuichiro Shibata
  • Kiyoshi Oguri
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5579)

Abstract

In this paper, we propose an acceleration of collapsed variational Bayesian (CVB) inference for latent Dirichlet allocation (LDA) using Nvidia CUDA-compatible devices. Although LDA is an efficient Bayesian multi-topic document model, its parameter estimation requires more complicated computations than that of simpler document models such as probabilistic latent semantic indexing. Therefore, we accelerate CVB inference, an efficient deterministic inference method for LDA, with Nvidia CUDA. In the evaluation experiments, we used a set of 50,000 documents and a set of 10,000 images. We obtained inference results comparable to those of sequential CVB inference.
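For readers unfamiliar with CVB inference, the sequential per-token update can be sketched as follows. This is only an illustration under stated assumptions: it implements the zeroth-order simplification often called CVB0, not the second-order update of the original CVB algorithm and not the CUDA implementation proposed in the paper. The function name `cvb0_lda`, its parameters, and the hyperparameter defaults are hypothetical choices made for this sketch.

```python
import random


def cvb0_lda(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Sequential CVB0 sketch for LDA (an assumed simplification of CVB).

    docs is a list of documents, each a list of integer word ids.
    Returns (gamma, n_dk, n_kw): per-token topic responsibilities and the
    expected document-topic and topic-word counts.
    """
    rng = random.Random(seed)
    K = num_topics

    # gamma[d][i][k]: variational probability that token i of doc d has topic k.
    gamma = []
    for doc in docs:
        rows = []
        for _ in doc:
            r = [rng.random() + 1e-3 for _ in range(K)]
            s = sum(r)
            rows.append([x / s for x in r])
        gamma.append(rows)

    # Expected counts implied by the current responsibilities.
    n_dk = [[0.0] * K for _ in docs]
    n_kw = [[0.0] * vocab_size for _ in range(K)]
    n_k = [0.0] * K
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            for k in range(K):
                g = gamma[d][i][k]
                n_dk[d][k] += g
                n_kw[k][w] += g
                n_k[k] += g

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                old = gamma[d][i]
                new = []
                for k in range(K):
                    # Counts with this token's own contribution removed.
                    a = n_dk[d][k] - old[k]
                    b = n_kw[k][w] - old[k]
                    c = n_k[k] - old[k]
                    new.append((a + alpha) * (b + beta) / (c + vocab_size * beta))
                s = sum(new)
                new = [x / s for x in new]
                # Fold the change back into the expected counts.
                for k in range(K):
                    delta = new[k] - old[k]
                    n_dk[d][k] += delta
                    n_kw[k][w] += delta
                    n_k[k] += delta
                gamma[d][i] = new

    return gamma, n_dk, n_kw
```

Each token's update reads and writes the shared count arrays. In this sketch that is why the loop is sequential, and it suggests the kind of dependency any parallel version has to handle.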

Keywords

  • Shared Memory
  • Device Memory
  • Local Memory
  • Latent Dirichlet Allocation
  • Neural Information Processing System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Tomonari Masada (1)
  • Tsuyoshi Hamada (1)
  • Yuichiro Shibata (1)
  • Kiyoshi Oguri (1)

  1. Nagasaki University, Nagasaki, Japan