Parallel Latent Dirichlet Allocation on GPUs

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10861)

Abstract

Latent Dirichlet Allocation (LDA) is a statistical technique for topic modeling. Since it is very computationally demanding, its parallelization has garnered considerable interest. In this paper, we systematically analyze the data access patterns for LDA and devise suitable algorithmic adaptations and parallelization strategies for GPUs. Experiments on large-scale datasets show the effectiveness of the new parallel implementation on GPUs.
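
The abstract omits the algorithmic details, so as background the following minimal NumPy sketch of the standard collapsed Gibbs sampler for LDA illustrates the per-token counter updates (document-topic, topic-word, and topic totals) whose data access patterns any GPU parallelization has to reorganize. This is an illustrative serial sketch, not the paper's GPU implementation; the function name lda_gibbs, the corpus format (lists of word ids), and the hyperparameter defaults are assumptions for the example.

```python
# Minimal collapsed Gibbs sampler for LDA (illustrative sketch, not the
# authors' GPU implementation). docs is a list of documents, each a list
# of integer word ids in [0, n_vocab).
import numpy as np

def lda_gibbs(docs, n_topics, n_vocab, alpha=0.1, beta=0.01, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Counters shared across all tokens: these are the data structures whose
    # concurrent reads/writes dominate the access pattern of the sampler.
    ndk = np.zeros((len(docs), n_topics))   # n_{d,k}: topic counts per document
    nkw = np.zeros((n_topics, n_vocab))     # n_{k,w}: word counts per topic
    nk = np.zeros(n_topics)                 # n_k: total tokens per topic

    # Random initial topic assignment for every token.
    z = [rng.integers(n_topics, size=len(doc)) for doc in docs]
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                # Remove the current assignment before resampling.
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # Collapsed conditional:
                # p(z=k) ∝ (n_{d,k} + α) (n_{k,w} + β) / (n_k + V β)
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + n_vocab * beta)
                k = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = k
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    return ndk, nkw, z
```

For example, lda_gibbs([[0, 1, 2], [2, 3]], n_topics=2, n_vocab=4) runs the sampler on a toy two-document corpus. The serial dependency between successive token updates to the shared counters visible here is what parallelization strategies, on GPUs in particular, must break up or approximate.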

Keywords

Parallel topic modeling · Parallel Latent Dirichlet Allocation · Parallel machine learning


Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

The Ohio State University, Columbus, USA