Estimating Word Probabilities with Neural Networks in Latent Dirichlet Allocation

  • Tomonari Masada
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10526)

Abstract

This paper proposes a new method for estimating the word probabilities in latent Dirichlet allocation (LDA). LDA uses a Dirichlet distribution as the prior for the per-document topic discrete distributions. While another Dirichlet prior can be introduced for the per-topic word discrete distributions, point estimation may lead to better evaluation results, e.g. in terms of test perplexity. This paper therefore proposes a method for point estimation of the per-topic word probabilities in LDA using a multilayer perceptron (MLP). The point estimation is performed in an online manner by mini-batch gradient ascent. We compared our method to a baseline using a perceptron with no hidden layers and to collapsed Gibbs sampling (CGS). The evaluation experiment showed that our method could not improve on the test perplexity of CGS in almost all cases. However, there were cases where our method achieved a better perplexity than the baseline. We also discuss the use of our method for word embedding.
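
The abstract does not give the architecture in detail, so the following is a minimal NumPy sketch of the idea as stated: a one-hidden-layer MLP maps each topic to a softmax distribution over the vocabulary, and its weights are point-estimated by mini-batch gradient ascent on the topic-word log-likelihood. All sizes (V, K, H), the learning rate, the ReLU activation, and the way topic-word counts are gathered per mini-batch are our assumptions, not the paper's.

    import numpy as np

    # Hypothetical sizes: V = vocabulary, K = topics, H = hidden units.
    V, K, H, lr = 5000, 50, 200, 0.1
    rng = np.random.default_rng(0)

    W1 = rng.normal(scale=0.01, size=(K, H))  # first layer (one-hot topic input)
    W2 = rng.normal(scale=0.01, size=(H, V))  # hidden layer -> word logits

    def topic_word_probs(W1, W2):
        """Point estimate of the K x V topic-word matrix phi via the MLP."""
        hidden = np.maximum(W1, 0.0)                 # ReLU hidden layer
        logits = hidden @ W2
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        return p / p.sum(axis=1, keepdims=True), hidden

    def ascent_step(W1, W2, counts):
        """One mini-batch gradient-ascent step on sum_{k,w} counts[k,w] log phi[k,w].

        counts (K x V) are topic-word counts collected from one mini-batch
        of documents, e.g. expected counts from the topic-assignment step.
        """
        phi, hidden = topic_word_probs(W1, W2)
        n_k = counts.sum(axis=1, keepdims=True)
        d_logits = counts - n_k * phi            # softmax log-likelihood gradient
        dW2 = hidden.T @ d_logits
        dW1 = (d_logits @ W2.T) * (W1 > 0.0)     # backprop through the ReLU
        return W1 + lr * dW1, W2 + lr * dW2

    def test_perplexity(doc_word_counts, theta, phi):
        """Perplexity on held-out counts, with p(w|d) = sum_k theta[d,k] phi[k,w]."""
        log_p = np.log(theta @ phi)              # (D, V) per-document log-probs
        log_lik = (doc_word_counts * log_p).sum()
        return np.exp(-log_lik / doc_word_counts.sum())

    # Toy usage: random counts stand in for one mini-batch's statistics.
    counts = rng.poisson(1.0, size=(K, V)).astype(float)
    W1, W2 = ascent_step(W1, W2, counts)

Interleaving ascent_step with a topic-assignment step over successive mini-batches gives the online training loop described above, and test_perplexity corresponds to the evaluation metric named in the abstract. Dropping the hidden layer (using the first-layer weights directly as logits) roughly corresponds to the baseline perceptron with no hidden layers.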

Acknowledgment

This work was supported by JSPS KAKENHI Grant-in-Aid for Scientific Research (C) JP26330256.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Nagasaki University, Nagasaki, Japan
