Tutorial on Probabilistic Topic Modeling: Additive Regularization for Stochastic Matrix Factorization

  • Konstantin Vorontsov
  • Anna Potapenko
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 436)


Probabilistic topic modeling of text collections is a powerful tool for statistical text analysis. In this tutorial we introduce a novel non-Bayesian approach called Additive Regularization of Topic Models (ARTM). ARTM is free of redundant probabilistic assumptions and provides simple inference for many combined and multi-objective topic models.


Probabilistic topic modeling; Regularization of ill-posed inverse problems; Stochastic matrix factorization; Probabilistic latent semantic analysis; Latent Dirichlet Allocation; EM-algorithm



The work was supported by the Russian Foundation for Basic Research grants 14-07-00847, 14-07-00908. We thank Alexander Frey for his help and valuable discussion, and Vitaly Glushachenkov for his experimental work on model data.



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. The Higher School of Economics, Dorodnicyn Computing Centre of RAS, Moscow Institute of Physics and Technology, Moscow, Russia
  2. Dorodnicyn Computing Centre of RAS, Moscow State University, Moscow, Russia
