The Polylingual Labeled Topic Model

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9324)

Abstract

In this paper, we present the Polylingual Labeled Topic Model, which combines the characteristics of the existing Polylingual Topic Model and Labeled LDA. The model accounts for multiple languages with separate topic distributions for each language while restricting the permitted topics of a document to a set of predefined labels. We explore the properties of the model in a two-language setting on a dataset from the social science domain. Our experiments show that our model outperforms LDA and Labeled LDA in terms of held-out perplexity and that it produces semantically coherent topics that human subjects can readily interpret.
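
The two ingredients named in the abstract, language-specific topic distributions and a per-document restriction to predefined labels, can be illustrated with a small generative sketch. This is not the authors' implementation. It assumes, as in the Polylingual Topic Model, that the documents of one tuple (one document per language) share a single topic mixture, and that this mixture is drawn only over the tuple's labels, as in Labeled LDA. The function names, the `phi` layout, the toy vocabulary, and the symmetric prior `alpha` are all hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_tuple(labels, phi, doc_lengths, alpha=1.0, seed=0):
    """Generate one document tuple (one document per language).

    labels      -- topic ids permitted for this tuple (its predefined labels)
    phi         -- phi[lang][topic] maps word -> probability (per language)
    doc_lengths -- maps lang -> number of tokens to generate
    """
    rng = random.Random(seed)
    labels = sorted(labels)
    # Assumption: one topic mixture per tuple, restricted to its labels.
    theta = sample_dirichlet([alpha] * len(labels), rng)
    docs = {}
    for lang, n_tokens in doc_lengths.items():
        tokens = []
        for _ in range(n_tokens):
            # Only permitted topics can ever be assigned to a token.
            topic = rng.choices(labels, weights=theta)[0]
            # The word is drawn from the language-specific topic distribution.
            words = list(phi[lang][topic])
            weights = [phi[lang][topic][w] for w in words]
            tokens.append((topic, rng.choices(words, weights=weights)[0]))
        docs[lang] = tokens
    return theta, docs


# Hypothetical toy setting: three labels, two languages.
toy_phi = {
    "en": {0: {"election": 0.5, "vote": 0.5},
           1: {"school": 1.0},
           2: {"market": 0.7, "trade": 0.3}},
    "de": {0: {"wahl": 0.5, "stimme": 0.5},
           1: {"schule": 1.0},
           2: {"markt": 0.7, "handel": 0.3}},
}
theta, docs = generate_tuple({0, 2}, toy_phi, {"en": 8, "de": 6})
```

Because the tuple above carries only labels 0 and 2, topic 1 never appears in either language's tokens, while the English and German documents draw their words from different distributions for the same topic ids.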

Author information

Corresponding author

Correspondence to Lisa Posch.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Posch, L., Bleier, A., Schaer, P., Strohmaier, M. (2015). The Polylingual Labeled Topic Model. In: Hölldobler, S., , Peñaloza, R., Rudolph, S. (eds) KI 2015: Advances in Artificial Intelligence. KI 2015. Lecture Notes in Computer Science (LNAI), vol 9324. Springer, Cham. https://doi.org/10.1007/978-3-319-24489-1_26

  • DOI: https://doi.org/10.1007/978-3-319-24489-1_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24488-4

  • Online ISBN: 978-3-319-24489-1

  • eBook Packages: Computer Science, Computer Science (R0)
