The Polylingual Labeled Topic Model

Posch, Lisa; Bleier, Arnim; Schaer, Philipp; Strohmaier, Markus

doi:10.1007/978-3-319-24489-1_26

Conference paper
First Online: 03 November 2015

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9324)

Abstract

In this paper, we present the Polylingual Labeled Topic Model, a model that combines the characteristics of the existing Polylingual Topic Model and Labeled LDA. The model accounts for multiple languages with separate topic distributions for each language, while restricting the permitted topics of a document to a set of predefined labels. We explore the properties of the model in a two-language setting on a dataset from the social science domain. Our experiments show that the model outperforms LDA and Labeled LDA in terms of held-out perplexity, and that it produces semantically coherent topics that human subjects can readily interpret.
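
The abstract gives no equations, so the following is only a hypothetical sketch of how a generative process combining the two ideas might look, not the authors' specification. It assumes, in the spirit of the Polylingual Topic Model, that all language versions of a document share one document-topic distribution, and, in the spirit of Labeled LDA, that this distribution is supported only on the document's labels. The names `generate_document`, `phi`, `doc_lengths`, and `alpha`, and all numeric values, are illustrative assumptions.

```python
import numpy as np


def generate_document(labels, phi, doc_lengths, alpha=0.1, rng=None):
    """Sample one multilingual document whose topics are restricted to its labels.

    labels      -- topic ids permitted for this document (assumed given)
    phi         -- dict: language -> (K, V_lang) array of topic-word distributions
    doc_lengths -- dict: language -> number of tokens to draw
    alpha       -- symmetric Dirichlet hyperparameter (illustrative value)
    """
    rng = np.random.default_rng() if rng is None else rng
    labels = np.asarray(labels)

    # Assumption: one topic mixture per document, defined over the permitted
    # labels only and shared by every language version of the document.
    theta = rng.dirichlet(np.full(len(labels), alpha))

    doc = {}
    for lang, n_tokens in doc_lengths.items():
        # Topic assignments can only take values from the document's labels.
        z = rng.choice(labels, size=n_tokens, p=theta)
        # Each word comes from the language-specific distribution of its topic.
        vocab_size = phi[lang].shape[1]
        words = np.array(
            [rng.choice(vocab_size, p=phi[lang][k]) for k in z], dtype=int
        )
        doc[lang] = (z, words)
    return doc


# Toy usage: 4 label-topics, two languages with different vocabulary sizes.
rng = np.random.default_rng(0)
K = 4
phi = {
    "en": rng.dirichlet(np.full(6, 0.5), size=K),  # shape (4, 6)
    "de": rng.dirichlet(np.full(8, 0.5), size=K),  # shape (4, 8)
}
doc = generate_document([1, 3], phi, {"en": 20, "de": 15}, rng=rng)
```

In this sketch the label restriction is enforced simply by drawing the mixture over the label subset, so topics outside the document's labels can never be assigned to a token in any language.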

Author information

Authors and Affiliations

GESIS – Leibniz Institute for the Social Sciences, Cologne, Germany
Lisa Posch, Arnim Bleier, Philipp Schaer & Markus Strohmaier
Institute for Web Science and Technologies, University of Koblenz-Landau, Mainz, Germany
Lisa Posch & Markus Strohmaier

Corresponding author

Correspondence to Lisa Posch.

Editor information

Editors and Affiliations

Technische Universität Dresden, Dresden, Germany
Steffen Hölldobler
Technische Universität Dresden, Dresden, Germany
Free University Bozen-Bolzano, Bozen-Bolzano, Italy
Rafael Peñaloza
Technische Universität Dresden, Dresden, Germany
Sebastian Rudolph

About this paper

Cite this paper

Posch, L., Bleier, A., Schaer, P., Strohmaier, M. (2015). The Polylingual Labeled Topic Model. In: Hölldobler, S., , Peñaloza, R., Rudolph, S. (eds) KI 2015: Advances in Artificial Intelligence. KI 2015. Lecture Notes in Computer Science (LNAI), vol 9324. Springer, Cham. https://doi.org/10.1007/978-3-319-24489-1_26

DOI: https://doi.org/10.1007/978-3-319-24489-1_26
Published: 03 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24488-4
Online ISBN: 978-3-319-24489-1
eBook Packages: Computer Science, Computer Science (R0)