Multi-dimensional Analysis of Political Documents

  • Heiner Stuckenschmidt
  • Cäcilia Zirn
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7337)


Automatic content analysis is increasingly accepted as a research method in social science. In political science, researchers use party manifestos and transcripts of political speeches to analyze the positions of different actors. Existing approaches are limited to a single dimension; in particular, they cannot distinguish between positions with respect to specific topics. In this paper, we propose a method for analyzing and comparing documents according to a set of predefined topics that is based on an extension of Latent Dirichlet Allocation for inducing knowledge about relevant topics. We validate the method by showing that it can reliably guess which member of a coalition was assigned a certain ministry, based on a comparison of the parties’ election manifestos with the coalition contract.
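The validation idea in the abstract, comparing each party's manifesto with the coalition contract on one topic and guessing the ministry holder from the closest match, can be illustrated with a minimal sketch. It is not the method of the paper: it assumes the topic-specific text has already been extracted (the step the paper addresses with its extension of Latent Dirichlet Allocation), and it uses Jensen-Shannon divergence over smoothed unigram distributions as a stand-in distance measure. The party names, the example texts, and the helper names `word_distribution`, `js_divergence`, and `closest_party` are hypothetical.

```python
import math
from collections import Counter


def word_distribution(text, vocabulary, smoothing=1.0):
    """Smoothed unigram distribution of `text` over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocabulary) + smoothing * len(vocabulary)
    return {w: (counts[w] + smoothing) / total for w in vocabulary}


def kl_divergence(p, q):
    """Kullback-Leibler divergence between two distributions on the same keys."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def js_divergence(p, q):
    """Jensen-Shannon divergence (an assumed choice of distance measure)."""
    m = {w: 0.5 * (p[w] + q[w]) for w in p}
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def closest_party(contract_text, manifesto_texts):
    """Return the party whose topic text is closest to the contract's topic text."""
    vocabulary = set(contract_text.lower().split())
    for text in manifesto_texts.values():
        vocabulary.update(text.lower().split())
    vocabulary = sorted(vocabulary)

    contract_dist = word_distribution(contract_text, vocabulary)
    distances = {
        party: js_divergence(word_distribution(text, vocabulary), contract_dist)
        for party, text in manifesto_texts.items()
    }
    return min(distances, key=distances.get)


# Hypothetical topic-specific passages (toy data, not taken from real manifestos).
contract = "lower taxes reduce debt balanced budget"
manifestos = {
    "Party A": "lower taxes balanced budget reduce spending",
    "Party B": "higher wages social security pension increase",
}
winner = closest_party(contract, manifestos)
print(winner)
```

In this toy setting the guess is the party whose wording on the topic overlaps most with the contract. A realistic pipeline would work with topic-word distributions estimated from full documents instead of raw word counts on short passages.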


Keywords: Topic Models, Political Science





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Heiner Stuckenschmidt (1)
  • Cäcilia Zirn (1)

  1. KR & KM Research Group, University of Mannheim, Germany
