Decision Tree-Based Evaluation of Genitive Classification – An Empirical Study on CMC and Text Corpora

  • Sandra Hansen
  • Roman Schneider
Conference paper

DOI: 10.1007/978-3-642-40722-2_8

Volume 8105 of the book series Lecture Notes in Computer Science (LNCS)
Cite this paper as:
Hansen S., Schneider R. (2013) Decision Tree-Based Evaluation of Genitive Classification – An Empirical Study on CMC and Text Corpora. In: Gurevych I., Biemann C., Zesch T. (eds) Language Processing and Knowledge in the Web. Lecture Notes in Computer Science, vol 8105. Springer, Berlin, Heidelberg

Abstract

Contemporary studies on the characteristics of natural language benefit enormously from the growing number of linguistic corpora. Alongside text and speech corpora, corpora of computer-mediated communication (CMC) position themselves between orality and literacy, and also provide insight into the impact of “new”, mainly internet-based media on language behaviour. In this paper, we present an empirical attempt to use annotated CMC corpora to explain linguistic phenomena. Specifically, we apply machine learning algorithms to produce decision trees that reveal rules and tendencies in the use of genitive markers in German.
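
The abstract does not say which tree-induction algorithm or which features were used, so the following is only a minimal sketch of the general idea behind decision tree learning: pick the feature whose split yields the highest information gain. The toy rows, the feature names (corpus, sibilant_ending) and the marker labels are invented for illustration and are not data or results from the paper.

```python
from collections import Counter
from math import log2

# Hypothetical toy observations, invented for illustration only (NOT from the paper).
# Each row: (corpus type, noun ends in a sibilant?, observed genitive marker)
ROWS = [
    ("cmc", True, "es"),
    ("cmc", False, "s"),
    ("cmc", False, "s"),
    ("text", True, "es"),
    ("text", False, "es"),
    ("text", False, "s"),
]
FEATURES = ["corpus", "sibilant_ending"]  # assumed feature names


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, feature_index):
    """Entropy reduction obtained by splitting the rows on one feature."""
    labels = [row[-1] for row in rows]
    groups = {}
    for row in rows:
        groups.setdefault(row[feature_index], []).append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_feature(rows, feature_names):
    """Return the feature with the highest information gain, plus all gains."""
    gains = {name: information_gain(rows, i) for i, name in enumerate(feature_names)}
    return max(gains, key=gains.get), gains


best, gains = best_feature(ROWS, FEATURES)
print(best, gains)
```

A full tree learner would apply this selection step recursively to each resulting subset. The root-to-leaf paths of the finished tree can then be read as the kind of rules and tendencies the abstract refers to.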

Keywords

Corpus Linguistics; Computer-Mediated Communication; Machine Learning; Decision Trees; Grammar; Genitive Classification

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Sandra Hansen (1)
  • Roman Schneider (1)
  1. Institute for German Language (IDS), Mannheim, Germany