A Case Study in Tagging Case in German: An Assessment of Statistical Approaches

  • Simon Clematide
Conference paper

DOI: 10.1007/978-3-642-40486-3_2

Part of the Communications in Computer and Information Science book series (CCIS, volume 380)
Cite this paper as:
Clematide S. (2013) A Case Study in Tagging Case in German: An Assessment of Statistical Approaches. In: Mahlow C., Piotrowski M. (eds) Systems and Frameworks for Computational Morphology. SFCM 2013. Communications in Computer and Information Science, vol 380. Springer, Berlin, Heidelberg

Abstract

In this study, we assess the performance of purely statistical approaches using supervised machine learning for predicting case in German (nominative, accusative, dative, genitive, n/a). We experiment with two different treebanks containing morphological annotations: TIGER and TUEBA. An evaluation with 10-fold cross-validation serves as the basis for systematic comparisons of the optimal parametrizations of different approaches. We test taggers based on Hidden Markov Models (HMM), Decision Trees, and Conditional Random Fields (CRF). The CRF approach based on our hand-crafted feature model achieves an accuracy of about 94%. This outperforms all other approaches and results in an improvement of 11% compared to a baseline HMM trigram tagger and an improvement of 2% compared to a state-of-the-art tagger for rich morphological tagsets. Moreover, we investigate the effect of additional (morphological) categories (gender, number, person, part of speech) in the internal tagset used for the training. Rich internal tagsets improve results for all tested approaches.
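The evaluation protocol named in the abstract, 10-fold cross-validation with tagging accuracy as the measure, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the fold assignment by sentence index, the function names `k_folds`, `train_baseline` and `cross_validate`, the short case labels, and above all the tagger, which is a simple most-frequent-case lookup standing in for the HMM, decision tree, and CRF taggers that the study actually compares.

```python
from collections import Counter, defaultdict


def k_folds(sentences, k=10):
    """Yield (train, test) splits; sentence i is held out in fold i % k."""
    for fold in range(k):
        train = [s for i, s in enumerate(sentences) if i % k != fold]
        test = [s for i, s in enumerate(sentences) if i % k == fold]
        yield train, test


def train_baseline(train):
    """Hypothetical stand-in tagger: most frequent case per word form."""
    counts = defaultdict(Counter)
    overall = Counter()
    for sentence in train:
        for word, case in sentence:
            counts[word][case] += 1
            overall[case] += 1
    # Unknown words fall back to the most frequent case in the training data.
    default = overall.most_common(1)[0][0] if overall else "n/a"
    lexicon = {w: c.most_common(1)[0][0] for w, c in counts.items()}
    return lambda word: lexicon.get(word, default)


def cross_validate(sentences, k=10):
    """Return the mean token-level accuracy over all non-empty test folds."""
    accuracies = []
    for train, test in k_folds(sentences, k):
        tag = train_baseline(train)
        tokens = [(w, c) for s in test for w, c in s]
        if not tokens:
            continue  # skip folds that received no sentences
        correct = sum(tag(w) == c for w, c in tokens)
        accuracies.append(correct / len(tokens))
    return sum(accuracies) / len(accuracies)


# Toy data (invented): each sentence is a list of (word, case) pairs.
toy = [
    [("der", "nom"), ("Hund", "nom"), ("schläft", "n/a")],
    [("den", "acc"), ("Hund", "acc"), ("sehe", "n/a"), ("ich", "nom")],
] * 10

print(cross_validate(toy, k=10))
```

The toy data shows why a context-free lookup is a weak baseline for case: the form "Hund" is nominative in one sentence and accusative in the other, so the lookup must get one of them wrong, whereas sequence models can use neighbouring tokens such as the determiner.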

Keywords

German Case Tagging, Supervised Learning, Decision Trees, Conditional Random Fields, Hidden Markov Models, Morphologically annotated treebanks, Evaluation

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Simon Clematide
    • 1
  1. Institute of Computational Linguistics, University of Zurich, Zürich, Switzerland
