Corpus-Based Lexeme Ranking for Morphological Guessers

  • Krister Lindén
  • Jussi Tuovila
Conference paper

DOI: 10.1007/978-3-642-04131-0_9

Part of the Communications in Computer and Information Science book series (CCIS, volume 41)
Cite this paper as:
Lindén K., Tuovila J. (2009) Corpus-Based Lexeme Ranking for Morphological Guessers. In: Mahlow C., Piotrowski M. (eds) State of the Art in Computational Morphology. SFCM 2009. Communications in Computer and Information Science, vol 41. Springer, Berlin, Heidelberg

Abstract

Language software applications encounter new words, e.g., acronyms, technical terminology, loan words, names, or compounds of such words. To add new words to a morphological lexicon, we need to determine their base form and indicate their inflectional paradigm. A base form and a paradigm together define a lexeme. In this article, we evaluate a lexicon-based method, augmented with data from a corpus or the internet, for generating and ranking lexeme suggestions for new words. As an entry generator often produces numerous suggestions, it is important that the best suggestions be among the first few; otherwise it may become more efficient to create the entries by hand. By generating lexeme suggestions with an entry generator and then generating some key word forms for each suggested lexeme, we can look for support for the lexemes in a corpus. Our ranking methods have 56–79% average precision and 78–89% recall among the top 6 candidates, i.e., an F-score of 65–84%, indicating that the first correct entry suggestion is on average found as the second or third candidate. The corpus-based ranking methods were found to be significant in practice, as they save time for the lexicographer by increasing recall by 7–8% among the top candidates.
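
The F-score range quoted in the abstract can be checked against the precision and recall figures. The sketch below assumes that the F-score is the balanced harmonic mean of precision and recall, and that the lower and upper ends of the two ranges pair up; the abstract states neither explicitly, and the function name `f_score` is illustrative only.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Assumption: the low ends (56% precision, 78% recall) belong together,
# and likewise the high ends (79% precision, 89% recall).
low = f_score(0.56, 0.78)
high = f_score(0.79, 0.89)

print(f"F-score range: {low:.1%} to {high:.1%}")
```

Under these assumptions the two values round to 65% and 84%, which agrees with the range given in the abstract.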

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Krister Lindén
    • 1
  • Jussi Tuovila
    • 1
  1. University of Helsinki, Helsinki, Finland
