Dedicated Nominal Featurization of Portuguese

  • António Branco
  • João Ricardo Silva
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3960)

Abstract

A widespread assumption about the analysis of inflection features is that this task is to be performed by a tagger with an extended tagset. This typically leads to a POS precision drop due to the data-sparseness problem. In this paper we tackle this problem by addressing inflection tagging as a dedicated task, separated from that of POS tagging. More specifically, this paper describes and evaluates a rule-based approach to the tagging of Gender, Number and Degree inflection of open nominal morphosyntactic categories. This approach achieves a better F-measure than the typical approach of inflection analysis via stochastic state-of-the-art tagging.
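
As a purely illustrative sketch of what treating inflection tagging as a dedicated, rule-based step could look like, the toy function below guesses Gender and Number from word endings alone. It is not the authors' rule set: the function name `featurize`, the tag symbols (`m`, `f`, `?`, `s`, `p`) and the three suffix rules are all assumptions made for this example, and Degree is left out entirely.

```python
def featurize(word):
    """Guess (gender, number) for a Portuguese nominal form.

    Hypothetical toy rules, for illustration only:
      - a final "s" marks plural ("p"), anything else singular ("s")
      - a stem ending in "o" is tagged masculine ("m")
      - a stem ending in "a" is tagged feminine ("f")
      - any other ending leaves gender underspecified ("?")
    """
    w = word.lower()
    number = "p" if w.endswith("s") else "s"
    # Drop the plural marker before inspecting the gender ending.
    stem = w[:-1] if number == "p" else w
    if stem.endswith("o"):
        gender = "m"
    elif stem.endswith("a"):
        gender = "f"
    else:
        gender = "?"  # left for other rules or a lexicon to resolve
    return gender, number


if __name__ == "__main__":
    for token in ["gatos", "casa", "estudante"]:
        print(token, featurize(token))
```

Rules this crude misfire on well-known exceptions (for instance, the masculine noun "problema" would be tagged feminine, and the invariant "lápis" would be tagged plural), so a realistic rule-based featurizer would need exception lists and contextual information on top of them. Because the step runs separately from POS tagging, the POS tagset stays small, which is the data-sparseness concern the abstract raises.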

Keywords

Head Noun; Common Noun; Syntactic Processing; Statistical Natural Language Processing; Predicative Complement
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • António Branco (1)
  • João Ricardo Silva (1)
  1. Department of Informatics, NLX (Natural Language Group), University of Lisbon, Portugal
