Dedicated Nominal Featurization of Portuguese
A widespread assumption about the analysis of inflection features is that the task should be performed by a tagger with an extended tagset. Because the extended tagset aggravates data sparseness, this typically lowers POS tagging precision. In this paper we tackle the problem by treating inflection tagging as a dedicated task, separate from POS tagging. More specifically, we describe and evaluate a rule-based approach to tagging the Gender, Number and Degree inflection of open nominal morphosyntactic categories. This approach achieves a better F-measure than the typical approach of inflection analysis via state-of-the-art stochastic tagging.
Keywords: Head Noun; Common Noun; Syntactic Processing; Statistical Natural Language Processing; Predicative Complement