A Machine Learning Approach to the Identification of Appositives
Appositives are structures composed of semantically related noun phrases. In Natural Language Processing, the identification of appositives contributes to the building of semantic lexicons, noun phrase coreference resolution, and information extraction from texts. In this paper, we present an appositive identifier for the Portuguese language. We describe experimental results obtained by applying two machine learning techniques: Transformation-Based Learning (TBL) and Hidden Markov Models (HMM). The results obtained with these two techniques are compared with those of a full syntactic parser, PALAVRAS. The TBL-based system outperformed the other methods. This suggests that a machine learning approach can be beneficial for appositive identification, and also that TBL performs well on this language task.
Keywords: Hidden Markov Model, Noun Phrase, Machine Learning Approach, Baseline System