Abstract
In this paper, we assess to what extent the available Portuguese treebanks and probabilistic parsers are suitable for out-of-the-box robust parsing of Portuguese. We also announce the release of the best parser to come out of this exercise, which is, to the best of our knowledge, the first widely available robust parser for Portuguese.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Silva, J., Branco, A., Castro, S., Reis, R. (2010). Out-of-the-Box Robust Parsing of Portuguese. In: Pardo, T.A.S., Branco, A., Klautau, A., Vieira, R., de Lima, V.L.S. (eds) Computational Processing of the Portuguese Language. PROPOR 2010. Lecture Notes in Computer Science, vol 6001. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12320-7_10
DOI: https://doi.org/10.1007/978-3-642-12320-7_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12319-1
Online ISBN: 978-3-642-12320-7
eBook Packages: Computer Science, Computer Science (R0)