Abstract
In this paper, we present a new language-independent method for building VerbNet-based lexical resources. As a proof of concept, we apply the method to build a VerbNet-style lexicon for Brazilian Portuguese. The resource was built semi-automatically from existing lexical resources for English and Portuguese and from knowledge extracted from corpora. Evaluated against a gold standard for Brazilian Portuguese, which is also described in this paper, it achieved an f-measure of around 60%. The proposed method also outperformed a state-of-the-art machine learning method (verb clustering) by around 20% in f-measure.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Scarton, C., Sanches Duran, M., Aluísio, S.M. (2014). Using Cross-Linguistic Knowledge to Build VerbNet-Style Lexicons: Results for a (Brazilian) Portuguese VerbNet. In: Baptista, J., Mamede, N., Candeias, S., Paraboni, I., Pardo, T.A.S., Volpe Nunes, M.d.G. (eds) Computational Processing of the Portuguese Language. PROPOR 2014. Lecture Notes in Computer Science, vol 8775. Springer, Cham. https://doi.org/10.1007/978-3-319-09761-9_15
DOI: https://doi.org/10.1007/978-3-319-09761-9_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09760-2
Online ISBN: 978-3-319-09761-9
eBook Packages: Computer Science (R0)