Using Cross-Linguistic Knowledge to Build VerbNet-Style Lexicons: Results for a (Brazilian) Portuguese VerbNet

  • Conference paper
Computational Processing of the Portuguese Language (PROPOR 2014)

Abstract

In this paper, we present a new language-independent method to build VerbNet-based lexical resources. As a proof of concept, we use this method to build a VerbNet-style lexicon for Brazilian Portuguese. The resulting resource was built semi-automatically from existing lexical resources for English and Portuguese and from knowledge extracted from corpora. It achieved an f-measure of around 60% when compared with a gold standard for Brazilian Portuguese, which is also described in this paper. The proposed method also outperformed a state-of-the-art machine learning method (verb clustering) by around 20% in f-measure.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Scarton, C., Sanches Duran, M., Aluísio, S.M. (2014). Using Cross-Linguistic Knowledge to Build VerbNet-Style Lexicons: Results for a (Brazilian) Portuguese VerbNet. In: Baptista, J., Mamede, N., Candeias, S., Paraboni, I., Pardo, T.A.S., Volpe Nunes, M.d.G. (eds) Computational Processing of the Portuguese Language. PROPOR 2014. Lecture Notes in Computer Science, vol 8775. Springer, Cham. https://doi.org/10.1007/978-3-319-09761-9_15

  • DOI: https://doi.org/10.1007/978-3-319-09761-9_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09760-2

  • Online ISBN: 978-3-319-09761-9

  • eBook Packages: Computer Science (R0)
