Language Resources and Evaluation, Volume 46, Issue 4, pp 667–699

Annotating the argument structure of deverbal nominalizations in Spanish


Abstract

In recent years, interest in the computational treatment of nominalized noun phrases has grown because of the rich semantic information they contain. These noun phrases can be understood as paraphrases of verbal constructions and, like them, can denote argument and thematic-role relations. This paper presents the methodology followed to annotate the argument structure of deverbal nominalizations in the Spanish AnCora-Es corpus. We focus on the automated annotation process, which relies mainly on the semantic information specified in a verbal lexicon and also on the syntactic and semantic information annotated in the corpus. The heuristic rules that use this information rest on linguistic assumptions, which we evaluate together with the reliability of the automated process. The automated annotation was manually checked to ensure the accuracy of the final resource. We demonstrate the feasibility of the automated annotation (77% F-measure) and show that it facilitates corpus annotation, which is always a time-consuming and costly process. The result is the enrichment of the AnCora-Es corpus with the argument structure and thematic roles of deverbal nominalizations. It is the first freely available Spanish corpus with this kind of information.

Keywords

Nominalization; Argument structure; Semantic corpus annotation; Heuristic rules

Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  1. Centre de Llenguatge i Computació, University of Barcelona (CLiC-UB), Barcelona, Spain
