Event Annotation Schemes and Event Recognition in Spanish Texts

  • Dina Wonsever
  • Aiala Rosá
  • Marisa Malcuori
  • Guillermo Moncecchi
  • Alan Descoins
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7182)

Abstract

This paper presents an annotation scheme for events in Spanish texts, based on TimeML for English. The scheme is contrasted with other TimeML-based proposals for Romance languages: Italian, French and Spanish. Two Spanish corpora, manually annotated under the proposed scheme, are now available. Although manual annotation is far from trivial, agreement on event identification was very good: 93% of events were identified identically by both annotators. Part of the annotated text was used as a training corpus for the automatic recognition of events. In the experiments conducted so far, with SVM and CRF, our best result (an F-measure of 80.3%) is at the state-of-the-art level for this task.
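As a reading aid for the F-measure figure quoted above, here is a minimal sketch of how precision, recall and F-measure can be computed for event recognition. It assumes events are compared as exact token positions and that the balanced F1 is meant; the abstract does not state the exact matching criterion, and the function name `f_measure` and the toy data are hypothetical illustrations, not the authors' evaluation code.

```python
def f_measure(gold, predicted):
    """Return (precision, recall, F1) for two collections of event positions."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)  # events found in both sets
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical toy data: (sentence index, token index) pairs marked as events.
gold_events = {(0, 2), (0, 5), (1, 1), (1, 4), (2, 3)}
system_events = {(0, 2), (0, 5), (1, 1), (2, 0)}

precision, recall, f1 = f_measure(gold_events, system_events)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In this toy case the system finds three of the five gold events and proposes one spurious event, so precision is 3/4 and recall is 3/5. The same function could compare two annotators by treating one annotation as gold, although that is only one of several possible agreement measures.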

Keywords

Support Vector Machine, Natural Language Processing, Conditional Random Field, Factivity Attribute, Training Corpus
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Dina Wonsever (1)
  • Aiala Rosá (1)
  • Marisa Malcuori (2)
  • Guillermo Moncecchi (1)
  • Alan Descoins (1)
  1. Instituto de Computación, Facultad de Ingeniería, Universidad de la República, Uruguay
  2. Instituto de Lingüística, Facultad de Humanidades y Ciencias de la Educación, Universidad de la República, Uruguay