Language Resources and Evaluation, Volume 43, Issue 2, pp 161–179

The TempEval challenge: identifying temporal relations in text

  • Marc Verhagen
  • Robert Gaizauskas
  • Frank Schilder
  • Mark Hepple
  • Jessica Moszkowicz
  • James Pustejovsky

Abstract

TempEval is a framework for evaluating systems that automatically annotate texts with temporal relations. It was created in the context of the SemEval 2007 workshop and uses the TimeML annotation language. The evaluation consists of three subtasks of temporal annotation: anchoring an event to a time expression in the same sentence, anchoring an event to the document creation time, and ordering main events in consecutive sentences. In this paper we describe the TempEval task and the systems that participated in the evaluation. In addition, we describe how further task decomposition can bring even more structure to the evaluation of temporal relations.

Keywords

TimeML; Temporal annotation; Temporal relations; Information extraction; Evaluation; Corpus creation

Notes

Acknowledgements

We would like to thank the organizers of SemEval 2007: Eneko Agirre, Lluís Màrquez and Richard Wicentowski. TempEval may not have happened without SemEval as a home. Thanks also to the members of the six teams that participated in the TempEval task: Steven Bethard, James Martin, Congmin Min, Munirathnam Srikanth, Abraham Fowler, Yuchang Cheng, Masayuki Asahara, Yuji Matsumoto, Andrea Setzer, Caroline Hagège, Xavier Tannier and Georgiana Puşcaşu. Additional help to prepare the data for the TempEval task came from Emma Barker, Yonit Boussany, Catherine Havasi, Emin Mimaroglu, Hongyuan Qiu, Anna Rumshisky, Roser Saurí and Amber Stubbs. Part of the work in this paper was carried out in the context of the DTO/AQUAINT program and funded under grant number N61339-06-C-0140, and part was performed under the UK MRC-funded CLEF-Services grant ref: GO300607.

Copyright information

© Springer Science+Business Media B.V. 2009

Authors and Affiliations

  • Marc Verhagen (1)
  • Robert Gaizauskas (2)
  • Frank Schilder (3)
  • Mark Hepple (2)
  • Jessica Moszkowicz (1)
  • James Pustejovsky (1)

  1. Brandeis University, Waltham, USA
  2. University of Sheffield, Sheffield, England, UK
  3. Thomson Reuters Corporation, New York, USA
