Language Resources and Evaluation, Volume 47, Issue 3, pp. 639–659

Parser evaluation using textual entailments

  • Deniz Yuret
  • Laura Rimell
  • Aydın Han

Original Paper


Parser Evaluation using Textual Entailments (PETE) is a shared task in the SemEval-2010 Evaluation Exercises on Semantic Evaluation. The task involves recognizing textual entailments based on syntactic information alone. PETE introduces a new parser evaluation scheme that is formalism independent, less prone to annotation error, and focused on semantically relevant distinctions. This paper describes the PETE task, gives an error analysis of the top-performing Cambridge system, and introduces a standard entailment module that can be used with any parser that outputs Stanford typed dependencies.
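The abstract's core idea is that a hypothesis sentence is entailed by the text when the syntactic relations asserted by the hypothesis are present in the parse of the text. As a minimal illustration of this kind of entailment module, the sketch below checks whether the hypothesis's Stanford typed dependency triples form a subset of the text's; the dependency triples are hand-written stand-ins for real parser output, and the function name and normalization choices are illustrative assumptions, not the paper's actual implementation.

```python
# Hedged sketch of a PETE-style entailment decision over Stanford typed
# dependencies. No parser is invoked; the triples below are hand-written
# stand-ins for parser output.

def entails(text_deps, hyp_deps):
    """Return True if every hypothesis dependency appears in the text.

    Each dependency is a (relation, head, dependent) triple. A real
    entailment module would also normalize lemmas and handle passives,
    relative clauses, control, etc.; the subset check is only the core idea.
    """
    return set(hyp_deps) <= set(text_deps)

# Text: "The man who was sleeping snored."
text_deps = {
    ("nsubj", "snored", "man"),
    ("nsubj", "sleeping", "man"),  # subject recovered through the relative clause
    ("det", "man", "the"),
}

# Hypothesis: "The man was sleeping."
hyp_deps = {("nsubj", "sleeping", "man")}

print(entails(text_deps, hyp_deps))  # True
```

A parser that fails to attach "man" as the subject of "sleeping" inside the relative clause would miss this entailment, which is exactly the kind of semantically relevant distinction PETE is designed to probe.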


Keywords: Parsing, Textual entailments



We would like to thank Stephan Oepen and Anna Mac for their careful analysis and valuable suggestions. Önder Eker and Zehra Turgut contributed to the development of the PETE task. Stephen Clark collaborated on the development of the Cambridge system. We would also like to thank Matthew Honnibal for discussion of the SCHWA system and contribution to the entailment system analysis.


Copyright information

© Springer Science+Business Media Dordrecht 2012

Authors and Affiliations

  1. Koç University, Istanbul, Turkey
  2. Computer Laboratory, Cambridge, UK
