Abstract
Parser Evaluation using Textual Entailments (PETE) is a shared task in the SemEval-2010 Evaluation Exercises on Semantic Evaluation. The task involves recognizing textual entailments based on syntactic information alone. PETE introduces a new parser evaluation scheme that is formalism-independent, less prone to annotation error, and focused on semantically relevant distinctions. This paper describes the PETE task, gives an error analysis of the top-performing Cambridge system, and introduces a standard entailment module that can be used with any parser that outputs Stanford typed dependencies.
Notes
The collapsed and propagated version of Stanford dependencies somewhat mitigates this problem, and it is the parser output representation we chose to use as input to the example entailment module of Sect. 7.
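An entailment module over Stanford typed dependencies can be sketched as a containment check: the hypothesis H is judged entailed by the text T when the dependencies of H's parse also appear in T's parse. The sketch below is illustrative only, not the authors' exact implementation; the dependency triples, the simple lowercasing normalization, and the example sentences are assumptions for the sake of the example. Collapsed dependencies fold prepositions into relation names (e.g. `prep_of`), and propagation adds dependencies such as the object relation inside a relative clause, which is what makes a hypothesis like "The dog barked." matchable against a relative-clause text.

```python
# Illustrative sketch (not the authors' exact implementation): a minimal
# entailment decision over Stanford typed dependencies, where H is entailed
# by T if every dependency triple of H's parse occurs in T's parse.

def normalize(dep):
    """Reduce a (relation, head, dependent) triple to a comparable form.
    Words and relation names are lowercased; collapsed relations such as
    prep_of are kept as-is."""
    rel, head, dependent = dep
    return (rel.lower(), head.lower(), dependent.lower())

def entails(text_deps, hyp_deps):
    """Return True if every dependency of the hypothesis parse is found
    among the dependencies of the text parse."""
    text_set = {normalize(d) for d in text_deps}
    return all(normalize(d) in text_set for d in hyp_deps)

# Hypothetical example: T = "The dog that I saw barked.", H = "The dog barked."
text_deps = [
    ("det", "dog", "The"),
    ("nsubj", "barked", "dog"),
    ("nsubj", "saw", "I"),
    ("dobj", "saw", "dog"),   # propagated relative-clause dependency
]
hyp_deps = [
    ("det", "dog", "The"),
    ("nsubj", "barked", "dog"),
]
print(entails(text_deps, hyp_deps))  # True
```

In practice such a module would restrict the check to the dependencies relevant to the entailment (e.g. those involving the content words that differ between T and H) rather than requiring all of H's dependencies to match.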
Note that some of the difficult constructions, together with noise in the laypeople's responses, meant that a large percentage of potential entailments did not pass the filter. Nevertheless, at nominal cost we were able to create a dataset in which every entailment was unanimously agreed on by three people, which is not the case for most other commonly used treebanks.
There were eight POS changes in the development set, most of which did not result in errors in evaluation. Note also that this particular H is ungrammatical English. Recall that the negative H sentences were derived from genuine parser errors; it was not always possible to construct grammatical sentences corresponding to such errors, though we will consider constraining all H sentences to be grammatical in future work.
Acknowledgments
We would like to thank Stephan Oepen and Anna Mac for their careful analysis and valuable suggestions. Önder Eker and Zehra Turgut contributed to the development of the PETE task. Stephen Clark collaborated on the development of the Cambridge system. We would also like to thank Matthew Honnibal for discussion of the SCHWA system and contribution to the entailment system analysis.
Yuret, D., Rimell, L. & Han, A. Parser evaluation using textual entailments. Lang Resources & Evaluation 47, 639–659 (2013). https://doi.org/10.1007/s10579-012-9200-5