Identification of Sentence-to-Sentence Relations Using a Textual Entailer

Abstract

We show in this article how an approach developed for recognizing textual entailment can be extended to identify paraphrase and elaboration relations. Entailment is a unidirectional relation between two sentences in which one sentence logically entails the other. It appears to be closely related to two other sentence-to-sentence relations, elaboration and paraphrase, and we discuss this relationship to justify the newly derived approaches theoretically. The proposed approaches combine lexical and syntactic information with shallow negation handling. They offer significantly better results than several baselines and, when compared to other paraphrase and elaboration approaches, produce similar or better results. We report results on several data sets: the Microsoft Research Paraphrase corpus, a benchmark for evaluating approaches to paraphrase identification, and a data set collected from high-school students’ interactions with the intelligent tutoring system iSTART, which includes both paraphrase and elaboration utterances.
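As a rough illustration of the idea in the abstract, the sketch below scores entailment in one direction using word overlap only, applies a very shallow negation check, and then combines the two directions into a symmetric paraphrase score. It is not the authors' method: the function names, the negation cue list, the parity rule, and the averaging step are all assumptions made for this example, and the syntactic component mentioned in the abstract is omitted.

```python
import re

# Assumed, deliberately tiny list of negation cues (hypothetical).
NEGATION_CUES = {"not", "no", "never", "without"}


def tokens(sentence):
    """Lowercase alphabetic word tokens of a sentence."""
    return re.findall(r"[a-z]+", sentence.lower())


def entail_score(text, hypothesis):
    """Unidirectional score: share of hypothesis word types found in text.

    Assumption: if the two sentences differ in the parity of their
    negation cues, the score is forced to 0.0 (shallow negation handling).
    """
    text_tokens, hyp_tokens = tokens(text), tokens(hypothesis)
    hyp_types = set(hyp_tokens)
    if not hyp_types:
        return 0.0
    neg_text = sum(tok in NEGATION_CUES for tok in text_tokens)
    neg_hyp = sum(tok in NEGATION_CUES for tok in hyp_tokens)
    if neg_text % 2 != neg_hyp % 2:
        return 0.0
    return len(hyp_types & set(text_tokens)) / len(hyp_types)


def paraphrase_score(a, b):
    """Symmetric score: mean of the two directional entailment scores.

    Assumption: paraphrase is approximated as entailment in both directions.
    """
    return (entail_score(a, b) + entail_score(b, a)) / 2.0


longer = "The cat sat on the mat in the kitchen"
shorter = "The cat sat on the mat"
print(entail_score(longer, shorter))   # longer sentence covers the shorter one
print(entail_score(shorter, longer))   # reverse direction is only partly covered
print(paraphrase_score(longer, shorter))
```

The two directional scores differ for the example pair, which reflects the unidirectional nature of entailment, while the paraphrase score is the same regardless of argument order.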

Author information

Corresponding author

Correspondence to Vasile Rus.

About this article

Cite this article

Rus, V., McCarthy, P.M., Graesser, A.C. et al. Identification of Sentence-to-Sentence Relations Using a Textual Entailer. Res on Lang and Comput 7, 209–229 (2009). https://doi.org/10.1007/s11168-009-9065-y

  • DOI: https://doi.org/10.1007/s11168-009-9065-y

Keywords

  • Entailment
  • Paraphrasing
  • Dependencies
  • Intelligent tutoring systems