
A WordNet-based semantic approach to textual entailment and cross-lingual textual entailment

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

In this paper we explain how to build a recognizing textual entailment (RTE) system that uses only semantic similarity measures based on WordNet. We show how widely used WordNet-based semantic measures can be generalized into sentence-level semantic metrics for use in both monolingual and cross-lingual textual entailment. We experiment with a wide variety of RTE datasets and evaluate the contribution of an algorithm that expands the monolingual RTE corpus. The results obtained with this method show statistically significant differences when predicting RTE test sets. We provide an efficiency analysis of these metrics and draw some conclusions about their practical utility in recognizing textual entailment. We also analyze the cross-lingual textual entailment task, create a bilingual English–Spanish corpus, and propose a procedure for creating a cross-lingual textual entailment corpus for any pair of languages. Finally, we show that the proposed method suffices to build an RTE system with average scores in both monolingual and cross-lingual textual entailment, using semantic information from WordNet as its only source of lexical-semantic knowledge.
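As a rough illustration of what lifting a word-level measure to the sentence level can look like, the sketch below averages, for each hypothesis word, its best similarity against any word of the text. This is only one possible generalization and is not claimed to be the formulation used in the article. The score table, the function names, and the 0.75 threshold are hypothetical stand-ins; a real system would plug in an actual WordNet-based measure.

```python
from typing import Callable, Dict, FrozenSet, List

# Hypothetical word-pair scores standing in for a WordNet-based measure.
TOY_SCORES: Dict[FrozenSet[str], float] = {
    frozenset({"car", "automobile"}): 1.0,
    frozenset({"buy", "purchase"}): 0.9,
    frozenset({"man", "person"}): 0.8,
}


def toy_word_similarity(a: str, b: str) -> float:
    """Return a similarity in [0, 1]; identical words score 1.0."""
    if a == b:
        return 1.0
    return TOY_SCORES.get(frozenset({a, b}), 0.0)


def sentence_similarity(
    text: List[str],
    hypothesis: List[str],
    word_sim: Callable[[str, str], float] = toy_word_similarity,
) -> float:
    """Average, over hypothesis words, of the best match found in the text."""
    if not text or not hypothesis:
        return 0.0
    best = [max(word_sim(h, t) for t in text) for h in hypothesis]
    return sum(best) / len(best)


def predict_entailment(
    text: List[str], hypothesis: List[str], threshold: float = 0.75
) -> bool:
    """Assumed decision rule: entailment holds if the score reaches the threshold."""
    return sentence_similarity(text, hypothesis) >= threshold


if __name__ == "__main__":
    t = ["man", "purchase", "automobile"]
    h = ["person", "buy", "car"]
    print(sentence_similarity(t, h), predict_entailment(t, h))
```

The score is deliberately asymmetric (hypothesis words are matched against the text, not the reverse), which mirrors the directional nature of entailment. A learned classifier over several such scores would be a natural alternative to the fixed threshold.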

Notes

  1. http://www.nist.gov/tac/2008/rte/

  2. http://www.investigacion.frc.utn.edu.ar/mslabs/~jcastillo/Sagan-test-suite/

  3. http://www.cogs.susx.ac.uk/users/drh21/

  4. http://marimba.d.umn.edu/cgi-bin/similarity/similarity.cgi

  5. Intel Core 2 Duo 2.00 GHz, 4 GB RAM.

  6. http://www.clef-campaign.org/

  7. http://trec.nist.gov/

  8. http://research.nii.ac.jp/ntcir/index-en.html

  9. http://www.investigacion.frc.utn.edu.ar/mslabs/~jcastillo/Sagan-test-suite/

Author information

Corresponding author

Correspondence to Julio Javier Castillo.

About this article

Cite this article

Castillo, J.J. A WordNet-based semantic approach to textual entailment and cross-lingual textual entailment. Int. J. Mach. Learn. & Cyber. 2, 177–189 (2011). https://doi.org/10.1007/s13042-011-0026-z

  • DOI: https://doi.org/10.1007/s13042-011-0026-z
