A WordNet-based semantic approach to textual entailment and cross-lingual textual entailment

  • Julio Javier Castillo
Original Article

Abstract

In this paper we explain how to build a recognizing textual entailment (RTE) system that uses only semantic similarity measures based on WordNet. We show how widely used WordNet-based semantic measures can be generalized into sentence-level semantic metrics for use in both monolingual and cross-lingual textual entailment. We experiment with a wide variety of RTE datasets and evaluate the contribution of an algorithm that expands the monolingual RTE corpus. This method yields statistically significant differences when predicting RTE test sets. We provide an efficiency analysis of these metrics and draw some conclusions about their practical utility in recognizing textual entailment. We also analyze the cross-lingual textual entailment task, create a bilingual English–Spanish corpus, and propose a procedure for creating a cross-lingual textual entailment corpus for any pair of languages. Finally, we show that the proposed method suffices to build an RTE system with average scores in both monolingual and cross-lingual textual entailment, using semantic information from WordNet as its only source of lexical-semantic knowledge.
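
To make the idea of lifting a word-level measure to a sentence-level metric concrete, the following is a minimal sketch and not the procedure used in the paper. It assumes a tiny hand-made hypernym taxonomy (`PARENT`) in place of WordNet, a simple path-based word similarity, and a best-match average as the sentence-level aggregation. All names (`PARENT`, `ancestors`, `word_similarity`, `sentence_similarity`) are hypothetical and chosen for illustration only.

```python
# Toy hypernym taxonomy (child -> parent); a hypothetical stand-in for WordNet.
PARENT = {
    "dog": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "animal": "entity",
    "car": "vehicle",
    "vehicle": "entity",
}


def ancestors(word):
    """Return the chain [word, parent, grandparent, ...] up to the root."""
    chain = [word]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def word_similarity(w1, w2):
    """Path-based similarity: 1 / (1 + number of edges between the words)."""
    if w1 == w2:
        return 1.0
    a1, a2 = ancestors(w1), ancestors(w2)
    common = [node for node in a1 if node in a2]
    if not common:
        # Words outside the toy taxonomy share no ancestor.
        return 0.0
    lcs = common[0]  # lowest common subsumer
    distance = a1.index(lcs) + a2.index(lcs)
    return 1.0 / (1.0 + distance)


def sentence_similarity(text, hypothesis):
    """Average, over hypothesis words, of the best-matching text word.

    The score is directional (hypothesis against text), which is an
    assumption made here because entailment is itself directional.
    """
    t_words = text.lower().split()
    h_words = hypothesis.lower().split()
    if not t_words or not h_words:
        return 0.0
    best = [max(word_similarity(h, t) for t in t_words) for h in h_words]
    return sum(best) / len(best)


if __name__ == "__main__":
    print(sentence_similarity("the dog sleeps", "the canine sleeps"))
```

In a setup like this, one such score per word-level measure could serve as a feature for a machine learning classifier that decides whether the text entails the hypothesis. Any real system would replace the toy taxonomy with WordNet lookups.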

Keywords

Recognizing textual entailment; Cross-lingual textual entailment; WordNet; Expand corpus; Semantic measures; Machine learning

Copyright information

© Springer-Verlag 2011

Authors and Affiliations

  1. National University of Cordoba-FaMAF, Cordoba, Argentina
  2. National Technological University-FRC, Cordoba, Argentina
