Using WordNet Relations and Semantic Classes in Information Retrieval Tasks

  • Javi Fernández
  • Rubén Izquierdo
  • José M. Gómez
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6241)

Abstract

In this paper we explore the use of semantic classes in an existing information retrieval system in order to improve its results. We use two different ontologies of semantic classes (WordNet Domains and Basic Level Concepts) to re-rank the retrieved documents and obtain better recall and precision. We also implement a new method for weighting the expanded terms that takes into account the weights of the original query terms and their WordNet relations to the new terms, an approach that has been shown to improve the results. The evaluation of these approaches was carried out in the CLEF Robust-WSD Task, obtaining an improvement of 1.8% in GMAP for the semantic classes approach and 10% in MAP for the WordNet term weighting approach.
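
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the weighting formula or the re-ranking rule, so the relation factors, the `boost` parameter, and the function names `weight_expanded_terms` and `rerank_by_class` are all assumptions chosen for illustration. The first function gives each expanded term a weight derived from the original query term it came from and the WordNet relation linking them; the second raises the score of retrieved documents that share semantic classes with the query.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical damping factors per WordNet relation. The values are
# illustrative only and are not taken from the paper.
RELATION_FACTOR: Dict[str, float] = {
    "synonym": 0.8,
    "hypernym": 0.5,
    "hyponym": 0.4,
}


def weight_expanded_terms(
    query_weights: Dict[str, float],
    expansions: List[Tuple[str, str, str]],
) -> Dict[str, float]:
    """Return weights for the original terms plus the expanded terms.

    `expansions` holds (original_term, relation, new_term) triples. Each new
    term inherits the weight of its source term scaled by the relation
    factor. If a term is reached in several ways, the largest weight wins.
    """
    weights = dict(query_weights)
    for original, relation, new_term in expansions:
        if new_term in query_weights:
            continue  # never override the weight of an original query term
        candidate = query_weights[original] * RELATION_FACTOR[relation]
        weights[new_term] = max(weights.get(new_term, 0.0), candidate)
    return weights


def rerank_by_class(
    ranked_docs: List[Tuple[str, float]],
    doc_classes: Dict[str, Set[str]],
    query_classes: Set[str],
    boost: float = 0.1,
) -> List[Tuple[str, float]]:
    """Re-rank documents by boosting those that share classes with the query.

    The new score is score * (1 + boost * overlap), where overlap is the
    fraction of query classes that also appear in the document.
    """
    rescored = []
    for doc_id, score in ranked_docs:
        shared = doc_classes.get(doc_id, set()) & query_classes
        overlap = len(shared) / len(query_classes) if query_classes else 0.0
        rescored.append((doc_id, score * (1.0 + boost * overlap)))
    # Highest score first; sorted() is stable, so ties keep the original order.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    weights = weight_expanded_terms(
        {"car": 1.0, "price": 0.5},
        [("car", "synonym", "automobile"), ("car", "hypernym", "vehicle")],
    )
    print(weights)
    print(
        rerank_by_class(
            [("d1", 2.0), ("d2", 1.9)],
            {"d1": {"sport"}, "d2": {"transport", "economy"}},
            {"transport", "economy"},
        )
    )
```

Keeping the expanded weights below those of the original terms is one simple way to limit query drift; a real system would tune the factors on held-out topics rather than fix them by hand.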

Keywords

Query Term, Word Sense, Information Retrieval System, Word Sense Disambiguation, Semantic Class
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Javi Fernández (1)
  • Rubén Izquierdo (1)
  • José M. Gómez (1)
  1. Department of Software and Computing Systems, University of Alicante, Alicante, Spain
