Term Proximity and Data Mining Techniques for Information Retrieval Systems

Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 206)


Clustering terms with a proximity measure is a strategy for estimating document relevance efficiently. Unlike recent studies, which used term proximity only to improve the matching function between the document and the query, this work revises the whole information retrieval process, at both the indexing and the interrogation steps. An Extended Inverted file is built by exploiting the term proximity concept and applying data mining techniques. Three interrogation approaches are then proposed: the first uses query expansion, the second is based on the Extended Inverted file, and the third hybridizes the retrieval methods. Experiments carried out on the OHSUMED collection demonstrate the effectiveness and efficiency of our approaches compared with the traditional one.
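The abstract does not specify how the Extended Inverted file is laid out, so the following is only a minimal sketch of the general idea, under stated assumptions: a plain inverted index is paired with a table of terms that occur within a fixed window of each other, and that table drives a simple query expansion. The names `build_indexes`, `expand_query`, `search`, and the parameters `window` and `per_term` are hypothetical, and plain co-occurrence counting stands in for the data mining and clustering techniques used in the paper.

```python
from collections import Counter, defaultdict


def build_indexes(docs, window=3):
    """Build an inverted index and a term-proximity table.

    docs maps a document id to its text. Both the whitespace tokenizer
    and the fixed window size are assumptions made for illustration.
    """
    inverted = defaultdict(set)       # term -> ids of documents containing it
    proximity = defaultdict(Counter)  # term -> counts of terms seen nearby
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        for i, term in enumerate(tokens):
            inverted[term].add(doc_id)
            # Pair the term with each token up to `window` positions to its right.
            for other in tokens[i + 1 : i + 1 + window]:
                if other != term:
                    proximity[term][other] += 1
                    proximity[other][term] += 1
    return inverted, proximity


def expand_query(query, proximity, per_term=1):
    """Append, for each query term, its most frequent nearby terms."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        neighbours = proximity.get(term, Counter())
        for neighbour, _count in neighbours.most_common(per_term):
            if neighbour not in expanded:
                expanded.append(neighbour)
    return expanded


def search(query, inverted, proximity, per_term=1):
    """Rank documents by how many expanded query terms they contain."""
    scores = Counter()
    for term in expand_query(query, proximity, per_term):
        for doc_id in inverted.get(term, ()):
            scores[doc_id] += 1
    # Highest score first; ties are broken by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Keeping the proximity table separate from the postings lets the same structure serve both a plain lookup and an expanded one, which loosely mirrors the distinction the abstract draws between its interrogation approaches.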


Keywords: information retrieval, term proximity, word association, fuzzy clustering





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Computer Science Department, Laboratory for Research in Artificial Intelligence (LRIA), USTHB, Algiers, Algeria
