Term Proximity and Data Mining Techniques for Information Retrieval Systems

  • Conference paper
Advances in Information Systems and Technologies

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 206)

Abstract

Clustering terms by a proximity measure is a strategy for estimating document relevance efficiently. Unlike recent studies that investigated term proximity only to improve the matching function between the document and the query, this work revises the whole information retrieval process, at both the indexing and the interrogation (querying) steps. An Extended Inverted file is built by exploiting the term proximity concept and using data mining techniques. Three interrogation approaches are then proposed: the first uses query expansion, the second is based on the Extended Inverted file, and the third hybridizes retrieval methods. Experiments carried out on OHSUMED demonstrate the effectiveness and efficiency of our approaches compared with the traditional one.
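
The abstract does not specify how the Extended Inverted file is laid out or which proximity measure is used, so the following is only a generic sketch of the ideas it names, not the authors' method. It assumes whitespace tokenization, a fixed co-occurrence window, and a simple count threshold; the toy documents and the function names (`build_positional_index`, `proximate_pairs`, `expand_query`) are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Standard positional inverted file: term -> {doc_id: [positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term][doc_id].append(pos)
    return index


def proximate_pairs(docs, window=2):
    """Count how often two distinct terms occur within `window` positions.

    The window size is an assumption; the paper may use another measure.
    """
    counts = defaultdict(int)
    for text in docs.values():
        terms = text.lower().split()
        for i, left in enumerate(terms):
            for right in terms[i + 1 : i + 1 + window]:
                if left != right:
                    counts[tuple(sorted((left, right)))] += 1
    return counts


def expand_query(query_terms, pair_counts, min_count=2):
    """Append terms that are frequently close to any query term."""
    expansions = set()
    for (a, b), n in pair_counts.items():
        if n < min_count:
            continue
        if a in query_terms and b not in query_terms:
            expansions.add(b)
        elif b in query_terms and a not in query_terms:
            expansions.add(a)
    return list(query_terms) + sorted(expansions)


# Hypothetical toy collection (not taken from OHSUMED).
docs = {
    1: "heart attack risk in older patients",
    2: "risk of heart attack after surgery",
    3: "surgery outcomes in older patients",
}

index = build_positional_index(docs)
counts = proximate_pairs(docs, window=2)
expanded = expand_query(["heart"], counts, min_count=2)
```

Here the proximity counts are computed once at indexing time and reused at query time, which is the general shape of moving proximity information out of the matching function and into the index. A real system would replace the raw counts with whatever association measure the mining step produces.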

Corresponding author

Correspondence to Ilyes Khennak.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Khennak, I., Drias, H. (2013). Term Proximity and Data Mining Techniques for Information Retrieval Systems. In: Rocha, Á., Correia, A., Wilson, T., Stroetmann, K. (eds) Advances in Information Systems and Technologies. Advances in Intelligent Systems and Computing, vol 206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36981-0_44

  • DOI: https://doi.org/10.1007/978-3-642-36981-0_44

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-36980-3

  • Online ISBN: 978-3-642-36981-0

  • eBook Packages: Engineering, Engineering (R0)
