ENSM-SE at CLEF 2006: Fuzzy Proximity Method with an Adhoc Influence Function

  • Annabelle Mercier
  • Michel Beigbeder
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4730)


We experiment with a new influence function in our information retrieval method, which uses the degree of fuzzy proximity of key terms in a document to compute the relevance of the document to the query. The model is based on the idea that the closer the query terms are to each other in a document, the more relevant the document is. Our model handles Boolean queries but, unlike the traditional extensions of the basic Boolean information retrieval model, does not use an explicit proximity operator. A single parameter controls the degree of proximity required. To improve our system, we apply a stemming algorithm before indexing, use a specific influence function, and merge fuzzy proximity result lists built with influence functions of different widths. We explain how we construct the queries and report the results of our experiments in the ad-hoc monolingual French task of the CLEF 2006 evaluation campaign.
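
The idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the influence function or the combination rules, so everything specific here is an assumption made for illustration: a triangular influence function of half-width k (the single parameter), minimum for AND, maximum for OR, and a document score obtained by summing the query's fuzzy proximity over all positions. The names `term_proximity`, `query_proximity`, and `score` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple, Union

# A query is a term (str) or a tuple (operator, left, right),
# where operator is "and" or "or". This encoding is an assumption.
Query = Union[str, Tuple[str, "Query", "Query"]]


def term_proximity(positions: List[int], x: int, k: int) -> float:
    """Fuzzy proximity of position x to the nearest occurrence of a term.

    Assumes a triangular influence function: 1 at an occurrence,
    decreasing linearly to 0 at distance k.
    """
    if not positions:
        return 0.0
    return max(max(0.0, (k - abs(x - p)) / k) for p in positions)


def query_proximity(query: Query, index: Dict[str, List[int]], x: int, k: int) -> float:
    """Evaluate the Boolean query tree at position x (assumed: AND = min, OR = max)."""
    if isinstance(query, str):
        return term_proximity(index.get(query, []), x, k)
    op, left, right = query
    lhs = query_proximity(left, index, x, k)
    rhs = query_proximity(right, index, x, k)
    if op == "and":
        return min(lhs, rhs)
    if op == "or":
        return max(lhs, rhs)
    raise ValueError(f"unknown operator: {op}")


def score(query: Query, index: Dict[str, List[int]], doc_len: int, k: int) -> float:
    """Document score: sum of the query's fuzzy proximity over all positions."""
    return sum(query_proximity(query, index, x, k) for x in range(doc_len))


# Two toy documents of length 8, given as term -> positions.
near = {"a": [2], "b": [4]}   # query terms two positions apart
far = {"a": [0], "b": [7]}    # query terms seven positions apart
conjunction = ("and", "a", "b")

near_score = score(conjunction, near, doc_len=8, k=4)
far_score = score(conjunction, far, doc_len=8, k=4)
```

Under these assumptions the document whose terms are close together gets a positive score for the conjunction, while the one whose terms lie further apart than the influence width gets zero. Increasing k relaxes the proximity required without adding a proximity operator to the query.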


Information Retrieval; Boolean Model; Proximity Operator; Query Tree; Boolean Query





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Annabelle Mercier (1)
  • Michel Beigbeder (1)
  1. École Nationale Supérieure des Mines de Saint-Étienne, 158 cours Fauriel, 42023 Saint-Étienne Cedex 2, France
