Exploring the Space of IR Functions

Goswami, Parantapa; Moura, Simon; Gaussier, Eric; Amini, Massih-Reza; Maes, Francis

doi:10.1007/978-3-319-06028-6_31

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8416)

Abstract

In this paper we propose an approach for discovering IR ranking functions within a space of simple closed-form mathematical functions. In general, IR ranking models are built on two basic variables: term frequency and document frequency. We define a grammar that generates all possible functions from these two variables and a set of basic mathematical operations: addition, subtraction, multiplication, division, logarithm, exponential and square root. The large set of functions generated by this grammar is filtered by checking mathematical feasibility and satisfaction of the heuristic constraints on IR scoring functions proposed by the community. The resulting candidate functions are tested on various standard IR collections, and several simple but highly efficient scoring functions are identified. Extensive experiments on several IR collections show that these newly discovered functions outperform other state-of-the-art IR scoring models. We also compare functions that satisfy the IR constraints with those that do not, and show that the former clearly outperform the latter.
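The generate-then-filter idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract specifies neither the exact grammar nor the exact constraints. The tuple encoding of expressions, the names `generate`, `evaluate`, `score_table` and `satisfies_constraints`, the sample grids, and the two monotonicity checks (score rises with term frequency, falls with document frequency) are all assumptions chosen for illustration.

```python
import math

# Terminals: the two variables named in the abstract (identifiers are illustrative).
VARIABLES = ["tf", "df"]

# Operations listed in the abstract.
UNARY = {"log": math.log, "exp": math.exp, "sqrt": math.sqrt}
BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}


def generate(depth):
    """Enumerate expression trees of at most `depth` nested operations.

    An expression is a variable name, (unary_op, expr) or (binary_op, expr, expr).
    """
    if depth == 0:
        return list(VARIABLES)
    smaller = generate(depth - 1)
    exprs = list(smaller)
    exprs += [(op, e) for op in UNARY for e in smaller]
    exprs += [(op, a, b) for op in BINARY for a in smaller for b in smaller]
    return list(dict.fromkeys(exprs))  # drop duplicates, keep order


def evaluate(expr, env):
    """Evaluate an expression tree for the variable values in `env`."""
    if isinstance(expr, str):
        return env[expr]
    if len(expr) == 2:
        return UNARY[expr[0]](evaluate(expr[1], env))
    return BINARY[expr[0]](evaluate(expr[1], env), evaluate(expr[2], env))


# Assumed sample points; a real study would use collection statistics.
TF_GRID = [1.0, 2.0, 5.0, 10.0]
DF_GRID = [1.0, 10.0, 100.0]


def score_table(expr):
    """Scores on the sample grid, or None if the expression is infeasible there."""
    rows = []
    for tf in TF_GRID:
        row = []
        for df in DF_GRID:
            try:
                value = evaluate(expr, {"tf": tf, "df": df})
            except (ValueError, ZeroDivisionError, OverflowError):
                return None  # e.g. log of a non-positive number, division by zero
            if not math.isfinite(value):
                return None
            row.append(value)
        rows.append(row)
    return rows


def satisfies_constraints(rows):
    """Simplified stand-in for heuristic IR constraints (an assumption):
    the score must strictly increase with tf and strictly decrease with df."""
    up_in_tf = all(rows[i][j] < rows[i + 1][j]
                   for i in range(len(rows) - 1) for j in range(len(DF_GRID)))
    down_in_df = all(rows[i][j] > rows[i][j + 1]
                     for i in range(len(rows)) for j in range(len(DF_GRID) - 1))
    return up_in_tf and down_in_df


candidates = generate(1)
feasible = [(e, t) for e in candidates if (t := score_table(e)) is not None]
kept = [e for e, t in feasible if satisfies_constraints(t)]
print(len(candidates), "generated;", len(kept), "kept:", kept)
```

Checking constraints numerically on a grid is only an approximation of an analytical check, and the number of expressions grows very quickly with depth, which is why a filtering step of this kind matters before any evaluation on IR collections.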

Author information

Authors and Affiliations

CNRS - LIG/AMA, Université Grenoble Alpes, Grenoble, France
Parantapa Goswami, Simon Moura, Eric Gaussier & Massih-Reza Amini
D-Labs, Paris, France
Francis Maes

Editor information

Editors and Affiliations

University of Amsterdam, Amsterdam, The Netherlands
Maarten de Rijke & Tom Kenter
Centrum Wiskunde en Informatica, Amsterdam, The Netherlands and Delft University of Technology, Delft, The Netherlands
Arjen P. de Vries
University of Illinois at Urbana-Champaign, Urbana, IL, USA
ChengXiang Zhai
University of Twente, Twente, The Netherlands and Erasmus University Rotterdam, Rotterdam, The Netherlands
Franciska de Jong
SalesPredict, Haifa, Israel
Kira Radinsky
Microsoft Research, Cambridge, UK
Katja Hofmann

Cite this paper

Goswami, P., Moura, S., Gaussier, E., Amini, M.-R., Maes, F. (2014). Exploring the Space of IR Functions. In: de Rijke, M., et al. Advances in Information Retrieval. ECIR 2014. Lecture Notes in Computer Science, vol 8416. Springer, Cham. https://doi.org/10.1007/978-3-319-06028-6_31

DOI: https://doi.org/10.1007/978-3-319-06028-6_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06027-9
Online ISBN: 978-3-319-06028-6
eBook Packages: Computer Science, Computer Science (R0)
