Variational Bayes for modeling score distributions
Abstract
Empirical modeling of the score distributions associated with retrieved documents is an essential task for many retrieval applications. In this work, we propose modeling the relevant documents’ scores by a mixture of Gaussians and the non-relevant documents’ scores by a Gamma distribution. Applying Variational Bayes, we automatically trade off goodness-of-fit against model complexity. We test our model on traditional retrieval functions and on actual search engines submitted to TREC. We demonstrate the utility of our model in inferring precision-recall curves. In all experiments, our model outperforms the dominant exponential-Gaussian model.
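As a rough illustration of how a fitted score model can yield a precision-recall curve, the sketch below evaluates recall and precision at a score threshold, given a Gaussian mixture for relevant scores and a Gamma distribution for non-relevant scores. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the function names, the parameter values, and the prior probability of relevance `prior_rel` are all hypothetical, and the parameters are taken as already fitted (the Variational Bayes fitting step is not shown).

```python
import math


def gaussian_sf(x, mean, std):
    """P(X > x) for a Gaussian, via the complementary error function."""
    return 0.5 * math.erfc((x - mean) / (std * math.sqrt(2.0)))


def gamma_cdf(x, shape, scale):
    """P(X <= x) for a Gamma(shape, scale) variable, computed with the
    power series of the regularized lower incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    t = x / scale
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-16 * total and n < 10_000:
        term *= t / (shape + n)
        total += term
        n += 1
    value = total * math.exp(-t + shape * math.log(t) - math.lgamma(shape))
    return min(1.0, value)


def precision_recall(threshold, weights, means, stds, shape, scale, prior_rel):
    """Precision and recall when every document scoring above `threshold`
    is retrieved. `prior_rel` is the assumed fraction of relevant documents."""
    # Recall: probability that a relevant document scores above the threshold.
    recall = sum(
        w * gaussian_sf(threshold, m, s)
        for w, m, s in zip(weights, means, stds)
    )
    # Fallout: probability that a non-relevant document scores above it.
    fallout = 1.0 - gamma_cdf(threshold, shape, scale)
    retrieved = prior_rel * recall + (1.0 - prior_rel) * fallout
    # Precision is undefined when nothing is retrieved.
    precision = prior_rel * recall / retrieved if retrieved > 0.0 else float("nan")
    return precision, recall


# Hypothetical parameters, chosen only for illustration.
weights, means, stds = [0.6, 0.4], [5.0, 8.0], [1.0, 1.5]
shape, scale, prior_rel = 2.0, 1.0, 0.1

for threshold in range(0, 13, 2):
    p, r = precision_recall(threshold, weights, means, stds, shape, scale, prior_rel)
    print(f"threshold={threshold:2d}  recall={r:.3f}  precision={p:.3f}")
```

Sweeping the threshold from high to low traces the curve from low recall to high recall. The number of mixture components is fixed by hand here, whereas the abstract describes selecting model complexity automatically.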
Keywords
Score distributions, Gaussian mixtures, Variational inference, Recall-precision curves
Notes
Acknowledgments
We would like to thank Avi Arampatzis, Jaap Kamps and Stephen Robertson for many useful discussions. Further, we gratefully acknowledge the support provided by NSF grants IIS-0533625 and IIS-0534482 and by the European Commission, which funded parts of this research within the Accurat project under contract number FP7-ICT-248347.