
A selective approach to index term weighting for robust information retrieval based on the frequency distributions of query terms

  • Ahmet Arslan
  • Bekir Taner Dinçer
Article

Abstract

A typical information retrieval (IR) system applies a single retrieval strategy to every information need of its users. However, the results of past IR experiments show that a particular retrieval strategy is, in general, good at fulfilling some types of information needs while failing to fulfil others, i.e., there is high variation in retrieval effectiveness across information needs. The same results also show that an information need that one retrieval strategy fails to fulfil could often be fulfilled by another existing strategy. The challenge is therefore to determine in advance which retrieval strategy should be applied to which information need. This challenge is related to the robustness of IR systems in retrieval effectiveness: for an IR system, robustness can be defined as fulfilling every information need of users with an acceptable level of satisfaction. Maintaining robustness in retrieval effectiveness is a long-standing challenge, and in this article we propose a simple but powerful method as a remedy. The method is a selective approach to index term weighting: for any given query (i.e., information need), it predicts the “best” term weighting model amongst a set of alternatives, on the basis of the frequency distributions of the query terms in a target document collection. To predict the best term weighting model, the method uses the Chi-square statistic of the Chi-square goodness-of-fit test. The results of experiments performed using the official query sets of the TREC Web track and the Million Query track reveal that, in general, the frequency distributions of query terms provide relevant information about the retrieval effectiveness of term weighting models. In particular, the results show that the selective approach proposed in this article is, on average, more effective and more robust than the most effective single term weighting model.
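The abstract does not spell out the selection mechanism, but the general idea can be illustrated with a minimal sketch. The Python code below computes a Pearson Chi-square goodness-of-fit statistic for each query term's within-document frequency distribution against a Poisson reference fitted to the sample mean, then maps the averaged statistic to one of two candidate weighting models. The Poisson reference, the decision threshold, and the BM25/PL2 candidate pair are all illustrative assumptions, not details taken from the article.

```python
import math
from collections import Counter

def chi_square_gof(tf_per_doc, bins=10):
    # Pearson chi-square goodness-of-fit statistic of a term's observed
    # within-document frequency distribution against a Poisson model
    # fitted to the sample mean; the top bin collects the tail so the
    # expected probabilities sum to 1. (Poisson is an assumed reference.)
    n = len(tf_per_doc)
    mean_tf = sum(tf_per_doc) / n
    observed = Counter(min(tf, bins) for tf in tf_per_doc)
    chi2, tail = 0.0, 1.0
    for k in range(bins):
        p_k = math.exp(-mean_tf) * mean_tf ** k / math.factorial(k)
        tail -= p_k
        if n * p_k > 0:
            chi2 += (observed.get(k, 0) - n * p_k) ** 2 / (n * p_k)
    if tail > 0:  # last bin: P(X >= bins) under the Poisson model
        chi2 += (observed.get(bins, 0) - n * tail) ** 2 / (n * tail)
    return chi2

def select_weighting_model(query_terms, tf_index, threshold=50.0):
    # Map the average per-term statistic to one of two candidate models.
    # The threshold and the candidate set are hypothetical placeholders,
    # not values reported in the article.
    stats = [chi_square_gof(tf_index[t]) for t in query_terms]
    return "BM25" if sum(stats) / len(stats) < threshold else "PL2"

# Toy usage: tf_index maps each term to its frequency in every document
# where it occurs (hypothetical data).
tf_index = {"selective": [1, 2, 1, 1, 3], "weighting": [4, 1, 1, 2, 6, 1]}
print(select_weighting_model(["selective", "weighting"], tf_index))
```

In this sketch, a small statistic means the term's distribution is close to the assumed random reference, while a large one signals a bursty, topic-bearing term; the point of the selective approach is that such distributional evidence, available before retrieval, can steer the choice of weighting model per query.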

Keywords

Chi-square goodness-of-fit · Index term weighting · Robustness in retrieval effectiveness · Selective information retrieval

Acknowledgements

This work is supported by the TÜBİTAK Scientific and Technological Research Projects Funding Program under Grant 114E558. Any opinions, findings and conclusions or recommendations expressed in this material are the authors’ and do not necessarily reflect those of the sponsor.


Copyright information

© Springer Nature B.V. 2018

Authors and Affiliations

  1. Computer Engineering Department, Eskişehir Technical University, Eskişehir, Turkey
  2. Computer Engineering Department, Muğla Sıtkı Koçman University, Muğla, Turkey
