A Study of Collection-Based Features for Adapting the Balance Parameter in Pseudo Relevance Feedback

  • Ye Meng
  • Peng Zhang
  • Dawei Song
  • Yuexian Hou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9460)


Pseudo-relevance feedback (PRF) is an effective technique for improving ad hoc retrieval performance. For PRF methods, optimizing the balance parameter between the original query model and the feedback model is an important but difficult problem. Traditionally, the balance parameter is tested manually and set to a fixed value across collections and queries. However, because collections and individual queries differ, this parameter should be tuned for each of them. Recent research has studied various query-based and feedback-document-based features to predict the optimal balance parameter for each query on a specific collection, using a learning approach based on logistic regression. In this paper, we hypothesize that the characteristics of collections are also important for this prediction. We propose and systematically investigate a series of collection-based features for queries, feedback documents, and candidate expansion terms. The experiments show that our method is competitive with state-of-the-art approaches in improving retrieval performance, particularly for cross-collection prediction.
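The role of the balance parameter can be illustrated with a minimal sketch. The abstract does not state the exact formula, so the code below assumes the linear interpolation commonly used in model-based feedback, (1 - alpha) * query model + alpha * feedback model, and a plain logistic function to map a feature vector to alpha. The function names, the feature values, and the weights are all hypothetical and are not taken from the paper.

```python
import math


def interpolate(query_model, feedback_model, alpha):
    """Mix two term distributions: (1 - alpha) * query + alpha * feedback.

    Assumed form of the balance step; alpha is the balance parameter.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    terms = set(query_model) | set(feedback_model)
    return {
        t: (1.0 - alpha) * query_model.get(t, 0.0)
        + alpha * feedback_model.get(t, 0.0)
        for t in terms
    }


def predict_alpha(features, weights, bias=0.0):
    """Logistic regression output in (0, 1), used here as the predicted alpha.

    The features and weights are placeholders, not the paper's features.
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical per-query feature vector and learned weights.
alpha = predict_alpha(features=[0.0, 0.0], weights=[1.0, 1.0])
expanded = interpolate(
    {"feedback": 1.0},
    {"feedback": 0.5, "relevance": 0.5},
    alpha,
)
```

Because both inputs are probability distributions and alpha lies in [0, 1], the interpolated model is again a distribution. A fixed alpha corresponds to the traditional setting described above, while a per-query prediction replaces it with the output of the learned model.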


Information retrieval · Pseudo-relevance feedback · Collection characteristics



This work is supported in part by the Chinese National Program on Key Basic Research Project (973 Program, grant Nos. 2013CB329304 and 2014CB744604), the Chinese 863 Program (grant No. 2015AA015403), the Natural Science Foundation of China (grant Nos. 61272265 and 61402324), and the Research Fund for the Doctoral Program of Higher Education of China (grant No. 20130032120044).



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin University, Tianjin, China
  2. The Computing Department, The Open University, Buckinghamshire, UK
