A Study of Collection-Based Features for Adapting the Balance Parameter in Pseudo Relevance Feedback

  • Conference paper
Information Retrieval Technology (AIRS 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9460)

Abstract

Pseudo-relevance feedback (PRF) is an effective technique for improving ad-hoc retrieval performance. For PRF methods, optimizing the balance parameter between the original query model and the feedback model is an important but difficult problem. Traditionally, the balance parameter is manually tuned and set to a fixed value across collections and queries. However, because collections and individual queries differ, this parameter should be tuned differently for each of them. Recent research has studied various query-based and feedback-document-based features to predict the optimal balance parameter for each query on a specific collection, through a learning approach based on logistic regression. In this paper, we hypothesize that characteristics of collections are also important for the prediction. We propose and systematically investigate a series of collection-based features for queries, feedback documents and candidate expansion terms. The experiments show that our method is competitive with state-of-the-art approaches in improving retrieval performance, particularly for cross-collection prediction.
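
To make the role of the balance parameter concrete, the following is a minimal sketch, not the paper's implementation. It assumes the linear interpolation commonly used in model-based feedback, where the expanded query model is (1 - alpha) times the original query model plus alpha times the feedback model, and it assumes the balance parameter is predicted as a logistic function of a weighted feature sum. The function names, feature names, weights and probabilities are all hypothetical and chosen only for illustration.

```python
import math


def interpolate(query_model, feedback_model, alpha):
    """Mix two unigram models: (1 - alpha) * query + alpha * feedback.

    Assumed interpolation form; alpha is the balance parameter.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    terms = set(query_model) | set(feedback_model)
    return {
        t: (1.0 - alpha) * query_model.get(t, 0.0)
        + alpha * feedback_model.get(t, 0.0)
        for t in terms
    }


def predict_alpha(features, weights, bias):
    """Logistic function of a weighted feature sum; the result is in (0, 1)."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical models and features, for illustration only.
query_model = {"jaguar": 0.5, "speed": 0.5}
feedback_model = {"jaguar": 0.4, "cat": 0.3, "speed": 0.2, "prey": 0.1}
features = {"query_length": 2.0, "feedback_entropy": 1.5}
weights = {"query_length": -0.2, "feedback_entropy": 0.3}

alpha = predict_alpha(features, weights, bias=0.0)
expanded_model = interpolate(query_model, feedback_model, alpha)
```

With alpha near 0 the expanded model stays close to the original query; with alpha near 1 it is dominated by the feedback documents. Predicting alpha per query, rather than fixing it, is the setting the abstract describes; in practice the weights would be learned from training queries rather than set by hand.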

Acknowledgments

This work is supported in part by the Chinese National Program on Key Basic Research Project (973 Program, grant Nos. 2013CB329304 and 2014CB744604), the Chinese 863 Program (grant No. 2015AA015403), the Natural Science Foundation of China (grant Nos. 61272265 and 61402324), and the Research Fund for the Doctoral Program of Higher Education of China (grant No. 20130032120044).

Author information

Corresponding author

Correspondence to Dawei Song.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Meng, Y., Zhang, P., Song, D., Hou, Y. (2015). A Study of Collection-Based Features for Adapting the Balance Parameter in Pseudo Relevance Feedback. In: Zuccon, G., Geva, S., Joho, H., Scholer, F., Sun, A., Zhang, P. (eds) Information Retrieval Technology. AIRS 2015. Lecture Notes in Computer Science, vol 9460. Springer, Cham. https://doi.org/10.1007/978-3-319-28940-3_21

  • DOI: https://doi.org/10.1007/978-3-319-28940-3_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-28939-7

  • Online ISBN: 978-3-319-28940-3

  • eBook Packages: Computer Science, Computer Science (R0)
