Applying the User-over-Ranking Hypothesis to Query Formulation

  • Matthias Hagen
  • Benno Stein
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6931)


The User-over-Ranking hypothesis states that the best retrieval performance can be achieved with queries returning about as many results as can be considered at the user site [21]. We apply this hypothesis to Lee et al.'s problem of automatically formulating a single promising query from a given set of keywords [16]. Lee et al.'s original approach requires unrestricted access to the retrieval system's index and manual parameter tuning for each keyword set. It is therefore not applicable on a larger scale, let alone in web search scenarios. By applying the User-over-Ranking hypothesis we overcome this restriction and present a fully automatic user-site heuristic for web query formulation from given keywords. Substantial performance gains, with runtime improvements of up to 60% over previous approaches for similar problems, underpin the value of our approach.
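The idea behind the hypothesis can be illustrated with a small sketch. It is not the heuristic presented in the paper: it simply enumerates all keyword subsets and keeps the one whose result count comes closest to, without exceeding, the number k of results a user can consider. The names `pick_query`, `toy_count`, and `DOCS`, and the toy collection itself, are assumptions made for illustration. In a web setting the count would come from a search engine's result-count estimate rather than from a local collection.

```python
from itertools import combinations
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical toy collection: each document is a set of terms.
DOCS = [
    {"query", "formulation", "web"},
    {"query", "ranking"},
    {"web", "search", "query"},
    {"web", "search"},
    {"formulation", "keywords"},
]


def toy_count(terms: Sequence[str]) -> int:
    """Stand-in for a result-count estimate: documents containing all terms."""
    wanted = set(terms)
    return sum(1 for doc in DOCS if wanted <= doc)


def pick_query(
    keywords: Sequence[str],
    result_count: Callable[[Sequence[str]], int],
    k: int,
) -> Optional[Tuple[str, ...]]:
    """Return the keyword subset whose result count is largest while still <= k.

    Brute force over all non-empty subsets (exponential in the number of
    keywords), so this is only meant for small illustrative inputs.
    Ties are broken in favour of the subset found first (fewest keywords).
    """
    best: Optional[Tuple[str, ...]] = None
    best_count = 0
    for size in range(1, len(keywords) + 1):
        for subset in combinations(keywords, size):
            n = result_count(subset)
            # Keep queries that return something, but not more than k results.
            if best_count < n <= k:
                best, best_count = subset, n
    return best


if __name__ == "__main__":
    print(pick_query(["query", "web", "search"], toy_count, k=2))
```

With k = 2 the single keyword "search" already returns exactly two documents, so adding further keywords cannot improve the match. With k = 1 a two-keyword query is needed to narrow the result list far enough.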


Keywords: Search Engine, Yield Factor, Result List, Query Formulation, Internal Estimation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Balasubramanian, N., Kumaran, G., Carvalho, V.R.: Exploring reductions for long web queries. In: Proceedings of SIGIR 2010, pp. 571–578 (2010)
  2. Bar-Yossef, Z., Gurevich, M.: Random sampling from a search engine's index. Journal of the ACM 55(5) (2008)
  3. Barker, K., Cornacchia, N.: Using noun phrase heads to extract document keyphrases. In: Proceedings of AI 2000, pp. 40–52 (2000)
  4. Bendersky, M., Croft, W.B.: Discovering key concepts in verbose queries. In: Proceedings of SIGIR 2008, pp. 491–498 (2008)
  5. Carmel, D., Yom-Tov, E., Darlow, A., Pelleg, D.: What makes a query difficult? In: Proceedings of SIGIR 2006, pp. 390–397 (2006)
  6. Cronen-Townsend, S., Zhou, Y., Croft, W.B.: Predicting query performance. In: Proceedings of SIGIR 2002, pp. 299–306 (2002)
  7. Hagen, M., Stein, B.: Search strategies for keyword-based queries. In: Proceedings of DEXA 2010 Workshop TIR 2010, pp. 37–41 (2010)
  8. Hagen, M., Rüb, T., Stein, B.: Query session detection as a cascade. In: Proceedings of ECIR 2011 Workshop SIR 2011 (2011)
  9. Hauff, C., Hiemstra, D., de Jong, F.: A survey of pre-retrieval query performance predictors. In: Proceedings of CIKM 2008, pp. 1419–1420 (2008)
  10. He, B., Ounis, I.: Inferring query performance using pre-retrieval predictors. In: Apostolico, A., Melucci, M. (eds.) SPIRE 2004. LNCS, vol. 3246, pp. 43–54. Springer, Heidelberg (2004)
  11. Huston, S., Croft, W.B.: Evaluating verbose query processing techniques. In: Proceedings of SIGIR 2010, pp. 291–298 (2010)
  12. Kumaran, G., Allan, J.: Adapting information retrieval systems to user queries. Information Processing and Management 44(6), 1838–1862 (2008)
  13. Kumaran, G., Allan, J.: Effective and efficient user interaction for long queries. In: Proceedings of SIGIR 2008, pp. 11–18 (2008)
  14. Kumaran, G., Carvalho, V.R.: Reducing long queries using query quality predictors. In: Proceedings of SIGIR 2009, pp. 564–571 (2009)
  15. Lease, M., Allan, J., Croft, W.B.: Regression rank: Learning to meet the opportunity of descriptive queries. In: Boughanem, M., Berrut, C., Mothe, J., Soule-Dupuy, C. (eds.) ECIR 2009. LNCS, vol. 5478, pp. 90–101. Springer, Heidelberg (2009)
  16. Lee, C.-J., Chen, R.-C., Kao, S.-H., Cheng, P.-J.: A term dependency-based approach for query terms ranking. In: Proceedings of CIKM 2009, pp. 1267–1276 (2009)
  17. Luo, G., Tang, C., Yang, H., Wei, X.: MedSearch: a specialized search engine for medical information retrieval. In: Proceedings of CIKM 2008, pp. 143–152 (2008)
  18. Pass, G., Chowdhury, A., Torgeson, C.: A picture of search. In: Proceedings of Infoscale, paper 1 (2006)
  19. Shapiro, J., Taksa, I.: Constructing web search queries from the user's information need expressed in a natural language. In: Proceedings of SAC 2003, pp. 1157–1162 (2003)
  20. Stein, B., Hagen, M.: Making the most of a web search session. In: Proceedings of WI-IAT 2010, pp. 90–97 (2010)
  21. Stein, B., Hagen, M.: Introducing the user-over-ranking hypothesis. In: Clough, P., Foley, C., Gurrin, C., Jones, G.J.F., Kraaij, W., Lee, H., Murdock, V. (eds.) ECIR 2011. LNCS, vol. 6611, pp. 503–509. Springer, Heidelberg (2011)

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Matthias Hagen (1)
  • Benno Stein (1)
  1. Faculty of Media, Bauhaus-Universität Weimar, Germany
