Journal of Computer Science and Technology, Volume 25, Issue 4, pp 728–738

Learning Query Ambiguity Models by Using Search Logs

  • Ruihua Song
  • Zhicheng Dou
  • Hsiao-Wuen Hon
  • Yong Yu
Regular Paper

DOI: 10.1007/s11390-010-9360-y

Cite this article as:
Song, R., Dou, Z., Hon, HW. et al. J. Comput. Sci. Technol. (2010) 25: 728. doi:10.1007/s11390-010-9360-y

Abstract

Identifying ambiguous queries is crucial to research on personalized Web search and search result diversity. Intuitively, query logs contain valuable information on how many intentions users have when issuing a query. However, previous work showed that user clicks alone are misleading in judging whether a query is ambiguous. In this paper, we address the problem of learning a query ambiguity model by using search logs. First, we propose enriching a query by mining the documents clicked by users and the relevant follow-up queries in a session. Second, we use a text classifier to map the documents and the queries into predefined categories. Third, we propose extracting features from the processed data. Finally, we apply a state-of-the-art algorithm, Support Vector Machine (SVM), to learn a query ambiguity classifier. Experimental results verify that the sole use of click-based features or session-based features performs worse than the previous work based on top retrieved documents. When we combine the two sets of features, our proposed approach achieves the best effectiveness, specifically 86% in terms of accuracy. It significantly improves on the click-based method by 5.6% and on the session-based method by 4.6%.
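
The feature extraction step described in the abstract can be illustrated with a minimal sketch. It assumes that a text classifier has already assigned a category label to each clicked document and each follow-up query. The function names (`category_entropy`, `ambiguity_features`), the category labels, and the choice of entropy and distinct-category counts as features are all assumptions made for illustration; the abstract does not specify the actual feature set.

```python
import math
from collections import Counter


def category_entropy(categories):
    """Shannon entropy (in bits) of a list of category labels.

    Hypothetical feature: higher entropy means the labels are spread
    over more categories, which may hint at an ambiguous query.
    """
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum p * log2(1/p) over the observed categories.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def ambiguity_features(clicked_doc_categories, followup_query_categories):
    """Build a small feature vector for one query (illustrative only)."""
    return [
        category_entropy(clicked_doc_categories),        # click-based
        float(len(set(clicked_doc_categories))),         # click-based
        category_entropy(followup_query_categories),     # session-based
        float(len(set(followup_query_categories))),      # session-based
    ]


# Toy data: one query whose clicks and follow-ups stay in one category,
# and one whose clicks and follow-ups are split across two categories.
focused = ambiguity_features(["Computers"] * 4, ["Computers", "Computers"])
mixed = ambiguity_features(
    ["Computers", "Animals", "Computers", "Animals"],
    ["Animals", "Computers"],
)
print(focused)
print(mixed)
```

Vectors of this kind, combining click-based and session-based values, could then be passed to an SVM implementation together with ambiguity labels to train a classifier.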

Keywords

ambiguous query, log mining, query classification

Copyright information

© Springer 2010

Authors and Affiliations

  • Ruihua Song
    • 1
    • 2
  • Zhicheng Dou
    • 2
  • Hsiao-Wuen Hon
    • 2
  • Yong Yu
    • 1
  1. Department of Computer Science, Shanghai Jiao Tong University, Shanghai, China
  2. Microsoft Research Asia, Beijing, China
