Query Phrase Suggestion from Topically Tagged Session Logs

  • Eric C. Jensen
  • Steven M. Beitzel
  • Abdur Chowdhury
  • Ophir Frieder
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4027)


Searchers’ difficulty in formulating effective queries for their information needs is well known. Analysis of search session logs shows that users often pose short, vague queries and then struggle to revise them. Interactive query expansion, in which users select terms to add to their queries, dramatically improves effectiveness and satisfaction. Suggesting relevant candidate expansion terms based on the initial query enables users to satisfy their information needs faster. We find that suggesting the query phrases that other users have found it necessary to add to a given query (mined from session logs) dramatically improves the quality of suggestions over simply using co-occurrence. However, this exacerbates the sparseness problem faced when mining short queries that lack features. To mitigate this, we tag query phrases with higher-level topical categories to mine more general rules, finding that this enables us to make suggestions for approximately 10% more queries while maintaining an acceptable false positive rate.
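
The abstract does not spell out the mining procedure, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes a session is an ordered list of query strings, that an "added phrase" is whatever words a user appended or prepended to the previous query, and that a hypothetical lookup table `topic_of` maps a query to a topical category. All function names and the toy data are illustrative.

```python
from collections import Counter, defaultdict


def added_phrase(before, after):
    """Return the words appended or prepended to `before` to form `after`, else None."""
    b, a = before.split(), after.split()
    if len(a) <= len(b):
        return None
    if a[:len(b)] == b:          # words were appended
        return " ".join(a[len(b):])
    if a[-len(b):] == b:         # words were prepended
        return " ".join(a[:-len(b)])
    return None


def mine_rules(sessions, topic_of):
    """Count added phrases per exact query and per topical category (assumed tagging)."""
    query_rules = defaultdict(Counter)
    topic_rules = defaultdict(Counter)
    for session in sessions:
        for before, after in zip(session, session[1:]):
            phrase = added_phrase(before, after)
            if phrase is None:
                continue
            query_rules[before][phrase] += 1
            topic = topic_of.get(before)
            if topic is not None:
                topic_rules[topic][phrase] += 1
    return query_rules, topic_rules


def suggest(query, query_rules, topic_rules, topic_of, k=3):
    """Prefer rules mined for the exact query; back off to its topical category."""
    if query in query_rules:
        counts = query_rules[query]
    else:
        counts = topic_rules.get(topic_of.get(query), Counter())
    return [phrase for phrase, _ in counts.most_common(k)]


# Toy data, invented purely for illustration.
sessions = [
    ["jaguar", "jaguar dealer"],
    ["jaguar", "jaguar dealer"],
    ["jaguar", "jaguar price"],
    ["mustang", "used mustang"],
]
topic_of = {"jaguar": "autos", "mustang": "autos", "camaro": "autos"}

query_rules, topic_rules = mine_rules(sessions, topic_of)
print(suggest("jaguar", query_rules, topic_rules, topic_of, k=2))
print(suggest("camaro", query_rules, topic_rules, topic_of))
```

In this sketch the query "camaro" never appears in the toy log, so no query-level rule exists for it; the topical back-off is what allows any suggestion at all. That is the sparseness trade-off the abstract describes: more general rules cover more queries, at the risk of more false positives.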


Query Term; Query Expansion; Initial Query; Online Evaluation; Suggestion Query
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Eric C. Jensen (1)
  • Steven M. Beitzel (1)
  • Abdur Chowdhury (2)
  • Ophir Frieder (1)
  1. Information Retrieval Laboratory, Illinois Institute of Technology, Chicago, USA
  2. America Online, Inc., Sterling, USA
