World Wide Web, Volume 19, Issue 3, pp 475–497

Query intent mining with multiple dimensions of web search data

  • Di Jiang
  • Kenneth Wai-Ting Leung
  • Wilfred Ng
Article

Abstract

Understanding users’ latent intents behind search queries is critical for search engines. Hence, there has been increasing attention on how to effectively mine the intents of search queries by analyzing search engine query logs. However, we observe that the information richness of query logs has not been fully utilized so far, and this underuse heavily limits the performance of existing methods. In this paper, we tackle the problem of query intent mining by taking full advantage of the information richness of query logs from a multi-dimensional perspective. Specifically, we capture the latent relations between search queries via three different dimensions: the URL dimension, the session dimension and the term dimension. We first propose the Result-Oriented Framework (ROF), which is easy to implement and significantly improves both the precision and the recall of query intent mining. We further propose the Topic-Oriented Framework (TOF) in order to significantly reduce the online time and memory consumption of query intent mining. TOF employs the Query Log Topic Model (QLTM), which derives latent topics from the query log to integrate the information of the three dimensions in a principled way. The latent topics are considered as low-dimensional descriptions of the query relations and serve as the basis of efficient online query intent mining. We conduct extensive experiments on a query log from a major commercial search engine. Experimental results show that the two frameworks significantly outperform the state-of-the-art methods with respect to a variety of metrics.
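
To make the three dimensions concrete, the following is a minimal sketch, not the ROF or TOF method itself. It assumes a toy log of (session id, query, clicked URL) records and treats two queries as related when they share a clicked URL, a session, or a term. The log contents, the `related_pairs` helper and the example URLs are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy log: (session_id, query, clicked_url) records.
LOG = [
    ("s1", "jaguar price", "cars.example/jaguar"),
    ("s1", "jaguar dealer", "cars.example/dealers"),
    ("s2", "luxury car cost", "cars.example/jaguar"),
    ("s3", "jaguar habitat", "zoo.example/jaguar"),
]


def related_pairs(log, keys_of):
    """Return the set of query pairs that share at least one key.

    keys_of maps a log record to an iterable of keys (a URL, a session id,
    or the terms of the query). This is an illustrative assumption, not the
    relation definition used in the paper.
    """
    groups = defaultdict(set)
    for record in log:
        for key in keys_of(record):
            groups[key].add(record[1])
    pairs = set()
    for queries in groups.values():
        # Sort so each unordered pair has one canonical representation.
        pairs.update(combinations(sorted(queries), 2))
    return pairs


# URL dimension: queries that led to a click on the same URL.
url_pairs = related_pairs(LOG, lambda r: [r[2]])
# Session dimension: queries issued within the same session.
session_pairs = related_pairs(LOG, lambda r: [r[0]])
# Term dimension: queries that share at least one term.
term_pairs = related_pairs(LOG, lambda r: r[1].split())

print("URL:", url_pairs)
print("Session:", session_pairs)
print("Term:", term_pairs)
```

In this toy log the term dimension alone links "jaguar price" with "jaguar habitat", while the URL dimension links "jaguar price" with "luxury car cost" even though they share no term, which hints at why combining the dimensions can be more informative than using any single one.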

Keywords

Search engine, Query log, Topic model

Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Baidu, Inc., Beijing, People’s Republic of China
  2. Hong Kong University of Science and Technology, Hong Kong, People’s Republic of China
