A Probabilistic Topic Model with Social Tags for Query Reformulation in Informational Search

  • Yuqing Mao
  • Haifeng Shen
  • Chengzheng Sun
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7120)


It is non-trivial to formulate a query that precisely describes the goal of an informational search task. Query reformulation based on query clustering addresses this issue by expanding a new query with related existing queries that were generated by other users. However, the query clustering approach is unable to cluster queries that are intrinsically related but neither contain common terms nor return common clicked Web page URLs. More importantly, it does not address the issue of ranking retrieved results according to their relevance to the search goal. In this paper, we present a new query reformulation approach based on a novel probabilistic topic model that discovers the latent semantic relationships between queries and URLs. It can not only discover related queries that cannot be clustered by existing query clustering approaches but also rank retrieved results according to the similarity between the probability distributions over latent topics of the queries and the URLs. The results of our experiments show that this approach can significantly improve the performance of an informational search task in terms of search accuracy and search efficiency.
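
The idea of ranking results by comparing topic distributions can be illustrated with a small sketch. The abstract does not name the similarity measure or the model's inference procedure, so everything below is an assumption made for illustration: the Jensen-Shannon divergence as the comparison function, the function names `kl_divergence`, `js_divergence`, and `rank_urls`, the three-topic toy distributions, and the example.org URLs. It presumes that per-query and per-URL topic distributions have already been estimated by some topic model.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions.

    Terms with p_i == 0 contribute nothing and are skipped.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and 0 when p == q.

    Chosen here as an illustrative measure only; the paper's actual
    similarity function is not stated in the abstract.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def rank_urls(query_topics, url_topics):
    """Return (url, divergence) pairs, most similar URL first."""
    scored = [
        (url, js_divergence(query_topics, dist))
        for url, dist in url_topics.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical topic distributions over three latent topics.
query_topics = [0.7, 0.2, 0.1]
url_topics = {
    "https://example.org/a": [0.6, 0.3, 0.1],
    "https://example.org/b": [0.1, 0.1, 0.8],
    "https://example.org/c": [0.7, 0.2, 0.1],
}

ranked = rank_urls(query_topics, url_topics)
for url, score in ranked:
    print(f"{score:.4f}  {url}")
```

A lower divergence means the URL's topic mixture is closer to the query's, so the URL whose distribution matches the query exactly is listed first and the one dominated by a different topic is listed last. The same comparison could in principle be applied between two queries to find related queries that share no terms or clicked URLs.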





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Yuqing Mao (1, 2)
  • Haifeng Shen (2)
  • Chengzheng Sun (1)
  1. School of Computer Engineering, Nanyang Technological University, Singapore, Singapore
  2. School of Computer Science, Engineering & Mathematics, Flinders University, Adelaide, Australia
