Automatic Query Expansion Using Data Manifold

  • Lingpeng Yang
  • Donghong Ji
  • Yu Nie
  • Tingting He
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4182)


This paper proposes an automatic query expansion method that combines document re-ranking with standard Rocchio relevance feedback. The document re-ranking method ranks the top retrieved documents based on the intrinsic manifold structure collectively revealed by a large amount of data. This is done by using a semi-supervised learning algorithm to integrate pseudo relevant documents with the documents to be re-ranked. Given an initial ranked list of retrieved documents, the re-ranking approach picks a set of documents from the top of the list (together with the query itself) as pseudo relevant documents. The intrinsic relationship between all the retrieved documents to be re-ranked and the pseudo relevant documents can then be determined via the semi-supervised learning algorithm, even though no pseudo irrelevant documents are available. Finally, all the retrieved documents are re-ranked according to this relationship. Evaluation on benchmark corpora shows that the approach achieves much better performance than standard Rocchio relevance feedback and also outperforms other related approaches.
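The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the semi-supervised algorithm, so the sketch assumes a manifold-ranking iteration in the style of Zhou et al. (2004), cosine similarity as the affinity measure, and a positive-only Rocchio update. All function names, parameter values (`alpha`, `beta`, `iters`, `top_terms`) and the toy documents are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def manifold_rerank(vectors, seeds, alpha=0.9, iters=50):
    """Score items by iterating f <- alpha * S f + (1 - alpha) * y.

    vectors: sparse dicts for the query and the retrieved documents.
    seeds:   indices treated as pseudo relevant (label 1); others start at 0.
    alpha and iters are assumed values, not taken from the paper.
    """
    n = len(vectors)
    # Affinity matrix with a zero diagonal (assumption: cosine similarity).
    W = [[cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    deg = [sum(row) or 1.0 for row in W]  # guard against isolated items
    # Symmetric normalisation S = D^{-1/2} W D^{-1/2}.
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    y = [1.0 if i in seeds else 0.0 for i in range(n)]
    f = y[:]
    for _ in range(iters):
        f = [alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f

def rocchio_expand(query, relevant, alpha=1.0, beta=0.75, top_terms=5):
    """Positive-only Rocchio: q' = alpha * q + beta * centroid(relevant)."""
    centroid = {}
    for d in relevant:
        for t, w in d.items():
            centroid[t] = centroid.get(t, 0.0) + w / len(relevant)
    expanded = {t: alpha * w for t, w in query.items()}
    # Keep only the highest-weighted terms that are new to the query.
    new_terms = sorted((t for t in centroid if t not in query),
                       key=lambda t: (-centroid[t], t))[:top_terms]
    for t in list(query) + new_terms:
        expanded[t] = expanded.get(t, 0.0) + beta * centroid.get(t, 0.0)
    return expanded

# Toy data: index 0 is the query, indices 1..3 are retrieved documents.
query = {"manifold": 1.0, "ranking": 1.0}
docs = [
    {"manifold": 1.0, "ranking": 1.0, "graph": 1.0},
    {"ranking": 1.0, "graph": 1.0, "laplacian": 1.0},
    {"cooking": 1.0, "recipe": 1.0},
]
vectors = [query] + docs
# Seeds: the query itself plus the top initially retrieved document.
scores = manifold_rerank(vectors, seeds={0, 1})
order = sorted(range(len(docs)), key=lambda i: -scores[i + 1])
# Feed the top re-ranked documents into the Rocchio expansion.
expanded = rocchio_expand(query, [docs[i] for i in order[:2]])
```

In this toy run the off-topic third document shares no terms with the others, so it receives no score from the seeds and contributes nothing to the expanded query. The design choice being illustrated is that re-ranking happens before feedback, so Rocchio draws its expansion terms from a cleaner top of the list.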


Keywords (added by machine, not by the authors): Information Retrieval; Ranking Score; Query Expansion; Mean Average Precision; Initial Retrieval





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Lingpeng Yang (1)
  • Donghong Ji (1)
  • Yu Nie (1)
  • Tingting He (2)

  1. Institute for Infocomm Research, Singapore
  2. Huazhong Normal University, Wuhan, China
