Abstract
This chapter demonstrates how feature-based models can be extended and used for query expansion using a technique known as latent concept expansion (LCE). The approach has three key benefits: it goes beyond the bag-of-words assumption, it can employ arbitrary features during the query expansion process, and it can expand with a variety of concept types beyond unigrams. In addition to the basic LCE model, the chapter describes several extensions, including generalized LCE and LCE using hierarchical Markov random fields (MRFs) that encode document structure during the expansion process.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Metzler, D. (2011). Feature-Based Query Expansion. In: A Feature-Centric View of Information Retrieval. The Information Retrieval Series, vol 27. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22898-8_4
DOI: https://doi.org/10.1007/978-3-642-22898-8_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22897-1
Online ISBN: 978-3-642-22898-8
eBook Packages: Computer Science, Computer Science (R0)