Abstract
It has been widely observed that search engine queries are becoming longer and closer to natural language. However, current search engines do not perform well on natural language queries. Accurately discovering the key concepts of these queries can dramatically improve the effectiveness of search engines. It has been shown that users compose queries in much the same way that they summarize documents, which makes queries closely resemble anchor texts. In this paper, we present a technique for automatically extracting key concepts from queries using anchor text analysis. Rather than relying on web counts of documents, we propose a supervised machine learning model that classifies the concepts of a query into three sets according to their importance and type. Finally, we demonstrate that our method achieves a remarkable improvement over the baseline.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Liu, J., Ren, P. (2011). Key Concepts Identification and Weighting in Search Engine Queries. In: Du, X., Fan, W., Wang, J., Peng, Z., Sharaf, M.A. (eds) Web Technologies and Applications. APWeb 2011. Lecture Notes in Computer Science, vol 6612. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20291-9_37
DOI: https://doi.org/10.1007/978-3-642-20291-9_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20290-2
Online ISBN: 978-3-642-20291-9
eBook Packages: Computer Science (R0)