Key Concepts Identification and Weighting in Search Engine Queries

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6612)

Abstract

It has been widely observed that search engine queries are becoming longer and closer to natural language. However, current search engines do not perform well on natural language queries. Accurately identifying the key concepts in these queries can dramatically improve the effectiveness of search engines. It has been shown that users compose queries in much the same way that they summarize documents, which makes queries closely resemble anchor texts. In this paper, we present a technique for automatically extracting key concepts from queries through anchor text analysis. Rather than relying on web document counts, we propose a supervised machine learning model that classifies the concepts of a query into three sets according to their importance and type. Finally, we demonstrate that our method achieves a remarkable improvement over the baseline.
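To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes two hypothetical anchor-text features per candidate concept (how often the concept occurs inside anchor texts, and how often an anchor text equals the concept exactly) and uses a nearest-centroid classifier as a simple stand-in for the supervised model, which the abstract does not specify. The class names `high`, `medium`, and `low`, the training values, and all function names are illustrative assumptions.

```python
from collections import defaultdict
from math import dist


def anchor_features(concept, anchor_texts):
    """Return (containment rate, exact-match rate) of a concept over anchor texts.

    Both features are hypothetical; the abstract does not list the real ones.
    """
    if not anchor_texts:
        return (0.0, 0.0)
    concept = concept.lower()
    padded = f" {concept} "
    total = len(anchor_texts)
    # Pad with spaces so that matches respect whitespace word boundaries.
    contains = sum(padded in f" {a.lower()} " for a in anchor_texts)
    exact = sum(a.lower() == concept for a in anchor_texts)
    return (contains / total, exact / total)


def train_centroids(examples):
    """Average the feature vectors of each class: label -> centroid."""
    grouped = defaultdict(list)
    for feats, label in examples:
        grouped[label].append(feats)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(feats, centroids):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(feats, centroids[label]))


# Made-up labeled examples standing in for real training data.
training = [
    ((0.8, 0.4), "high"), ((0.6, 0.3), "high"),
    ((0.3, 0.0), "medium"), ((0.2, 0.05), "medium"),
    ((0.0, 0.0), "low"), ((0.05, 0.0), "low"),
]
centroids = train_centroids(training)

# Made-up anchor texts standing in for a crawled anchor-text collection.
anchors = ["python tutorial", "python", "learn python online",
           "official docs", "python"]

labels = {c: classify(anchor_features(c, anchors), centroids)
          for c in ["python", "tutorial", "how to"]}
print(labels)
```

With these toy inputs, a concept that appears in most anchor texts lands in the `high` set, while a phrase that never appears in anchor texts lands in `low`. A real system would replace both the features and the classifier with those described in the full paper.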

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Liu, J., Ren, P. (2011). Key Concepts Identification and Weighting in Search Engine Queries. In: Du, X., Fan, W., Wang, J., Peng, Z., Sharaf, M.A. (eds) Web Technologies and Applications. APWeb 2011. Lecture Notes in Computer Science, vol 6612. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20291-9_37

  • DOI: https://doi.org/10.1007/978-3-642-20291-9_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20290-2

  • Online ISBN: 978-3-642-20291-9

  • eBook Packages: Computer Science, Computer Science (R0)
