Abstract
Understanding what the user is looking for is at the heart of delivering a quality search experience. The focus of this chapter is on obtaining semantically enriched representations of search queries with the help of knowledge repositories. Specifically, we (1) identify the types or categories of entities that are targeted by the query, (2) recognize specific entity mentions in queries and annotate them with unique identifiers from the underlying knowledge repository, and (3) automatically generate query templates from a search log, which can then provide structured interpretations of queries.
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2018 The Editor(s) (if applicable) and the Author(s)
About this chapter
Cite this chapter
Balog, K. (2018). Understanding Information Needs. In: Entity-Oriented Search. The Information Retrieval Series, vol 39. Springer, Cham. https://doi.org/10.1007/978-3-319-93935-3_7
DOI: https://doi.org/10.1007/978-3-319-93935-3_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93933-9
Online ISBN: 978-3-319-93935-3
eBook Packages: Computer Science, Computer Science (R0)