Abstract
In information retrieval (IR) systems, the manifestation of relevance is pivotal and is regularly based on document-term statistics, e.g. term frequency (tf) and inverse document frequency (idf). Query Term Proximity (QTP) within matched documents remains largely under-explored for relevance estimation in IR. In this paper, we systematically review the lineage of the notion of QTP in IR and propose a novel framework for relevance estimation. The proposed framework, referred to as Adaptive QTP based User Information Retrieval (AQtpUIR), is intended to promote a document’s relevance among all relevant retrieved documents. Here, the relevance estimate is a weighted combination of document-term (DT) statistics and query-term (QT) statistics. The notion of ‘term-term query proximity’ is a simple aggregation of contextual aspects of user search in relevance estimates and query formation. Intuitively, QTP is exploited to promote documents for balanced exploitation-exploration and eventually to navigate a search towards its goals. The design analysis asserts the usability of QTP measures for balancing several seeking tradeoffs, e.g. relevance, novelty, and result diversity (coverage and topicality), and highlights various inherent challenges and issues of the proposed work.
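To make the idea of a weighted combination of DT and QT statistics concrete, the following is a minimal sketch and not the AQtpUIR formulation itself, which the abstract does not specify. It assumes a smoothed tf-idf sum as the DT statistic, the average inverse minimum distance between pairs of query terms as the QT statistic, and a hypothetical mixing weight `alpha`; all function and parameter names are illustrative.

```python
import math
from itertools import combinations


def tf_idf_score(query_terms, doc_tokens, doc_freq, num_docs):
    """DT statistic (assumed): sum of tf * smoothed idf over distinct query terms."""
    score = 0.0
    for term in set(query_terms):
        tf = doc_tokens.count(term)
        if tf == 0:
            continue
        # Smoothed idf so that terms occurring in every document still contribute.
        idf = math.log((num_docs + 1) / (doc_freq.get(term, 0) + 1)) + 1.0
        score += tf * idf
    return score


def proximity_score(query_terms, doc_tokens):
    """QT statistic (assumed): mean of 1 / min distance over query-term pairs."""
    positions = {
        term: [i for i, tok in enumerate(doc_tokens) if tok == term]
        for term in set(query_terms)
    }
    pairs = list(combinations(sorted(positions), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        if positions[a] and positions[b]:
            # Distinct terms never share a position, so min_dist >= 1.
            min_dist = min(abs(i - j) for i in positions[a] for j in positions[b])
            total += 1.0 / min_dist
    return total / len(pairs)


def relevance(query_terms, doc_tokens, doc_freq, num_docs, alpha=0.7):
    """Weighted combination; alpha is a hypothetical weight, not from the paper."""
    dt = tf_idf_score(query_terms, doc_tokens, doc_freq, num_docs)
    qt = proximity_score(query_terms, doc_tokens)
    return alpha * dt + (1.0 - alpha) * qt
```

Under these assumptions, two documents with identical term frequencies are separated only by how close the query terms appear to each other, which is the effect the abstract describes as promoting a document among the relevant retrieved ones.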
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Barik, T., Singh, V. (2020). Placing Query Term Proximity in Search Context. In: Bhattacharjee, A., Borgohain, S., Soni, B., Verma, G., Gao, XZ. (eds) Machine Learning, Image Processing, Network Security and Data Sciences. MIND 2020. Communications in Computer and Information Science, vol 1240. Springer, Singapore. https://doi.org/10.1007/978-981-15-6315-7_1
DOI: https://doi.org/10.1007/978-981-15-6315-7_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-6314-0
Online ISBN: 978-981-15-6315-7
eBook Packages: Computer Science (R0)