Abstract
In information retrieval systems, relevance manifestation is pivotal and is typically based on document-term statistics such as term frequency (tf) and inverse document frequency (idf). Query term proximity (QTP) within matched documents remains largely under-explored. This paper proposes a novel information retrieval framework that promotes documents among all relevant retrieved ones. Relevance is estimated as a weighted combination of document statistics and query term statistics. Term-term proximity simply aggregates the diverse aspects of user preference expressed in query formation, and is therefore adapted into the framework alongside conventional relevance measures. Intuitively, QTP is exploited to promote documents for balanced exploitation and exploration, and eventually to navigate a search towards its goals. The evaluation supports the usability of QTP measures for balancing several information-seeking tradeoffs, e.g. relevance, novelty, result diversity (coverage and topicality), and overall retrieval. The assessment of user search trails indicates significant growth in learning outcome.
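As a rough illustration of the weighted combination described above, the following minimal sketch blends a tf-idf score with a simple query term proximity score. It is not the authors' formulation: the function names, the smoothed idf variant, the inverse minimum-distance proximity measure, and the weight `alpha` are all assumptions chosen only to make the idea concrete.

```python
import math
from collections import Counter
from itertools import combinations


def tf_idf_score(query, doc, corpus):
    """Sum of tf * idf over distinct query terms (one common smoothed variant)."""
    if not doc:
        return 0.0
    counts = Counter(doc)
    n_docs = len(corpus)
    score = 0.0
    for term in set(query):
        df = sum(1 for d in corpus if term in d)
        if df == 0:
            continue
        idf = math.log((n_docs + 1) / (df + 1)) + 1.0  # smoothed idf (assumption)
        score += (counts[term] / len(doc)) * idf
    return score


def proximity_score(query, doc):
    """Mean of 1 / (minimum token distance) over pairs of distinct query terms
    that both occur in the document. Hypothetical measure, for illustration only."""
    positions = {t: [i for i, w in enumerate(doc) if w == t] for t in set(query)}
    present = sorted(t for t, p in positions.items() if p)
    pairs = list(combinations(present, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        # Distinct terms never share a position, so the distance is at least 1.
        dist = min(abs(i - j) for i in positions[a] for j in positions[b])
        total += 1.0 / dist
    return total / len(pairs)


def combined_relevance(query, doc, corpus, alpha=0.5):
    """Weighted combination of document statistics and query term proximity.
    alpha is an assumed tuning weight in [0, 1]."""
    return alpha * tf_idf_score(query, doc, corpus) + (1 - alpha) * proximity_score(query, doc)


corpus = [
    "query term proximity improves retrieval".split(),
    "query processing is slow and the term index is large so proximity is ignored".split(),
]
query = ["term", "proximity"]
ranked = sorted(corpus, key=lambda d: combined_relevance(query, d, corpus), reverse=True)
```

In this toy corpus both documents contain both query terms, but the first places them next to each other, so it ranks higher under this sketch. Raising `alpha` shifts the ranking towards plain document statistics, and lowering it gives proximity more influence.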
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Singh, V., Dave, M. (2019). Improving Result Diversity Using Query Term Proximity in Exploratory Search. In: Madria, S., Fournier-Viger, P., Chaudhary, S., Reddy, P. (eds) Big Data Analytics. BDA 2019. Lecture Notes in Computer Science(), vol 11932. Springer, Cham. https://doi.org/10.1007/978-3-030-37188-3_5
DOI: https://doi.org/10.1007/978-3-030-37188-3_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37187-6
Online ISBN: 978-3-030-37188-3
eBook Packages: Computer Science, Computer Science (R0)