Abstract
In this paper, we consider the problem of document ranking in a non-traditional retrieval task called subtopic retrieval. This task involves promoting relevant documents that cover many subtopics of a query to early ranks, thus providing diversity within the ranking. In recent years, several approaches have been proposed to diversify retrieval results. These approaches can be classified into two main paradigms, depending on how the ranks of documents are revised to promote diversity. In the first paradigm, subtopic diversification is achieved implicitly, by choosing documents that differ from each other; in the second, it is achieved explicitly, by estimating the subtopics covered by documents. Within this context, we compare methods belonging to the two paradigms. Furthermore, we investigate possible strategies for integrating the two paradigms, with the aim of formulating a new ranking method for subtopic retrieval. We conduct a number of experiments to empirically validate and contrast the state-of-the-art approaches as well as instantiations of our integration approach. The results show that the integration approach outperforms state-of-the-art strategies with respect to a number of measures.
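To make the distinction between the two paradigms concrete, the following is a minimal sketch, not the method evaluated in the paper. It assumes a greedy reranker with a trade-off weight `lam`. The implicit variant penalises similarity to documents already selected, in the spirit of Maximal Marginal Relevance; the explicit variant rewards coverage of subtopics not yet covered. All function names, the Jaccard similarity, the toy scores, and the subtopic labels are hypothetical choices made for illustration only.

```python
def jaccard(a, b):
    """Jaccard similarity between two term sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def implicit_rerank(relevance, terms, k, lam=0.5):
    """Implicit diversification: prefer documents unlike those already ranked."""
    remaining, ranking = set(relevance), []
    while remaining and len(ranking) < k:
        def score(d):
            # Redundancy is the highest similarity to any already selected document.
            redundancy = max((jaccard(terms[d], terms[s]) for s in ranking), default=0.0)
            return lam * relevance[d] - (1 - lam) * redundancy
        best = max(sorted(remaining), key=score)  # sorted() makes ties deterministic
        ranking.append(best)
        remaining.remove(best)
    return ranking


def explicit_rerank(relevance, subtopics, k, lam=0.5):
    """Explicit diversification: prefer documents covering new estimated subtopics."""
    total = max(1, len(set().union(*subtopics.values())))
    remaining, ranking, covered = set(relevance), [], set()
    while remaining and len(ranking) < k:
        def score(d):
            # Novelty is the fraction of all subtopics that d would newly cover.
            novelty = len(subtopics[d] - covered) / total
            return lam * relevance[d] + (1 - lam) * novelty
        best = max(sorted(remaining), key=score)
        ranking.append(best)
        remaining.remove(best)
        covered |= subtopics[best]
    return ranking


# Toy data (hypothetical): d1 and d2 are near-duplicates, d3 covers another sense.
relevance = {"d1": 0.9, "d2": 0.85, "d3": 0.6}
terms = {
    "d1": {"jaguar", "car"},
    "d2": {"jaguar", "car", "speed"},
    "d3": {"jaguar", "cat"},
}
subtopics = {"d1": {"car"}, "d2": {"car"}, "d3": {"animal"}}

print(implicit_rerank(relevance, terms, k=3))      # ['d1', 'd3', 'd2']
print(explicit_rerank(relevance, subtopics, k=3))  # ['d1', 'd3', 'd2']
```

On this toy input both variants move `d3` above the more relevant but redundant `d2`. They reach that ordering from different evidence: document-to-document similarity in the implicit case, and estimated subtopic labels in the explicit case.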
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Leelanupab, T., Zuccon, G., Jose, J.M. (2010). When Two Is Better Than One: A Study of Ranking Paradigms and Their Integrations for Subtopic Retrieval. In: Cheng, PJ., Kan, MY., Lam, W., Nakov, P. (eds) Information Retrieval Technology. AIRS 2010. Lecture Notes in Computer Science, vol 6458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17187-1_15
DOI: https://doi.org/10.1007/978-3-642-17187-1_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17186-4
Online ISBN: 978-3-642-17187-1
eBook Packages: Computer Science, Computer Science (R0)