Abstract
Resource selection is a key task in distributed information retrieval, and many factors affect its performance. Learning to rank methods can combine features effectively and are widely used for document ranking in web search, but few have been explored for resource selection. In this paper, we propose LTRRS, a resource selection algorithm based on learning to rank. By analyzing the factors that affect the effectiveness of resource selection, we extract multi-scale features, including term matching features, topical relevance features, and central sample index (CSI) based features. By training a LambdaMART learning to rank model, LTRRS directly optimizes the NDCG metric of the resource ranking list. Experiments on the Sogou-QCL dataset show that LTRRS significantly outperforms the baseline methods on the NDCG and precision metrics.
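To make the optimization target concrete, the following is a minimal sketch of how NDCG@k can be computed for a ranked list of resources. It is not taken from the paper: the exponential gain form, the function names `dcg_at_k` and `ndcg_at_k`, and the example relevance labels are all assumptions chosen for illustration, and the paper may use a different gain or cutoff.

```python
import math
from typing import Sequence


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k positions.

    Assumes the common exponential gain (2^g - 1) with a log2 position discount.
    """
    return sum(
        (2.0 ** g - 1.0) / math.log2(rank + 2)
        for rank, g in enumerate(gains[:k])
    )


def ndcg_at_k(ranked_gains: Sequence[float], k: int) -> float:
    """NDCG@k: DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(ranked_gains, reverse=True), k)
    return dcg_at_k(ranked_gains, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance labels of five resources, in the order a ranker returned them.
predicted_order = [1, 3, 0, 2, 0]
score = ndcg_at_k(predicted_order, k=5)
print(f"NDCG@5 = {score:.4f}")
```

A ranking that places resources in descending order of relevance scores 1.0 under this definition, and any other ordering of the same labels scores lower, which is what makes NDCG usable as a list-level training objective.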
Acknowledgement
This research was supported by the Guangdong Natural Science Foundation (2015A030308017).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Wu, T., Liu, X., Dong, S. (2019). LTRRS: A Learning to Rank Based Algorithm for Resource Selection in Distributed Information Retrieval. In: Zhang, Q., Liao, X., Ren, Z. (eds) Information Retrieval. CCIR 2019. Lecture Notes in Computer Science, vol 11772. Springer, Cham. https://doi.org/10.1007/978-3-030-31624-2_5
DOI: https://doi.org/10.1007/978-3-030-31624-2_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31623-5
Online ISBN: 978-3-030-31624-2
eBook Packages: Computer Science, Computer Science (R0)