ECIR 2013: Advances in Information Retrieval, pp. 865-868
Distributed Information Retrieval and Applications
Abstract
Distributed Information Retrieval (DIR) is a generic area of research that brings together techniques, such as resource selection and results aggregation, for dealing with data that, for organizational or technical reasons, cannot be managed centrally. Existing and potential applications of DIR methods range from blog retrieval to aggregated search and from multimedia and multilingual retrieval to distributed Web search. In this tutorial we briefly discuss the main DIR phases, namely resource description, resource selection, results merging and results presentation. The main focus is on applications of DIR techniques: blog, expert and desktop search, aggregated search and personal meta-search, and multimedia and multilingual retrieval. We also discuss a number of potential applications of DIR techniques, such as distributed Web search, enterprise search and aggregated mobile search.