Information Retrieval, Volume 10, Issue 3, pp 297–319

Result merging methods in distributed information retrieval with overlapping databases


DOI: 10.1007/s10791-007-9023-y

Cite this article as:
Wu, S. & McClean, S. Inf Retrieval (2007) 10: 297. doi:10.1007/s10791-007-9023-y


In distributed information retrieval systems, documents frequently overlap across component databases. This paper presents an experimental investigation and evaluation of a group of result merging methods, including the shadow document method and the multi-evidence method, in an environment of overlapping databases. We assume that, apart from the resultant document lists (with either rankings or scores), no extra information about retrieval servers and text databases is available, which is the usual case for many applications on the Internet and the Web.

The experimental results show that the shadow document method and the multi-evidence method are the two best methods when overlap is high, while Round-robin is the best for low overlap. The experiments also show that [0,1] linear normalization is a better option than linear regression normalization for result merging in a heterogeneous environment.
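To make two of the terms above concrete, here is a minimal sketch of [0,1] linear normalization and of Round-robin merging. It is not the authors' implementation: the function names `zero_one_normalize` and `round_robin_merge` are hypothetical, and dropping later occurrences of a duplicate document is only one simple way to handle overlap, assumed here for illustration. It is not the shadow document method or the multi-evidence method, whose details the abstract does not give.

```python
from itertools import zip_longest

# Sentinel used to pad shorter lists during interleaving (illustrative choice).
_MISSING = object()


def zero_one_normalize(scores):
    """Linearly map raw scores onto [0, 1] (min-max normalization)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Degenerate case: all scores are equal. Mapping them all to 1.0
        # is an arbitrary assumption made for this sketch.
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def round_robin_merge(ranked_lists):
    """Interleave ranked lists of document ids, one rank position at a time.

    A document that appears in several lists is kept only at its first
    occurrence (an assumed, simplistic treatment of overlap).
    """
    merged, seen = [], set()
    for tier in zip_longest(*ranked_lists, fillvalue=_MISSING):
        for doc_id in tier:
            if doc_id is not _MISSING and doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged


if __name__ == "__main__":
    print(zero_one_normalize([2.0, 4.0, 6.0]))
    print(round_robin_merge([["a", "b", "c"], ["b", "d"], ["e"]]))
```

Round-robin uses only rank positions, so it needs no score normalization; normalization matters for score-based merging, where lists from heterogeneous servers must first be brought onto a common scale.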


Keywords: Result merging; Distributed information retrieval; Overlapping databases

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  1. School of Computing and Mathematics, University of Ulster, Northern Ireland, UK
