A Plugin Architecture Enabling Federated Search for Digital Libraries

  • Sergey Chernov
  • Christian Kohlschütter
  • Wolfgang Nejdl
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4312)


Today, users expect a variety of digital libraries to be searchable from a single Web page. The German Vascoda project provides this service for dozens of information sources. Its ultimate goal is to provide search quality close to the ranking of a central database containing documents from all participating libraries. Currently, however, the Vascoda portal is based on a non-cooperative metasearch approach, where results from sources are merged randomly and ranking quality is sub-optimal. In this paper, we describe a Lucene-based plugin which replaces this method with a truly federated search across different search engines, where the exchange of document statistics improves document ranking. Preliminary evaluation results show ranking quality equal to that of a centralized setup.
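
To illustrate why exchanging document statistics can make rankings from separate libraries comparable, the following is a minimal sketch and not the plugin described in the paper. It assumes each library reports its document count and per-term document frequencies, and it uses a simplified TF-IDF score rather than Lucene's actual scoring formula. The names `global_idf` and `score` are hypothetical.

```python
import math
from collections import Counter


def global_idf(library_stats):
    """Combine per-library statistics into collection-wide IDF values.

    library_stats: list of (doc_count, {term: document_frequency}) pairs,
    one pair per participating library (an assumed exchange format).
    """
    total_docs = sum(doc_count for doc_count, _ in library_stats)
    total_df = Counter()
    for _, df in library_stats:
        total_df.update(df)  # adds the per-term document frequencies
    # Simplified IDF; a real engine would apply its own smoothing.
    return {term: math.log(total_docs / df) + 1.0
            for term, df in total_df.items()}


def score(query_terms, doc_tf, idf):
    """Simplified TF-IDF score of one document for the query."""
    return sum(doc_tf.get(term, 0) * idf.get(term, 0.0)
               for term in query_terms)


# Two hypothetical libraries in which the term "lucene" is common
# in one and rare in the other.
stats = [
    (100, {"lucene": 50, "search": 10}),
    (100, {"lucene": 1, "search": 10}),
]
idf = global_idf(stats)
```

With purely local statistics, the same document would receive a much higher weight for "lucene" in the second library than in the first. With the shared `idf` table, identical documents score identically wherever they are stored, which is the property a centralized index has by construction.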


Keywords: Search Engine, Digital Library, Query Time, Vector Space Model, Search Quality





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Sergey Chernov (1)
  • Christian Kohlschütter (1)
  • Wolfgang Nejdl (1)
  1. L3S Research Center, University of Hannover, Hannover, Germany
