International Journal on Digital Libraries, Volume 5, Issue 2, pp 84–98

Scale and performance in semantic storage management of data grids

  • Stergios V. Anastasiadis
  • Syam Gadde
  • Jeffrey S. Chase
Regular contribution

DOI: 10.1007/s00799-004-0086-8

Cite this article as:
Anastasiadis, S., Gadde, S. & Chase, J. Int J Digit Libr (2005) 5: 84. doi:10.1007/s00799-004-0086-8

Abstract

Data grids are middleware systems that offer secure shared storage of massive scientific datasets over wide area networks. The main challenge in their design is to provide reliable storage, search, and transfer of numerous or large files over geographically dispersed heterogeneous platforms. The Storage Resource Broker (SRB) is an example of a system that provides these services and that has been deployed in multiple high-performance scientific projects during the past few years. In this paper, we take a detailed look at several of its functional features and examine its scalability using synthetic and trace-based workloads. Unlike traditional file systems, SRB uses a commodity database to manage both system- and user-defined metadata. We quantitatively evaluate this decision and draw insightful conclusions about its implications for the system architecture and performance characteristics. We find that the bulk transfer facilities of SRB demonstrate good scalability properties, and we identify the bottleneck resources across different data search and transfer tasks. We examine the sensitivity to several configuration parameters and provide details about how different internal operations contribute to the overall performance.

Keywords

Data grids, Middleware systems, Distributed storage systems, Semantic Web

Copyright information

© Springer-Verlag 2004

Authors and Affiliations

  • Stergios V. Anastasiadis (1, 2)
  • Syam Gadde (2)
  • Jeffrey S. Chase (1)
  1. Department of Computer Science, Duke University, Durham, USA
  2. Duke-UNC Brain Imaging and Analysis Center, Durham, USA