RDIM: A Self-adaptive and Balanced Distribution for Replicated Data in Scalable Storage Clusters

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3758)

Abstract

As storage systems scale from a few storage nodes to hundreds or thousands, data distribution and load balancing become increasingly important. We present a novel decentralized algorithm, RDIM (Replication Under Dynamic Interval Mapping), which maps replicated objects to a scalable collection of storage nodes. RDIM distributes objects evenly across nodes and, when nodes are added or removed, moves as few objects as possible to restore the balanced distribution. It supports weighted allocation and guarantees that no two replicas of an object are placed on the same node. Its time complexity and storage requirements compare favorably with known methods.
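The abstract states RDIM's goals but not its interval-mapping procedure, so the sketch below is not RDIM. It uses a different, well-known technique, weighted rendezvous (highest-random-weight) hashing, only to illustrate the same three properties: weighted allocation, replicas on distinct nodes, and limited data movement when membership changes. The names `place_replicas` and `_unit_hash`, the node labels, and the weights are all hypothetical.

```python
import hashlib
import math


def _unit_hash(object_id: str, node: str) -> float:
    """Map an (object, node) pair deterministically to a float strictly inside (0, 1)."""
    digest = hashlib.sha256(f"{object_id}|{node}".encode()).digest()
    value = int.from_bytes(digest[:8], "big") >> 12  # keep 52 bits
    return (value + 0.5) / 2**52


def place_replicas(object_id: str, weights: dict, replicas: int) -> list:
    """Pick `replicas` distinct nodes for an object (illustrative, not RDIM).

    Each node gets the score -weight / ln(u), where u is a per-pair
    pseudo-random value in (0, 1). Taking the top-scoring nodes selects
    each node with probability proportional to its weight, and the
    result is a list of distinct nodes by construction.
    """
    if replicas > len(weights):
        raise ValueError("more replicas requested than nodes available")
    scores = {
        node: -weight / math.log(_unit_hash(object_id, node))
        for node, weight in weights.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:replicas]


# Hypothetical cluster: node "c" has twice the capacity of the others.
cluster = {"a": 1.0, "b": 1.0, "c": 2.0, "d": 1.0}
placement = place_replicas("object-42", cluster, replicas=3)
```

Because each node's score depends only on the object and that node, removing a node changes the placement only of objects that had a replica on it, and adding a node changes only the placements that the new node joins. This captures the minimal-movement goal described above under these assumptions; it makes no claim about RDIM's actual time or storage costs.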

Keywords

Data Object; Storage Node; Mapping Storage; Data Replication; Balance Distribution
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  1. Institute of Computer, National University of Defense Technology, Changsha, China
