Novel Data Placement Algorithm for Distributed Storage System Based on Fault-Tolerant Domain

Abstract

The 3-replica redundancy strategy is widely used to solve the problem of data reliability in large-scale distributed storage systems. However, its storage capacity utilization is only 33%. In this paper, a data placement algorithm based on fault-tolerant domain (FTD) is proposed. Owing to the fine-grained design of the FTD, the data reliability of systems using two replicas is comparable to that of current mainstream systems using three replicas, and the capacity utilization is increased to 50%. Moreover, the proposed FTD provides a new concept for the design of distributed storage systems. Distributed storage systems can take FTDs as the units for data placement, data migration, data repair and so on. In addition, fault detection can be performed independently and concurrently within the FTDs.
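The capacity figures quoted in the abstract follow from simple arithmetic: under plain n-way replication, each unit of data occupies n units of raw storage, so the usable fraction is 1/n. The sketch below only illustrates that arithmetic. The function name `replica_utilization` is hypothetical, and the snippet is not the paper's placement algorithm.

```python
from fractions import Fraction


def replica_utilization(replicas: int) -> Fraction:
    """Fraction of raw capacity holding unique data under n-way replication.

    Assumes plain replication with no other overhead (illustrative only).
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return Fraction(1, replicas)


for n in (3, 2):
    share = float(replica_utilization(n))
    print(f"{n} replicas: {share:.0%} of raw capacity is usable")
```

Moving from three replicas to two therefore raises utilization from about 33% to 50%, which is the gain the abstract attributes to the FTD design when reliability is held comparable.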

Acknowledgement

We would like to express our sincere gratitude to Prof. DENG Dameng for providing guidance regarding the mathematical analysis in this paper.

Author information

Corresponding author

Correspondence to Xiaoyong Li.

Additional information

Foundation item: the Science and Technology Project of Minhang District in Shanghai (No. 2018MH331)

About this article

Cite this article

Shi, L., Wang, Z. & Li, X. Novel Data Placement Algorithm for Distributed Storage System Based on Fault-Tolerant Domain. J. Shanghai Jiaotong Univ. (Sci.) 26, 463–470 (2021). https://doi.org/10.1007/s12204-020-2253-5

Key words

  • data reliability
  • failure domain
  • fault-tolerant domain
  • data placement
  • storage system
  • distributed system

CLC number

  • TP 391

Document code

  • A