The 3-replica redundancy strategy is widely used to ensure data reliability in large-scale distributed storage systems, but its storage capacity utilization is only about 33%. In this paper, a data placement algorithm based on fault-tolerant domains (FTDs) is proposed. Owing to the fine-grained design of the FTD, the data reliability of a system using two replicas is comparable to that of current mainstream systems using three replicas, while capacity utilization increases to 50%. Moreover, the proposed FTD offers a new concept for the design of distributed storage systems: FTDs can serve as the units for data placement, data migration, and data repair. In addition, fault detection can be performed independently and concurrently within each FTD.
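The utilization figures quoted above follow from simple arithmetic: with n full copies of every object, only 1/n of the raw capacity holds unique data. The sketch below illustrates that calculation only; it does not reproduce the paper's placement algorithm, and the function name `capacity_utilization` is a hypothetical helper introduced here for illustration.

```python
from fractions import Fraction


def capacity_utilization(replicas: int) -> Fraction:
    """Fraction of raw capacity that stores unique data under n-way replication.

    Hypothetical helper for illustration; assumes every object is stored
    as `replicas` full copies and ignores metadata overhead.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return Fraction(1, replicas)


for n in (3, 2):
    u = capacity_utilization(n)
    print(f"{n} replicas: {float(u):.1%} of raw capacity is usable")
```

Under these assumptions, moving from three replicas to two raises usable capacity from one third to one half of the raw capacity, which matches the 33% and 50% figures in the abstract.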
We would like to express our sincere gratitude to Prof. DENG Dameng for providing guidance regarding the mathematical analysis in this paper.
Foundation item: the Science and Technology Project of Minhang District in Shanghai (No. 2018MH331)
Shi, L., Wang, Z. & Li, X. Novel Data Placement Algorithm for Distributed Storage System Based on Fault-Tolerant Domain. J. Shanghai Jiaotong Univ. (Sci.) 26, 463–470 (2021). https://doi.org/10.1007/s12204-020-2253-5
Key words: data reliability, failure domain, fault-tolerant domain, data placement, storage system, distributed system
CLC number: TP 391