Mirrored and hybrid disk arrays and their reliability

Cluster Computing

Abstract

Replication and erasure coding are two alternative methods for disk arrays to deal with disk failures. This work concentrates on mirrored disk arrays, classified as RAID1, and hybrid disk arrays, which implement redundancy by storing XORed data blocks instead of replicas. We evaluate the reliability of disk arrays with and without repair using traditional reliability modeling techniques. A shortcut method based on asymptotic expansions is also used to compare the reliability of RAID(4+k) arrays with mirrored and hybrid disks. RAID1 with distributed redundancy attains more balanced disk loads and better performance than basic mirroring (BM) upon disk failure, but is less reliable than BM. Hybrid disk arrays with the same level of redundancy as RAID1 are more reliable than RAID1, but incur a higher cost for updates. Applying the asymptotic expansion method to hierarchical RAID shows that it is advantageous to associate higher redundancy with lower levels at the same overall redundancy overhead. It is also shown that sharing disk space between RAID1 and RAID5 in heterogeneous disk arrays (HDAs) may lower reliability. In addition to the classical rebuild model, we present an extension with a limited number of spares. Recovery methods based on reconfiguration from higher to lower reliability RAID arrays are also presented.
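The kind of comparison the abstract describes can be illustrated with a minimal sketch. It is not taken from the paper's full text, which is not visible in this preview. It assumes independent disk failures, no repair, and a common per-disk reliability r = 1 - eps. The function names and the parameters `pairs`, `n`, `k`, and `eps` are hypothetical and chosen only for this example. Under these assumptions, basic mirroring with M pairs loses data only when both disks of some pair fail, and an array of n disks that tolerates any k failures loses data on the (k+1)-th failure. Keeping only the lowest-order term in eps gives the shortcut approximations.

```python
from math import comb


def bm_reliability(pairs: int, r: float) -> float:
    """Exact reliability of basic mirroring with `pairs` mirrored pairs.

    Assumes independent failures and no repair: data survives as long as
    no pair has lost both of its disks.
    """
    return (1.0 - (1.0 - r) ** 2) ** pairs


def kft_reliability(n: int, k: int, r: float) -> float:
    """Exact reliability of an n-disk array tolerating any k disk failures.

    With k = 1 this models RAID5 and with k = 2 RAID6, i.e. RAID(4+k).
    """
    return sum(comb(n, i) * r ** (n - i) * (1.0 - r) ** i for i in range(k + 1))


def bm_shortcut(pairs: int, eps: float) -> float:
    """Lowest-order expansion in eps: R is about 1 - M * eps^2."""
    return 1.0 - pairs * eps ** 2


def kft_shortcut(n: int, k: int, eps: float) -> float:
    """Lowest-order expansion in eps: R is about 1 - C(n, k+1) * eps^(k+1)."""
    return 1.0 - comb(n, k + 1) * eps ** (k + 1)


if __name__ == "__main__":
    eps = 0.01          # assumed per-disk unreliability
    r = 1.0 - eps
    n = 8               # assumed total number of disks in each array
    print("BM    exact:", bm_reliability(n // 2, r), "shortcut:", bm_shortcut(n // 2, eps))
    print("RAID5 exact:", kft_reliability(n, 1, r), "shortcut:", kft_shortcut(n, 1, eps))
    print("RAID6 exact:", kft_reliability(n, 2, r), "shortcut:", kft_shortcut(n, 2, eps))
```

In this simplified model the lowest power of eps with a nonzero coefficient determines the ranking for small eps, and the coefficient breaks ties between arrays that tolerate the same number of failures. Note that the three arrays in the usage example do not have the same redundancy overhead, so the numbers only show how the expansion works, not a like-for-like comparison.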

Notes

  1. https://en.wikipedia.org/wiki/Solid-state_drive.

Abbreviations

BM: Basic mirroring
CD: Chained declustering
CTMC: Continuous time Markov chain
GRD: Group rotate declustering
HDA: Heterogeneous disk array
HRAID: Hierarchical RAID
ID: Interleaved declustering
LSI: LSI Logic's RAID array
MTTF: Mean time to failure
MTTR: Mean time to repair
MTTDL: Mean time to data loss
PDS: Parity defining set (for WEAVER codes)
RAID: Redundant array of independent disks
SADA: Self-adaptive disk array
SSPiRAL: Survivable storage using parity in redundant array layouts
XOR: eXclusive-OR

Acknowledgements

Dr. Jun Xu at NJIT and Dr. Yujie Tang at Shenzhen Institute of Advanced Technology (www.siat.ac.cn) collaborated on research topics covered in this paper.

Author information

Corresponding author

Correspondence to Alexander Thomasian.

About this article

Cite this article

Thomasian, A. Mirrored and hybrid disk arrays and their reliability. Cluster Comput 22 (Suppl 1), 2485–2494 (2019). https://doi.org/10.1007/s10586-018-2127-x

  • DOI: https://doi.org/10.1007/s10586-018-2127-x
