Performance, reliability, and performability of a hybrid RAID array and a comparison with traditional RAID1 arrays

Cluster Computing

Abstract

We describe a hybrid mirrored disk organization patented by LSI Logic Corp. and compare its performance, reliability, and performability with traditional mirrored RAID1 disk organizations and with RAID(4+k), k ≥ 1. LSI RAID has the same level of redundancy as mirrored disks, but it also utilizes parity coding. RAID1 organizations are generally 1 Disk Failure Tolerant (1DFT), since they cannot tolerate all two-disk failures. LSI RAID, like RAID6, is 2DFT, and in addition it can tolerate almost all three-disk failures. We list analytic expressions for the reliability of various RAID1 organizations and use enumeration when a reliability expression cannot be obtained analytically. An asymptotic expansion method based on disk unreliabilities is used for an easy comparison of RAID reliabilities. LSI RAID performance is evaluated with the Read-Modify-Write (RMW) and ReConstruct Write (RCW) methods for updating parities. A combination of the two methods is used to balance data and parity disk loads, which maximizes the I/O throughput. The analysis shows that LSI RAID is outperformed by basic mirroring in processing an OLTP workload, but that it outperforms RAID6. In spite of its higher Mean Time to Data Loss (MTTDL), LSI RAID is outperformed by other RAID1 organizations as far as performability is concerned, i.e., the number of I/Os carried out by the disk array operating at maximum I/Os Per Second (IOPS) until data loss occurs. A survey of RAID1 organizations and distributed replicated systems is also included.
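
To make the enumeration and asymptotic expansion ideas concrete, the following is a minimal sketch for basic mirroring only, not the authors' code. It assumes independent disks with identical reliability r, and all function names (`mirrored_pairs`, `survives`, `count_survivable`, `reliability`) are hypothetical. The LSI RAID layout itself is not modeled, since the abstract does not specify it.

```python
from itertools import combinations


def mirrored_pairs(m):
    """Basic mirroring with m pairs: disks 2j and 2j+1 hold the same data."""
    return [(2 * j, 2 * j + 1) for j in range(m)]


def survives(failed, pairs):
    """No data loss unless both disks of some mirrored pair have failed."""
    return not any(a in failed and b in failed for a, b in pairs)


def count_survivable(n, pairs):
    """counts[i] = number of i-disk failure sets that cause no data loss."""
    return [
        sum(survives(set(c), pairs) for c in combinations(range(n), i))
        for i in range(n + 1)
    ]


def reliability(counts, r):
    """System reliability as a sum over survivable failure sets of each size."""
    n = len(counts) - 1
    return sum(a * r ** (n - i) * (1 - r) ** i for i, a in enumerate(counts))


if __name__ == "__main__":
    m = 4                      # illustrative array of 8 disks in 4 pairs
    counts = count_survivable(2 * m, mirrored_pairs(m))
    r = 0.99                   # assumed per-disk reliability
    eps = 1 - r                # disk unreliability
    print(counts)
    print(reliability(counts, r), (1 - eps ** 2) ** m, 1 - m * eps ** 2)
```

For four mirrored pairs the enumeration gives the counts 1, 8, 24, 32, 16 for zero to four failed disks and zero beyond that, which sums to the closed form (1 − ε²)^M with ε = 1 − r. The leading term of the unreliability, Mε², has a lowest power of two in ε, which reflects that basic mirroring is 1DFT, and its coefficient M is the number of fatal two-disk failure sets.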

Notes

  1. Figure 5 in [36] is different in that the system reliability is plotted versus decreasing disk reliabilities, not normalized time.

  2. We have corrected the first part of (7) in [35].

  3. The final result given here is slightly different from (9) in [35].

Abbreviations

BM: Basic Mirroring

CD: Chained Declustering

CRAID: Clustered RAID

Ddisk: Data disk

DoutD: Data out-Degree

GRD: Group Rotate Declustering

HRAID: Hierarchical RAID

HST: Head Settling Time

ID: Interleaved Declustering

IOPS: I/Os per Second

kDFT: k Disk Failure Tolerant

LSE: Latent Sector Error

MDS: Maximum Distance Separable

MTTDL: Mean Time to Data Loss

MTTF: Mean Time to Failure

OLTP: OnLine Transaction Processing

OSM: Orthogonal Striping and Mirroring

PCM: Permanent Customer Model

Pdisk: Parity disk

PinD: Parity in-Degree

RAID: Redundant Array of Independent Disks

RCW: ReConstruct Write

RMD: Rotated Mirrored Declustering

RMW: Read-Modify-Write

RPM: Rotations Per Minute

RS: Reed-Solomon code

SADA: Self-Adaptive Disk Array

SSPiRAL: Survivable Storage using Parity in Redundant Array Layouts

VSM: Vacationing Server Model

XOR: eXclusive OR

References

  1. Alvarez, G.A., Burkhard, W.A., Stockmeyer, L.J., Cristian, F.: Declustered disk array architectures with optimal and near-optimal parallelism. In: Proc. 25th Ann’l Int’l Symp. on Computer Architecture (ISCA 1998), Barcelona, Spain, June, pp. 109–120 (1998)

  2. Amer, A., Long, D.D.E., Paris, J.-F., Schwarz, T.: Increased reliability with SSPiRAL data layouts. In: Proc. 16th Int’l Symp. on Modeling, Analysis, and Simulation of Computer and Telecomm. Systems (MASCOTS’08), Baltimore, MD, Sept., pp. 189–198 (2008)

  3. Bachmat, E., Schindler, J.: Analysis of methods for scheduling low priority disk drive tasks. In: Proc. ACM SIGMETRICS Conf. on Measurement and Modeling of Computer Systems, Los Angeles, CA, June, pp. 55–65 (2002)

  4. Chen, P.M., Lee, E.K., Gibson, G.A., Katz, R.H., Patterson, D.A.: RAID: high-performance, reliable secondary storage. ACM Comput. Surv. 26(2), 145–185 (1994)

  5. Chen, S.-Z., Towsley, D.F.: A performance evaluation of RAID architectures. IEEE Trans. Comput. 45(10), 1116–1130 (1996)

  6. Chen, M.S., Hsiao, H.-I., Li, C.-S., Yu, P.S.: Using rotational mirrored declustering for replica placement in a disk-array-based video server. Multimed. Syst. 5(6), 371–379 (1997)

  7. Dholakia, A., Eleftheriou, E., Hu, X.-Y., Iliadis, I., Menon, J., Rao, K.K.: A new intra-disk redundancy scheme for high-reliability RAID storage systems in the presence of unrecoverable errors. ACM Trans. Storage 4(1), 1 (2008)

  8. Gibson, G.A.: Redundant Disk Arrays: Reliable, Parallel Secondary Storage. MIT Press, Cambridge (1992)

  9. Hafner, J.L., Deenadhayalan, V., Kanungo, T., Rao, K.K.: Performance metrics for erasure codes in storage systems. In: IBM research report RJ 10231, Almaden, CA, USA, August (2004)

  10. Hafner, J.L.: WEAVER codes: highly fault tolerant erasure codes for storage systems. In: Proc. 4th USENIX Conf. on File and Storage Technologies (FAST’05), San Francisco, CA, December, pp. 211–224 (2005)

  11. Haverkort, B.R., Marie, R., Rubino, R., Trivedi, K.S.: Performability Modelling: Techniques and Tools. Wiley, New York (2001)

  12. Hsiao, H.-I., DeWitt, D.J.: Chained declustering: a new availability strategy for multiprocessor database machines. In: Proc. IEEE Int’l Conf. on Data Engineering (ICDE’90), Los Angeles, CA, February, pp. 456–465 (1990)

  13. Hsiao, H.-I., DeWitt, D.J.: A performance study of three high available data replication strategies. Distrib. Parallel Databases 1(1), 53–80 (1993)

  14. Hwang, K., Jin, H., Ho, R.S.C.: Orthogonal striping and mirroring in distributed RAID for I/O-centric cluster computing. IEEE Trans. Parallel Distrib. Syst. 13(1), 26–44 (2002)

  15. Iliadis, I., Haas, R., Hu, X.-Y., Eleftheriou, E.: Disk scrubbing versus intradisk redundancy for RAID storage systems. ACM Trans. Storage 7(2), 5 (2011)

  16. Menon, J., Mattson, D.: Comparison of sparing alternatives for disk arrays. In: Proc. 19th Ann’l Int’l Symp. on Computer Architecture (ISCA 1992), Gold Coast, Australia, May, pp. 318–329 (1992)

  17. Merchant, A., Yu, P.S.: Analytic modeling and comparisons of striping strategies for replicated disk arrays. IEEE Trans. Comput. 44(3), 419–433 (1995)

  18. Merchant, A., Yu, P.S.: Analytic modeling of clustered RAID with mapping based on nearly random permutation. IEEE Trans. Comput. 45(3), 367–373 (1996)

  19. Muntz, R.R., Lui, J.C.S.: Performance analysis of disk arrays under failure. In: 16th Int’l Conf. on Very Large Data Bases, Brisbane, Queensland, Australia, August, pp. 162–173 (1990)

  20. Paris, J.-F., Schwarz, T.J.E., Long, D.D.E.: Self-adaptive disk arrays. In: Proc. 8th Int’l Symp. on Stabilization, Safety, and Security of Distributed Systems (SSS 2006), Dallas, TX, November, pp. 469–483 (2006)

  21. Park, C.-I.: Efficient placement of parity and data to tolerate two disk failures in disk array systems. IEEE Trans. Parallel Distrib. Syst. 6(11), 1177–1184 (1995)

  22. Schroeder, B., Gibson, G.A.: Understanding disk failure rates: what does an MTTF of 1,000,000 hours mean to you? ACM Trans. Storage 3(3), 8 (2007)

  23. Schroeder, B., Damouras, S., Gill, P.: Understanding latent sector errors and how to protect against them. ACM Trans. Storage 8(3), 8 (2010)

  24. Shang, P., Wang, J., Zhu, H., Gu, P.: A new placement-ideal layout for multiway replication storage system. IEEE Trans. Comput. 60(8), 1142–1156 (2011)

  25. Teradata: DBC/1012 database computer system manual release 2.0. Document No. C10-0001-02, Teradata Corp., November (1985)

  26. Kari, H.H.: Latent sector faults and reliability of disk arrays. Ph.D. thesis, University of Helsinki, Espoo, Finland (1997)

  27. Lee, E.K., Thekkath, C.A.: Petal: distributed virtual disks. In: Proc. 7th Int’l Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VII), Cambridge, MA, October, pp. 84–92 (1996)

  28. Li, M., Shu, J., Zheng, W.: GRID codes: strip-based erasure codes with high fault tolerance for storage systems. ACM Trans. Storage 4(4), 15 (2009)

  29. Sun, H., Tyan, T., Johnson, S., Elling, R., Talagala, N., Wood, R.B.: Performability analysis of storage systems in practice: methodology and tools. In: Proc. 3rd Int’l Service Availability Symp. (ISAS 2006). Helsinki, Finland, May 2006. Lecture Notes in Computer Science, vol. 4328, pp. 62–75. Springer, Berlin (2006) (Revised selected papers)

  30. Thomasian, A., Menon, J.: Performance analysis of RAID5 disk arrays with a vacationing server model for rebuild mode operation. In: Proc. IEEE Int’l Conf. on Data Engineering (ICDE’94), Houston, TX, February, pp. 111–119 (1994)

  31. Thomasian, A., Menon, J.: RAID5 performance with distributed sparing. IEEE Trans. Parallel Distrib. Syst. 8(6), 640–657 (1997)

  32. Thomasian, A.: Reconstruct versus read-modify writes in RAID. Inf. Process. Lett. 93(4), 163–168 (2005)

  33. Thomasian, A.: Clustered RAID arrays and their access costs. Comput. J. 48(6), 702–713 (2005)

  34. Thomasian, A.: Mirrored disk routing and scheduling. Clust. Comput. 9(4), 475–484 (2006)

  35. Thomasian, A.: Shortcut method for reliability comparisons in RAID5. J. Syst. Softw. 79(11), 1599–1605 (2006)

  36. Thomasian, A., Blaum, M.: Mirrored disk organization reliability analysis. IEEE Trans. Comput. 55(12), 1640–1644 (2006)

  37. Thomasian, A., Fu, G., Han, C.: Performance of two-disk failure-tolerant disk arrays. IEEE Trans. Comput. 56(6), 799–814 (2007)

  38. Thomasian, A., Xu, J.: Reliability and performance of mirrored disk organizations. Comput. J. 51(6), 615–629 (2008)

  39. Thomasian, A., Blaum, M.: Higher reliability redundant disk arrays: organization, operation, and coding. ACM Trans. Storage 5(3), 7 (2009)

  40. Thomasian, A.: Survey and analysis of disk scheduling methods. Comput. Archit. News 39(2), 8–25 (2011)

  41. Thomasian, A., Xu, J.: RAID level selection for heterogeneous disk arrays. Clust. Comput. 14(2), 115–127 (2011)

  42. Thomasian, A., Tang, Y.: Performance, reliability, and performability aspects of hierarchical RAID. In: Proc. 6th Int’l Conf. on Networking, Architecture, and Storage (NAS 2011), Dalian, China, July, pp. 92–101 (2011)

  43. Trivedi, K.S.: Probability and Statistics with Reliability, Queuing, and Computer Science Applications, 2nd edn. Wiley, New York (2001)

  44. Venkatesan, V., Iliadis, I., Hu, X.-Y., Haas, R., Fragouli, C.: Effect of replica placement on the reliability of large-scale data storage systems. In: Proc. 18th Ann’l IEEE/ACM Int’l Symp. on Modeling, Analysis and Simulation of Computer and Telecomm. Systems (MASCOTS’10), Miami, FL, August, pp. 79–88 (2010)

  45. Venkatesan, V., Iliadis, I., Fragouli, C., Urbanke, R.: Reliability of clustered vs. declustered replica placement in data storage systems. In: Proc. 19th Ann’l IEEE/ACM Int’l Symp. on Modeling, Analysis and Simulation of Computer and Telecomm. Systems (MASCOTS’11), Raffles Hotel, Singapore, August, pp. 307–317 (2011)

  46. Venkatesan, V., Iliadis, I., Haas, R.: Reliability of data storage systems under network rebuild bandwidth constraints. In: Proc. 20th Ann’l IEEE/ACM Int’l Symp. on Modeling, Analysis and Simulation of Computer and Telecomm. Systems (MASCOTS’12), Washington, D.C., August, pp. 79–88 (2012)

  47. Wilner, A.: Multiple drive failure tolerant RAID system. US Patent 6,327,672, December 2001

  48. Xu, L., Bohossian, V., Bruck, J., Wagner, D.G.: Low-density MDS codes and factors of complete graphs. IEEE Trans. Inf. Theory 45(6), 1817–1836 (1999)

Author information

Corresponding author

Correspondence to Alexander Thomasian.

About this article

Cite this article

Thomasian, A., Tang, Y. Performance, reliability, and performability of a hybrid RAID array and a comparison with traditional RAID1 arrays. Cluster Comput 15, 239–253 (2012). https://doi.org/10.1007/s10586-012-0216-9

  • DOI: https://doi.org/10.1007/s10586-012-0216-9
