Energy efficient redundant configurations for real-time parallel reliable servers

Abstract

Modular redundancy and temporal redundancy are traditional techniques for increasing system reliability. With technology advancements, slack time in a system can be used not only as temporal redundancy but also by energy management schemes to save energy. In this paper, we consider the combination of modular and temporal redundancy to achieve energy-efficient, reliable real-time service provided by multiple servers. We first propose an efficient adaptive parallel recovery scheme that processes service requests in parallel to increase the number of faults that can be tolerated, and thus system reliability. We then explore schemes that determine the optimal redundant configurations of the parallel servers, either to minimize system energy consumption for a given reliability goal or to maximize system reliability for a given energy budget. Our analysis shows that small requests, optimistic approaches, and parallel recovery favor lower levels of modular redundancy, while large requests, pessimistic approaches, and restricted serial recovery favor higher levels of modular redundancy.
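
To make the trade-off between redundancy level and a reliability goal concrete, the following is a minimal illustrative sketch, not the model analyzed in the paper. It assumes independent replicas that each produce a correct result with probability `r`, strict majority voting, no temporal recovery, and energy that grows with the number of replicas, so that the smallest level meeting the goal is also the cheapest. The function names and the candidate levels are hypothetical.

```python
from math import comb


def majority_reliability(n: int, r: float) -> float:
    """Probability that a strict majority of n independent replicas is correct.

    Assumption (not from the paper): each replica is correct with
    probability r, independently, and the result is decided by majority vote.
    """
    need = n // 2 + 1  # smallest number of correct replicas forming a majority
    return sum(comb(n, k) * r**k * (1 - r) ** (n - k) for k in range(need, n + 1))


def smallest_level(goal: float, r: float, levels=(1, 3, 5, 7)):
    """Return the lowest redundancy level whose reliability meets the goal.

    Assumption: energy increases with the number of replicas, so the first
    level that satisfies the goal is taken as the most energy-efficient one.
    Returns None when no candidate level reaches the goal.
    """
    for n in levels:
        if majority_reliability(n, r) >= goal:
            return n
    return None


if __name__ == "__main__":
    # With r = 0.9, a single server falls short of a 0.97 goal,
    # while triple modular redundancy reaches about 0.972.
    print(smallest_level(0.97, 0.9))
```

Under these assumptions, adding replicas only helps when `r` exceeds 0.5. The paper's analysis additionally accounts for slack time used for recovery and for energy management, which this sketch leaves out.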


Author information

Correspondence to Dakai Zhu.

About this article

Cite this article

Zhu, D., Melhem, R. & Mossé, D. Energy efficient redundant configurations for real-time parallel reliable servers. Real-Time Syst 41, 195–221 (2009). https://doi.org/10.1007/s11241-009-9067-8

Keywords

  • Energy management
  • Fault tolerance
  • Transient faults
  • Parallel servers