Leakage-Aware Modulo Scheduling for Embedded VLIW Processors

  • Yong Guan
  • Jingling Xue


As semiconductor technologies scale down to the nanometer range, leakage power has become a significant component of total power consumption. In this paper, we present a leakage-aware modulo scheduling algorithm that saves leakage energy for loop-based applications on Very Long Instruction Word (VLIW) architectures. The proposed algorithm is designed to maximize the idleness of functional units built with dual-threshold domino logic and to reduce the number of transitions between the active and sleep modes. We have implemented our technique in the Trimaran compiler and conducted experiments using a set of embedded benchmarks from DSPstone and MiBench on Trimaran's cycle-accurate VLIW simulator. The results show that our technique achieves significant leakage energy savings compared with a previously published leakage-aware scheduling algorithm based on directed acyclic graphs (DAGs).


leakage power; very long instruction word (VLIW); software pipelining; modulo scheduling




Copyright information

© Springer 2011

Authors and Affiliations

  1. College of Information Engineering, Capital Normal University, Beijing, China
  2. Programming Languages and Compilers Group, School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
