Low Power Engineering

  • Chapter
Embedded Systems Design

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 3436)

Abstract

Resource usage in embedded system platforms depends on application workload characteristics, desired quality of service, and environmental conditions. In general, system workload is highly non-stationary due to the heterogeneous nature of information content. Quality of service depends on user requirements, which may change over time. In addition, both workload and quality of service can be affected by environmental conditions such as network congestion and wireless link quality.



Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Bouyssounouse, B., Sifakis, J. (2005). Low Power Engineering. In: Embedded Systems Design. Lecture Notes in Computer Science, vol 3436. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31973-3_30

  • DOI: https://doi.org/10.1007/978-3-540-31973-3_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25107-1

  • Online ISBN: 978-3-540-31973-3

  • eBook Packages: Computer Science, Computer Science (R0)
