Abstract
The work presented in this book targets nomadic, battery-operated embedded systems, a context in which a large amount of related work exists. The goal of this chapter is to present a structured overview of the relevant related work in the design of embedded systems, which forms the broad context. The overview covers both the architectural aspects and the related mapping aspects, and it presents the state of the art for the different components that form an embedded system. Specific related work, and comparisons with the individual contributions of this book, are presented in the respective chapters.
Copyright information
© 2010 Springer Science+Business Media B.V.
About this chapter
Cite this chapter
Catthoor, F., Raghavan, P., Lambrechts, A., Jayapala, M., Kritikakou, A., Absar, J. (2010). Global State-of-the-Art Overview. In: Ultra-Low Energy Domain-Specific Instruction-Set Processors. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-9528-2_2
DOI: https://doi.org/10.1007/978-90-481-9528-2_2
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-9527-5
Online ISBN: 978-90-481-9528-2
eBook Packages: Engineering, Engineering (R0)