Global State-of-the-Art Overview

Catthoor, Francky; Raghavan, Praveen; Lambrechts, Andy; Jayapala, Murali; Kritikakou, Angeliki; Absar, Javed

doi:10.1007/978-90-481-9528-2_2

Abstract

The work presented in this book targets nomadic, battery-operated embedded systems. A large body of related work exists in this context. This chapter presents a structured overview of the relevant related work in embedded system design, which forms the broad context. The overview covers both the architectural and the related mapping aspects, and surveys the state of the art for the different components that form an embedded system. Specific related work, and comparisons with the individual contributions of this book, are presented in the respective chapters.

Author information

Authors and Affiliations

Interuniversity MicroElectronics Center IMEC, Kapeldreef 75, 3001, Leuven, Belgium
Prof. Dr. Francky Catthoor, Dr. Praveen Raghavan, Dr. Andy Lambrechts & Dr. Murali Jayapala
VLSI Design Lab, Univ. Patras, 26110, Rio, Patras, Greece
Eng. Angeliki Kritikakou
Samsung India Software Operations Pvt. Ltd, No. 66/1, Bagmane Tech Park C.V. Raman Nagar, Bangalore, 560 093, India
Dr. Javed Absar

Corresponding author

Correspondence to Francky Catthoor.

About this chapter

Cite this chapter

Catthoor, F., Raghavan, P., Lambrechts, A., Jayapala, M., Kritikakou, A., Absar, J. (2010). Global State-of-the-Art Overview. In: Ultra-Low Energy Domain-Specific Instruction-Set Processors. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-9528-2_2

DOI: https://doi.org/10.1007/978-90-481-9528-2_2
Published: 03 July 2010
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-9527-5
Online ISBN: 978-90-481-9528-2
eBook Packages: Engineering, Engineering (R0)