The Design of a Dataflow Coprocessor for Low Power Embedded Hierarchical Processing

  • Yijun Liu
  • Steve Furber
  • Zhenkun Li
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4148)


Power consumption has become one of the most important concerns in the design of embedded processors; the power dissipation of microprocessors grows rapidly as advances in CMOS technology pack more transistors per unit area. However, the potential for further power savings in microprocessors with a conventional architecture is limited because of their unified architectures and mature low-power techniques. This paper proposes an alternative approach to saving power: embedding a dataflow coprocessor in a conventional RISC processor. The dataflow coprocessor is designed to execute short code segments, such as small loops, function calls and long equation evaluations, very efficiently. We demonstrate a factor of 7 improvement in power-efficiency over current general-purpose processors. Dataflow techniques are not new, but we apply the concept to address a new problem: improving the power-efficiency of conventional processors.


Keywords: Clock Cycle; Function Block; Hierarchical Processing; Embed Processor; Program Segment





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Yijun Liu (1)
  • Steve Furber (2)
  • Zhenkun Li (1)
  1. The Sensor Network Group, The Faculty of Computer, Guangdong University of Technology, Guangzhou, China
  2. The Advanced Processor Technologies Group, The School of Computer Science, The University of Manchester, Manchester, UK
