Design Automation for Embedded Systems, Volume 17, Issue 1, pp 27–51

Implementation and validation of architectural space exploration techniques for domain-specific reconfigurable computing



Domain-specific coarse-grained reconfigurable architectures (CGRAs) hold great promise for energy-efficient, flexible designs that serve a suite of applications. Designing such a reconfigurable device for an application domain is challenging because the needs of different applications must be carefully balanced to achieve the targeted design goals, which requires evaluating many potential architectural options to select an optimal solution. Exploring the design space manually is very time consuming and may not even be feasible for very large designs. Mapping a single algorithm onto a customized architecture can take from minutes to hours, and running a full power simulation on a complete suite of benchmarks for various architectural options requires several days, so finding the optimal point in a design space can take a very long time. We have designed a framework that makes such design space exploration (DSE) feasible: it allows a family of algorithms and architectural options to be tested in minutes rather than days, enabling rapid selection of architectural choices. In this paper, we describe our DSE framework for domain-specific reconfigurable computing, in which the needs of the application domain drive the construction of the device architecture. The framework has been developed to automate design space case studies, allowing application developers to explore architectural tradeoffs efficiently and reach solutions quickly. For our case studies, we selected several core signal processing benchmarks from the MediaBench benchmark suite and several edge-detection benchmarks from the image processing domain. We describe two search algorithms: a stepped search algorithm motivated by our manual design studies and a more traditional gradient-based optimization. Approximate energy models are developed in each case to guide the search toward a minimal energy solution.
We validate our search results by comparing the architectural solutions selected by our tool to a manually optimized architecture and by performing sensitivity tests to evaluate the ability of our algorithms to find good-quality minima in the design space. All selected fabric architectures were synthesized using a 130 nm cell-based ASIC fabrication process from IBM. These architectures consume almost the same amount of energy on average, but the gradient-based approach is more general and promises to extend well to new problem domains. We expect these or similar heuristics, and the overall design flow of the system, to be useful for a wide range of architectures, including mesh-based and other commonly used CGRA architectures.
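The abstract does not give the details of either search algorithm, so the following is only a toy sketch of the general idea of energy-guided design space exploration. Everything in it is an assumption made for illustration: the parameter names (`alu_width`, `mux_cardinality`, `rows`), their candidate values, and the `toy_energy` function are hypothetical and are not the paper's architecture parameters or energy models. The `stepped_search` function fixes one parameter at a time in a chosen order, and `neighbor_descent` is a simple discrete steepest-descent walk that stands in for a gradient-based optimizer.

```python
# Hypothetical design space: parameter name -> ordered candidate values.
DESIGN_SPACE = {
    "alu_width": [8, 16, 32],
    "mux_cardinality": [2, 4, 8],
    "rows": [4, 8, 16],
}


def toy_energy(cfg):
    """Made-up, separable energy proxy (NOT the paper's energy model)."""
    return (
        (cfg["alu_width"] - 16) ** 2
        + 3 * (cfg["mux_cardinality"] - 4) ** 2
        + 2 * (cfg["rows"] - 8) ** 2
        + 100
    )


def stepped_search(space, energy, order):
    """Decide one parameter at a time, in the given order.

    Each step tries every candidate value of one parameter while the
    others stay fixed, keeps the lowest-energy value, and moves on.
    """
    cfg = {name: values[0] for name, values in space.items()}
    for name in order:
        cfg[name] = min(space[name], key=lambda v: energy({**cfg, name: v}))
    return cfg


def neighbor_descent(space, energy, start):
    """Discrete steepest descent, a stand-in for gradient-based search.

    From the current point, evaluate every configuration that differs by
    one step in one parameter and move to the best one; stop when no
    neighbor lowers the energy (a local minimum).
    """
    cfg = dict(start)
    while True:
        best, best_e = None, energy(cfg)
        for name, values in space.items():
            i = values.index(cfg[name])
            for j in (i - 1, i + 1):
                if 0 <= j < len(values):
                    cand = {**cfg, name: values[j]}
                    e = energy(cand)
                    if e < best_e:
                        best, best_e = cand, e
        if best is None:
            return cfg
        cfg = best


if __name__ == "__main__":
    start = {name: values[0] for name, values in DESIGN_SPACE.items()}
    print(stepped_search(DESIGN_SPACE, toy_energy, list(DESIGN_SPACE)))
    print(neighbor_descent(DESIGN_SPACE, toy_energy, start))
```

On this deliberately simple energy surface both searches reach the same configuration. With a real energy model, where parameters interact, the result of a stepped search can depend on the order in which parameters are fixed, and a descent-style search can stop at a local minimum, which is the kind of behavior the sensitivity tests mentioned above are meant to probe.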


Domain specific reconfigurable computing; Coarse-grained reconfigurable architectures; Design space exploration



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. University of North Texas, Denton, USA
  2. University of Pittsburgh, Pittsburgh, USA
