Design Exploration Methodology for Microprocessor and HW Accelerators

  • Angeliki Kritikakou
  • Francky Catthoor
  • Costas Goutis


Embedded systems usually have hard real-time constraints, which require custom HW designs. Although custom HW designs improve performance, they have a high design cost and very limited flexibility, even when they are made partly configurable. SW designs provide the flexibility required for a wide range of applications, at the cost of reduced performance.



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Angeliki Kritikakou (1)
  • Francky Catthoor (2)
  • Costas Goutis (3)

  1. University of Patras, Piraeus, Greece
  2. IMEC, Leuven, Belgium
  3. University of Patras, Patras, Greece
