The Journal of Supercomputing, Volume 59, Issue 2, pp 636–657

Efficient datapath merging for the overhead reduction of run-time reconfigurable systems

  • Mahmood Fazlali
  • Ali Zakerolhosseini
  • Georgi Gaydadjiev


High FPGA reconfiguration latency is a major overhead in run-time reconfigurable systems. This overhead can be reduced by merging multiple data flow graphs, each representing a different kernel of the original program, into a single merged datapath that is configured less often than the separate datapaths would be. However, the additional hardware introduced by this technique increases the kernels' execution time. In this paper, we present a novel datapath merging technique that reduces both the configuration and execution times of kernels mapped on the reconfigurable fabric. Experimental results show up to a 13% reduction in the configuration and execution times of kernels from MediaBench workloads, compared to prior work on datapath merging. Compared to conventional high-level synthesis algorithms, our proposal reduces the kernels' configuration and execution times by up to 48%.
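
The resource-sharing idea behind datapath merging can be illustrated with a minimal sketch. This is not the technique proposed in the paper: it assumes each kernel's data flow graph is reduced to a list of operation types (edges are ignored), and the names `merge_by_type`, `kernel_1`, and `kernel_2` are hypothetical. It only counts how many functional units a naive per-type merge would need compared with keeping the datapaths separate.

```python
from collections import Counter


def merge_by_type(dfg_a, dfg_b):
    """Naive merge: share functional units of the same type between two kernels.

    Each DFG is given only as a list of operation types (an assumption made
    for illustration), so this counts functional units and does not model
    interconnect or any extra hardware needed to share a unit.
    """
    a, b = Counter(dfg_a), Counter(dfg_b)
    # Counter union keeps the per-type maximum: a unit of a given type
    # can serve whichever kernel is currently executing.
    merged = a | b
    separate_units = sum(a.values()) + sum(b.values())
    merged_units = sum(merged.values())
    return merged, separate_units, merged_units


# Two hypothetical kernels described by their operation types.
kernel_1 = ["add", "add", "mul", "sub"]
kernel_2 = ["add", "mul", "mul", "shift"]

merged, separate_units, merged_units = merge_by_type(kernel_1, kernel_2)
print(dict(merged), separate_units, merged_units)
```

In this toy case the merged datapath needs 6 functional units instead of 8, and a single configuration can serve both kernels. A real merge must also match graph edges, because a shared unit that receives inputs from different sources needs extra selection hardware, which is the kind of overhead the abstract says increases execution time.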


Keywords: Reconfigurable computing; Run-time reconfigurable systems; Datapath merging





Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Mahmood Fazlali (1, 2), corresponding author
  • Ali Zakerolhosseini (1)
  • Georgi Gaydadjiev (2)

  1. Department of Computer Engineering, Shahid Beheshti University G.C., Teheran, Iran
  2. Computer Engineering Lab., Delft University of Technology, Delft, The Netherlands
