Efficient datapath merging for the overhead reduction of run-time reconfigurable systems

Abstract

High FPGA reconfiguration latency is a major source of overhead in run-time reconfigurable systems. This overhead can be reduced by merging multiple data flow graphs, each representing a different kernel of the original program, into a single merged datapath that is configured less often than separate datapaths would be. However, the additional hardware introduced by this technique increases the kernels' execution time. In this paper, we present a novel datapath merging technique that reduces both the configuration and execution times of kernels mapped onto the reconfigurable fabric. Experimental results show up to a 13% reduction in the configuration and execution times of kernels from Media-bench workloads, compared to prior work on datapath merging. Compared to conventional high-level synthesis algorithms, our proposal reduces the kernels' configuration and execution times by up to 48%.
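
As a rough illustration of why merging can shrink the hardware that has to be configured, the sketch below assumes a deliberately naive sharing rule, not the technique presented in this paper: kernels never execute concurrently, so functional units of the same operation type can be shared, and the merged datapath needs only the per-type maximum rather than the sum. The kernel contents and all names (`kernel_a`, `kernel_b`, `unit_demand`, `merge_demand`) are hypothetical. The sketch ignores interconnect and the multiplexers that sharing introduces, which is the kind of additional hardware the abstract says can lengthen execution time.

```python
from collections import Counter


def unit_demand(dfg):
    """Count how many functional units of each operation type a DFG needs.

    A DFG is modelled here (hypothetically) as a dict: node name -> op type.
    Edges are omitted because this toy model ignores interconnect.
    """
    return Counter(dfg.values())


def merge_demand(*dfgs):
    """Units needed by one merged datapath under naive same-type sharing.

    Assumption: only one kernel runs at a time, so for each op type the
    merged datapath needs the maximum demand of any single kernel.
    """
    merged = Counter()
    for dfg in dfgs:
        for op, count in unit_demand(dfg).items():
            merged[op] = max(merged[op], count)
    return merged


# Two made-up kernels, purely for illustration.
kernel_a = {"a1": "add", "a2": "mul", "a3": "add"}
kernel_b = {"b1": "mul", "b2": "mul", "b3": "sub"}

# Separate datapaths: every kernel brings its own units.
separate = unit_demand(kernel_a) + unit_demand(kernel_b)
# Merged datapath: same-type units are shared between kernels.
merged = merge_demand(kernel_a, kernel_b)

print("separate units:", sum(separate.values()))
print("merged units:  ", sum(merged.values()))
```

In this toy case the merged datapath needs fewer functional units than the two separate datapaths combined. A real merging algorithm also has to decide which specific nodes and edges to overlap, since each shared unit whose inputs differ between kernels needs multiplexers on those inputs.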

Corresponding author

Correspondence to Mahmood Fazlali.

About this article

Cite this article

Fazlali, M., Zakerolhosseini, A. & Gaydadjiev, G. Efficient datapath merging for the overhead reduction of run-time reconfigurable systems. J Supercomput 59, 636–657 (2012). https://doi.org/10.1007/s11227-010-0458-3

Keywords

  • Reconfigurable computing
  • Run-time reconfigurable systems
  • Datapath merging