An efficient framework for dynamic reconfiguration of instruction-set customization

Abstract

We present an efficient framework for dynamic reconfiguration of application-specific custom instructions. A key component of this framework is an iterative algorithm for temporal and spatial partitioning of the loop kernels. Our algorithm maximizes the performance gain of an application while accounting for the dynamic reconfiguration cost. It selects appropriate custom instructions for the loops and groups them into one or more configurations. We model the temporal partitioning problem as a k-way graph partitioning problem and solve the spatial partitioning problem with dynamic programming. Comprehensive experimental results indicate that our iterative partitioning algorithm is highly scalable while producing optimal or near-optimal (99% of the optimal) performance gain.
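
The abstract does not spell out the dynamic programming formulation, so the following is only an illustrative sketch of how spatial partitioning within a single configuration could be posed, assuming a multiple-choice knapsack model: each loop offers a few candidate custom-instruction versions with a hardware area and a performance gain, and at most one version per loop is chosen under an area budget. The function name `spatial_partition`, the `(area, gain)` tuples, and the sample numbers are all hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple

# Hypothetical model: each loop has candidate versions given as (area, gain).
# Leaving a loop in software (area 0, gain 0) is always an implicit option.
Version = Tuple[int, int]


def spatial_partition(
    loops: List[List[Version]], area_budget: int
) -> Tuple[int, List[Optional[int]]]:
    """Choose at most one version per loop to maximize total gain.

    Returns (best_gain, selected), where selected[i] is the index of the
    chosen version for loop i, or None if the loop stays in software.
    """
    # best[a] = maximum gain over the loops seen so far using area <= a
    best = [0] * (area_budget + 1)
    choice: List[List[int]] = []  # choice[i][a] = version index, -1 = software

    for versions in loops:
        new_best = best[:]                   # default: keep this loop in software
        picked = [-1] * (area_budget + 1)
        for a in range(area_budget + 1):
            for v, (area, gain) in enumerate(versions):
                if area <= a and best[a - area] + gain > new_best[a]:
                    new_best[a] = best[a - area] + gain
                    picked[a] = v
        best = new_best
        choice.append(picked)

    # Walk backwards through the decision tables to recover the selection.
    selected: List[Optional[int]] = [None] * len(loops)
    a = area_budget
    for i in range(len(loops) - 1, -1, -1):
        v = choice[i][a]
        if v != -1:
            selected[i] = v
            a -= loops[i][v][0]
    return best[area_budget], selected


if __name__ == "__main__":
    # Three loops with made-up (area, gain) candidates and an area budget of 6.
    loops = [[(2, 10), (4, 18)], [(3, 12)], [(1, 3), (5, 20)]]
    print(spatial_partition(loops, 6))  # -> (25, [0, 0, 0])
```

In this toy instance the largest single version (area 5, gain 20) is not part of the best selection; three smaller versions fill the budget exactly and yield a higher total gain. The sketch ignores reconfiguration cost and temporal partitioning entirely, both of which the framework described above takes into account.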

Author information

Corresponding author

Correspondence to Huynh Phung Huynh.

Additional information

A preliminary version of this paper appeared in the proceedings of the 7th ACM/IEEE International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) 2007.

About this article

Cite this article

Huynh, H.P., Sim, J.E. & Mitra, T. An efficient framework for dynamic reconfiguration of instruction-set customization. Des Autom Embed Syst 13, 91–113 (2009). https://doi.org/10.1007/s10617-008-9035-x

Keywords

  • Customizable processors
  • Instruction-set extensions
  • Dynamic reconfiguration
  • Temporal partitioning
  • Runtime reconfiguration
  • Custom instructions