An architecture framework for an adaptive extensible processor

The Journal of Supercomputing

Abstract

To improve the performance of embedded processors, an effective technique is to collapse critical computation subgraphs into application-specific instruction set extensions and execute them on custom functional units. The problem with this approach is the immense cost and long time required to design a new processor for each application. As a solution, we propose an adaptive extensible processor in which custom instructions (CIs) are generated and added after chip fabrication. To support this feature, custom functional units are replaced by a reconfigurable matrix of functional units (FUs). A systematic quantitative approach is used to determine the appropriate structure of the reconfigurable functional unit (RFU). We also introduce an integrated framework for generating CIs that can be mapped onto the RFU. Using this architecture, performance is improved by a factor of up to 1.33, with an average factor of 1.16, compared to a 4-issue in-order RISC processor. By partitioning the configuration memory, detecting similar/subset CIs, and merging small CIs, the size of the configuration memory is reduced by 40%.



Author information

Corresponding author

Correspondence to Hamid Noori.

About this article

Cite this article

Noori, H., Mehdipour, F., Murakami, K. et al. An architecture framework for an adaptive extensible processor. J Supercomput 45, 313–340 (2008). https://doi.org/10.1007/s11227-008-0174-4

  • DOI: https://doi.org/10.1007/s11227-008-0174-4
