Abstract
An effective technique for improving the performance of embedded processors is to collapse critical computation subgraphs into application-specific instruction set extensions and execute them on custom functional units. The drawback of this approach is the immense cost and long time required to design a new processor for each application. To address this issue, we propose an adaptive extensible processor in which custom instructions (CIs) are generated and added after chip fabrication. To support this feature, the custom functional units are replaced by a reconfigurable matrix of functional units (FUs). A systematic quantitative approach is used to determine an appropriate structure for this reconfigurable functional unit (RFU). We also introduce an integrated framework for generating CIs that can be mapped onto the RFU. With this architecture, we obtain a speedup of up to 1.33, and 1.16 on average, over a 4-issue in-order RISC processor. By partitioning the configuration memory, detecting similar or subset CIs, and merging small CIs, the size of the configuration memory is reduced by 40%.
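To give a rough feel for how detecting similar or subset CIs can shrink the configuration memory, the following is a minimal sketch and not the authors' algorithm. It assumes, purely for illustration, that a CI configuration can be encoded as a set of (row, column, opcode) assignments on the FU matrix, and that a CI whose assignments are a subset of another CI's can reuse that stored entry. The names `Config` and `share_configurations` and the sample CIs are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical encoding (an assumption, not taken from the paper):
# a CI configuration is the set of (row, col, opcode) assignments
# it makes on the reconfigurable FU matrix.
Config = FrozenSet[Tuple[int, int, str]]


def share_configurations(cis: Dict[str, Config]) -> Dict[str, str]:
    """Map each CI name to the CI whose stored configuration it reuses."""
    # Visit larger configurations first so they become the stored entries;
    # ties are broken by name to keep the result deterministic.
    ordered = sorted(cis, key=lambda name: (-len(cis[name]), name))
    stored: List[str] = []
    owner: Dict[str, str] = {}
    for name in ordered:
        # An identical or subset configuration can reuse an existing entry.
        host = next((s for s in stored if cis[name] <= cis[s]), None)
        if host is None:
            stored.append(name)
            owner[name] = name
        else:
            owner[name] = host
    return owner


# Illustrative data only; these CIs are made up for the example.
cis: Dict[str, Config] = {
    "ci0": frozenset({(0, 0, "add"), (1, 0, "shl"), (1, 1, "and")}),
    "ci1": frozenset({(0, 0, "add"), (1, 0, "shl")}),  # subset of ci0
    "ci2": frozenset({(0, 0, "add"), (1, 0, "shl"), (1, 1, "and")}),  # same as ci0
    "ci3": frozenset({(0, 0, "sub"), (0, 1, "xor")}),  # unrelated
}
owner = share_configurations(cis)
stored_entries = len(set(owner.values()))
print(f"{len(cis)} CIs share {stored_entries} stored configurations")
```

In this toy input, four CIs need only two stored entries. A real RFU would also have to handle the unused FUs when a subset CI reuses a larger configuration; that detail is outside the scope of this sketch.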
Cite this article
Noori, H., Mehdipour, F., Murakami, K. et al. An architecture framework for an adaptive extensible processor. J Supercomput 45, 313–340 (2008). https://doi.org/10.1007/s11227-008-0174-4
DOI: https://doi.org/10.1007/s11227-008-0174-4