A retargetable framework for compiler/architecture co-development

Abstract

Compiler-in-the-Loop (CiL) architecture exploration is widely accepted as the right approach for fast development of Application Specific Instruction-set Processors (ASIPs). In this context, both automatic application-specific Instruction Set Extension (ISE) and code generation by a compiler have received considerable attention. Together, the two techniques enable processor designers to quickly adapt a processor’s Instruction Set Architecture (ISA) to the needs of a given set of applications and to provide an appropriate high-level programming model. This manuscript presents a tool flow for the automated identification and utilization of Custom Instructions (CIs) during architecture exploration. By embedding this tool flow in an industry-proven architecture exploration framework, a methodology for simultaneous compiler/architecture co-exploration is derived. The advantage of the presented tool flow lies in its ability to develop a reusable ISA and an appropriate compiler for a set of applications, and thereby to support the design of programmable architectures. In addition, ASIP architecture exploration becomes more efficient because time-consuming application analysis and compiler retargeting are automated. Compiling and simulating several benchmarks for the extended ISAs provides reliable feedback on speedup, code size and usability of the identified CIs. Furthermore, area results for the extended ISAs are presented in order to compare the obtained speedup with the hardware effort invested in the new CIs.
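
To illustrate how a compiler can utilize a custom instruction once it has been identified, the following minimal Python sketch covers an expression tree with the cheapest sequence of instructions, first for a base ISA and then for an ISA extended by a multiply-accumulate CI. It is not the tool flow presented in the manuscript: the rule format, the instruction names (ADD, MUL, MAC), the one-cycle costs, and the functions match and select are all assumptions made for illustration, and the covering is a simple recursive minimum-cost tree cover rather than the paper's code selector.

```python
# Hypothetical rule format: (instruction name, pattern, cost in cycles).
# A pattern is a nested tuple (op, child, ...); "_" matches any subtree.
BASE_ISA = [
    ("ADD", ("add", "_", "_"), 1),
    ("MUL", ("mul", "_", "_"), 1),
]
# Assumed custom instruction: multiply-accumulate covers add(mul(x, y), z).
EXTENDED_ISA = BASE_ISA + [
    ("MAC", ("add", ("mul", "_", "_"), "_"), 1),
]


def match(pattern, tree):
    """Return the subtrees bound to wildcards, or None if there is no match."""
    if pattern == "_":
        return [tree]
    if isinstance(tree, str) or isinstance(pattern, str):
        return None
    if pattern[0] != tree[0] or len(pattern) != len(tree):
        return None
    bound = []
    for sub_pattern, sub_tree in zip(pattern[1:], tree[1:]):
        sub_bound = match(sub_pattern, sub_tree)
        if sub_bound is None:
            return None
        bound.extend(sub_bound)
    return bound


def select(tree, rules):
    """Return (total cost, instruction list) of the cheapest cover of tree."""
    if isinstance(tree, str):
        return 0, []  # leaf operands are assumed to be in registers already
    best = None
    for name, pattern, cost in rules:
        bound = match(pattern, tree)
        if bound is None:
            continue
        total, code = cost, []
        for sub_tree in bound:
            sub_cost, sub_code = select(sub_tree, rules)
            total += sub_cost
            code += sub_code
        code.append(name)
        if best is None or total < best[0]:
            best = (total, code)
    if best is None:
        raise ValueError(f"no rule covers {tree!r}")
    return best


# a*b + c*d as an expression tree
expr = ("add", ("mul", "a", "b"), ("mul", "c", "d"))
base_cost, base_code = select(expr, BASE_ISA)
ext_cost, ext_code = select(expr, EXTENDED_ISA)
print(base_code, ext_code, base_cost / ext_cost)
```

For the expression a*b + c*d, the base ISA needs three instructions while the extended ISA needs two, so under the assumed unit costs the custom instruction gives a speedup of 1.5 for this tree.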

References

  1. aiSee Graph Description Language (GDL) (2008). http://www.aisee.com/gdl/nutshell

  2. Graph Matching Library: VFLib (2008). http://amalfi.dis.unina.it/graph/db/vflib-2.0

  3. Partitioned Boolean Quadratic Programming (PBQP) Solver (2008). http://www.it.usyd.edu.au/~scholz/pbqp.html

  4. An Infrastructure for Research in Instruction-Level Parallelism (2009). http://www.trimaran.com

  5. ACE Associated Compiler Experts bv (2009) The COSY Compiler Development System. http://www.ace.nl

  6. Aditya S, Kathail V, Rau B (1999) Elcor’s machine description system: Version 3.0. Tech rep, Hewlett-Packard Company

  7. Aho AV, Ganapathi M, Tjiang SWK (1989) Code generation using tree pattern matching and dynamic programming. ACM Trans Program Lang Syst 11(4):491–516

  8. Araujo G, Malik S, Lee M (1996) Using register transfer paths in code generation for heterogeneous memory register architectures. In: Proc of the design automation conference (DAC), pp 591–596

  9. International ARC (2008) ARCtangent Processor. http://www.arc.com

  10. Arnold M, Corporaal H (2001) Designing domain-specific processors. In: Proc of the conference on hardware/software codesign (CODES-ISSS), pp 61–66

  11. Atasu K, Duendar G, Oezturan C (2005) An integer linear programming approach for identifying instruction set extensions. In: Proc of the conference on hardware/software codesign (CODES-ISSS), pp 172–177

  12. Baleani M, Gennari F, Jiang Y, Patel Y, Brayton RK, Sangiovanni-Vincentelli A (2002) HW/SW partitioning and code generation of embedded control applications on a reconfigurable architecture platform. In: Proc of the conference on hardware/software codesign (CODES-ISSS), pp 151–156

  13. Biswas P, Dutt N, Ienne P, Pozzi L (2006) Automatic identification of application-specific functional units with architecturally visible storage. In: Proc of the conference on design, automation & test in Europe (DATE), pp 1–6

  14. Bonzini P, Pozzi L (2006) Code transformation strategies for extensible embedded processors. In: Proc of the conf on compilers, architectures and synthesis for embedded systems (CASES), pp 242–252

  15. Bonzini P, Pozzi L (2007) A retargetable framework for automated discovery of custom instructions. In: Proc of the conference on application specific systems, architectures, and processors (ASAP), pp 334–341

  16. Bonzini P, Pozzi L (2007) Polynomial-time subgraph enumeration for automated instruction set extension. In: Proc of the conference on design, automation & test in Europe (DATE), pp 1331–1336

  17. Bonzini P, Pozzi L (2008) Recurrence-aware instruction set selection for extensible processors. IEEE Trans Very Large Scale Integr (VLSI) Syst 16(10):1259–1267

  18. Brisk P, Kaplan A, Kastner R, Sarrafzadeh M (2002) Instruction generation and regularity extraction for reconfigurable processors. In: Proc of the conf on compilers, architectures and synthesis for embedded systems (CASES), pp 262–269

  19. Brisk P, Verma AK, Ienne P (2007) Rethinking custom ISE identification: a new processor-agnostic method. In: Proc of the conf on compilers, architectures and synthesis for embedded systems (CASES), pp 125–134

  20. Chang P, Mahlke S, Chen W, Warter N, Hwu W (1991) IMPACT: An architectural framework for multiple-instruction-issue processors. ACM Comput Archit News (SIGARCH) 19(3):266–275

  21. Chen X, Maskell DL, Sun Y (2007) Fast identification of custom instructions for extensible processors. IEEE Trans Comput-Aided Des 26(2):359–368

  22. Clark N, Hormati A, Mahlke S (2006) Scalable subgraph mapping for acyclic computation accelerators. In: Proc of the conf on compilers, architectures and synthesis for embedded systems (CASES), pp 147–157

  23. Clark N, Zhong H, Mahlke S (2003) Processor acceleration through automated instruction set customisation. In: Proc of the symposium on microarchitecture, p 129

  24. Clark N, Zhong H, Mahlke S (2005) Automated custom instruction generation for domain-specific processor acceleration. IEEE Trans Comput 54(10):1258–1270

  25. Cong J, Fan Y, Han G, Zhang Z (2004) Application-specific instruction generation for configurable processors. In: Proc of the symposium on field programmable gate arrays (FPGA), pp 183–189

  26. Cordella LP, Foggia P, Sansone C, Vento M (2004) A (sub)graph isomorphism algorithm for matching large graphs. IEEE Trans Pattern Anal Mach Intell 26(10):1367–1372

  27. Day JD, Zimmermann H (1983) The OSI reference model. Proc IEEE 71(12)

  28. Eckstein E, Koenig O, Scholz B (2003) Code instruction selection based on SSA graphs. In: Proc of the workshop on software and compilers for embedded systems (SCOPES), pp 49–65

  29. Ertl MA (1999) Optimal code selection in DAGs. In: Proc of the symposium on principles of programming languages (POPL ’99), pp 242–249

  30. Fauth A (1995) Beyond tool-specific machine descriptions. In: Marwedel P, Goossens G (eds) Code generation for embedded processors. Kluwer Academic, Amsterdam

  31. Fauth A, Knoll A (1993) Automatic generation of DSP program development tools using a machine description formalism. In: Proc of the conf on acoustics, speech and signal processing (ICASSP)

  32. Fraser CW, Hanson DR, Proebsting TA (1992) Engineering efficient code generators using tree matching and dynamic programming. Tech rep TR-386-92

  33. Freericks M (1993) The nML machine description formalism. Tech rep, Technical University of Berlin, Department of Computer Science

  34. Freericks M, Fauth A, Knoll A (1994) Implementation of complex DSP systems using high-level design tools. In: Signal Processing: VI theories and applications

  35. Galuzzi C, Bertels K (2008) The instruction-set extension problem: a survey. In: Workshop on applied reconfigurable computing (ARC). Springer, Heidelberg, pp 209–220

  36. Galuzzi C, Panainte EM, Yankova Y, Bertels K, Vassiliadis S (2006) Automatic selection of application-specific instruction-set extensions. In: Proc of the conference on hardware/software codesign (CODES-ISSS), pp 256–261

  37. Garey MR, Johnson DS (1990) Computers and intractability; a guide to the theory of NP-completeness. Freeman, New York. ISBN: 0716710455

  38. Geurts W et al. (1996) Design of DSP systems with chess/checkers. In: Proc of 2nd Int workshop on code generation for embedded processors

  39. Gonzales R (2000) Xtensa: a configurable and extensible processor. IEEE Micro 20(2):60–70

  40. Goodwin D, Petkov D (2003) Automatic generation of application specific processors. In: Proc of the conf on compilers, architectures and synthesis for embedded systems (CASES), pp 137–147

  41. Gries M, Keutzer K (2005) Building ASIPs: the MESCAL methodology. Springer, Berlin. ISBN: 0-387-26057-9

  42. Grun P, Halambi A, Khare A, Ganesh V, Dutt N, Nicolau A (1998) EXPRESSION: an ADL for system level design exploration. Tech rep 98-29, Department of Information and Computer Science, University of California, Irvine, Sep

  43. Gupta R (1992) Generalized dominators and postdominators. In: Proc of the symposium on principles of programming languages (POPL), pp 246–257

  44. Gyllenhaal J, Rau B, Hwu W (1996) HMDES version 2.0 specification. Tech rep, IMPACT Research Group, Univ of Illinois

  45. Halambi A, Shrivastava A, Dutt N, Nicolau A (2001) A customizable compiler framework for embedded systems. In: Proc of the workshop on software and compilers for embedded systems (SCOPES)

  46. Halambi A, Grun P, Ganesh V, Khare A, Dutt N, Nicolau A (1999) EXPRESSION: A language for architecture exploration through compiler/simulator retargetability. In: Proc of the conference on design, automation & test in Europe (DATE)

  47. Hoffmann A, Kogel T, Nohl A, Braun G, Schliebusch O, Wahlen O, Wieferink A, Meyr H (2001) A novel methodology for the design of application specific instruction set processors (ASIP) using a machine description language. IEEE Trans Comput-Aided Des 20(11):1338–1354

  48. Hoffmann A, Meyr H, Leupers R (2003) Architecture exploration for embedded processors with LISA. Kluwer Academic, Amsterdam. ISBN:1-4020-73380

  49. Hohenauer M, Scharwaechter H, Karuri K, Wahlen O, Kogel T, Leupers R, Ascheid G, Meyr H (2004) A methodology and tool suite for C compiler generation from ADL models. In: Proc of the conference on design, automation & test in Europe (DATE)

  50. Itoh M, Takeuchi Y, Imai M, Shiomi A (2000) Synthesizable HDL generation for pipelined processors from a micro-operation description. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

  51. Khare A (1999) SIMPRESS: a simulator generation environment for system-on-chip exploration. Tech rep, Department of Information and Computer Science, University of California, Irvine, Sep

  52. Kobayashi S, Takeuchi Y, Kitajima A, Imai M (2001) Compiler generation in PEAS-III: an ASIP development system. In: Workshop on software and compilers for embedded processors (SCOPES)

  53. Koes DR, Goldstein SC (2008) Near optimal instruction selection on DAGs. In: International symposium on code generation and optimization (CGO)

  54. Lanneer D, Praet JV, Kifli A, Schoofs K, Geurts W, Thoen F, Goossens G (1995) Chess: retargetable code generation for embedded DSP processors

  55. Leupers R, Karuri K, Kraemer S, Pandey M (2006) A design flow for configurable embedded processors based on optimized instruction set synthesis. In: Proc of the conference on design, automation & test in Europe (DATE), pp 581–586

  56. Leupers R, Marwedel P (1996) Instruction selection for embedded DSPs with complex instructions. In: Proc of the European conference on design automation (EDAC), pp 200–205

  57. Liao S, Devadas S, Keutzer K, Tjiang S (1995) Instruction selection using binate covering for code size optimization. In: Proc of the conf on computer aided design (ICCAD), pp 393–399

  58. Liem C, May T, Paulin P (1994) Instruction-set matching and selection for DSP and ASIP code generation. In: Proc of the European design and test conference (ED & TC), pp 31–37

  59. Mishra P, Dutt N, Nicolau A (2001) Functional abstraction driven design space exploration of heterogeneous programmable architectures. In: Proc of the symposium on system synthesis (ISSS), pp 256–261

  60. Nilsson NJ (1982) Principles of artificial intelligence. Springer, Berlin. ISBN-13: 9783540113409

  61. Nohl A, Braun G, Schliebusch O, Leupers R, Meyr H (2002) A universal technique for fast and flexible instruction-set architecture simulation. In: Proc of the design automation conference (DAC), pp 22–27

  62. Peymandoust A, Pozzi L, Ienne P, De Micheli G (2003) Automatic instruction set extension and utilization for embedded processors. In: Proc of the conference on application specific systems, architectures, and processors (ASAP), pp 108–118

  63. Pozzi L, Atasu K, Ienne P (2006) Exact and approximate algorithms for the extension of embedded processor instruction sets. IEEE Trans Comput-Aided Des 25(7):1209–1229

  64. Pozzi L, Vuletic M, Ienne P (2002) Automatic topology-based identification of instruction-set extensions for embedded processors. In: Proc of the conference on design, automation & test in Europe (DATE)

  65. Sakai S, Togasaki M, Yamazaki K (2003) A note on greedy algorithms for the maximum weighted independent set problem. Discrete Appl Math 126:313–322

  66. Scharwaechter H, Youn JM, Leupers R, Paek Y, Ascheid G, Meyr H (2007) A code generator generator for multi-output instructions. In: Proc of the conference on hardware/software codesign (CODES-ISSS)

  67. Schliebusch O, Meyr H, Leupers R (2007) Optimized ASIP synthesis from architecture description language models. Springer, Berlin

  68. Scholz B, Eckstein E (2002) Register allocation for irregular architectures. In: Proc of the workshop on software and compilers for embedded systems (SCOPES), pp 129–148

  69. Scholz B, Eckstein E (2003) Address mode selection. In: Proc of the symposium on code generation and optimization (CGO), pp 337–346

  70. Scholz B, Eckstein E (2006) Partitioned Boolean quadratic programming (PBQP). Tech rep, University of Sydney: School of Information Technologies, Jan

  71. Shah N, Keutzer K (2002) Network processors: origin of species. In: Proc of the Int symposium of computer and information science

  72. Synopsys Inc (2011) Synopsys processor designer. Synopsys Inc. http://www.synopsys.com/Systems/BlockDesign/ProcessorDev/Pages/default.aspx

  73. Teich J, Weper R (2000) A joined architecture/compiler design environment for ASIPs. In: Proc of the conf on compilers, architectures and synthesis for embedded systems (CASES)

  74. Teich J, Weper R, Fischer D, Trinkert S (2000) BUILDABONG: a rapid prototyping environment for ASIPs. In: Proc of the DSP Germany (DSPD)

  75. Tensilica (2009) Xtensa configurable processors. http://www.tensilica.com

  76. Tjiang SWK (1993) An Olive Twig. Tech rep, Synopsys Inc

  77. Ullmann JR (1976) An algorithm for subgraph isomorphism. J ACM 23(1):31–42

  78. Verma AK, Brisk P, Ienne P (2008) Fast, quasi-optimal, and pipelined instruction-set extensions. In: Proc of the conference on design, automation & test in Europe (DATE), pp 334–339

  79. Yang F (1999) ESP: a 10-year retrospective. In: Proc of the embedded systems programming conference

  80. Yu P, Mitra T (2004) Scalable custom instructions identification for instruction set extensible processors. In: Proc of the conf on compilers, architectures and synthesis for embedded systems (CASES), pp 69–78

  81. Yu P, Mitra T (2007) Disjoint pattern enumeration for custom instruction identification. In: Proc of the conf on field programmable logic and applications (FPL), pp 273–278

Author information

Corresponding author

Correspondence to Hanno Scharwaechter.

Additional information

Extension of Conference Paper: An earlier version [66] of this paper appeared in the proceedings of the 5th IEEE/ACM international conference on hardware/software codesign and system synthesis. That version introduces a code generator named CBurg, which is now used to implement a code-selector engine for the CoSy compiler system from ACE. In addition, a methodology for recurrence-aware identification of custom instructions is presented that builds on the data flow graphs of the compiler’s intermediate representation. The methodology also produces a code-generator description, which is used to retarget the compiler backend to a new instruction set.

About this article

Cite this article

Scharwaechter, H., Kammler, D., Leupers, R. et al. A retargetable framework for compiler/architecture co-development. Des Autom Embed Syst 15, 311–342 (2011). https://doi.org/10.1007/s10617-011-9080-8

Keywords

  • Compiler/architecture co-design
  • Code-selection
  • Instruction Set Extension (ISE)
  • Application Specific Instruction-set Processors (ASIP)