Compiler assisted architectural exploration framework for coarse grained reconfigurable arrays

Dimitroulakos, Grigorios; Kostaras, Nikos; Galanis, Michalis D.; Goutis, Costas E.

doi:10.1007/s11227-008-0208-y


Published: 16 May 2008

The Journal of Supercomputing, Volume 48, pages 115–151 (2009)

Grigorios Dimitroulakos¹,
Nikos Kostaras²,
Michalis D. Galanis¹ &
Costas E. Goutis¹


Abstract

Coarse Grained Reconfigurable Array (CGRA) architectures have been used extensively to accelerate time-consuming loops. Designing such systems requires a good balance between the capabilities of the architecture and the characteristics of the loops. A reliable design is characterized by an optimized cost-performance trade-off. The main goal of this paper is to present an exploration framework that automates the evaluation of CGRA architectures. Specifically, the framework helps the designer identify CGRA architectures tuned toward a specific application domain. The process is assisted by (1) an optimized retargetable compiler based on modulo scheduling and (2) the Synopsys Design Compiler, which provides realization metrics such as area and clock frequency. Both operate on the description of a parametric CGRA architecture template that can instantiate a wide variety of such architectures. Many studies to date suggest that clock frequency influences performance; however, none of them examines the impact of the architecture on clock frequency and performance. Our work is the first to study area, clock frequency, instructions per cycle, and performance in a unified way, so that architectures with a good compromise between cost and performance can be identified. Another objective of the paper is to present the advances made to the compiler approach used by the exploration framework: a new, more effective priority scheme is proposed, and the modulo scheduler has been equipped with backtracking capability. The experiments outline the algorithm’s efficiency and scalability for a given set of DSP benchmarks. Moreover, architectures optimized with respect to the cost-performance trade-off have been identified by an exploration over 72 CGRA architecture alternatives.
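The interplay the abstract describes between area, clock frequency, instructions per cycle (IPC) and performance can be illustrated with a small sketch. It is not the authors' framework: the `DesignPoint` type, the candidate names and every number below are hypothetical, and the selection rule (a plain Pareto filter over area and execution time) is an assumption chosen only to show why a higher IPC alone does not guarantee a better design once clock frequency is taken into account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignPoint:
    """One hypothetical CGRA candidate; all values are illustrative."""
    name: str
    area_mm2: float    # cost metric, e.g. as reported by a synthesis tool
    clock_mhz: float   # achievable clock frequency
    ipc: float         # instructions per cycle reached by the compiler

    def exec_time_us(self, num_ops: int) -> float:
        # cycles = operations / IPC; time = cycles / frequency.
        # Dividing cycles by MHz yields microseconds.
        cycles = num_ops / self.ipc
        return cycles / self.clock_mhz


def pareto_front(points, num_ops):
    """Keep the points that no other point beats on both area and time."""
    front = []
    for p in points:
        t_p = p.exec_time_us(num_ops)
        dominated = any(
            q.area_mm2 <= p.area_mm2
            and q.exec_time_us(num_ops) <= t_p
            and (q.area_mm2 < p.area_mm2 or q.exec_time_us(num_ops) < t_p)
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical candidates: the "rich" variants reach a higher IPC but
# pay for it with a larger area and a slower clock.
candidates = [
    DesignPoint("4x4_simple", area_mm2=1.0, clock_mhz=200.0, ipc=4.0),
    DesignPoint("4x4_rich", area_mm2=1.5, clock_mhz=150.0, ipc=5.0),
    DesignPoint("8x8_simple", area_mm2=3.0, clock_mhz=180.0, ipc=10.0),
    DesignPoint("8x8_rich", area_mm2=4.0, clock_mhz=100.0, ipc=12.0),
]

best = pareto_front(candidates, num_ops=12_000)
print([p.name for p in best])
```

With these made-up numbers, both "rich" variants are dominated: their IPC gain is outweighed by the lower clock frequency, so they are both larger and slower than the simpler arrays of the same size. This is the kind of effect that only shows up when the four metrics are evaluated together.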



Author information

Authors and Affiliations

VLSI Design Laboratory, Electrical and Computer Engineering Department, University of Patras, Patras, Greece
Grigorios Dimitroulakos, Michalis D. Galanis & Costas E. Goutis
Department of Computer Engineering and Informatics, University of Patras, Patras, Greece
Nikos Kostaras


Corresponding author

Correspondence to Grigorios Dimitroulakos.


About this article

Cite this article

Dimitroulakos, G., Kostaras, N., Galanis, M.D. et al. Compiler assisted architectural exploration framework for coarse grained reconfigurable arrays. J Supercomput 48, 115–151 (2009). https://doi.org/10.1007/s11227-008-0208-y


Received: 16 December 2006
Accepted: 25 April 2008
Published: 16 May 2008
Issue Date: May 2009
DOI: https://doi.org/10.1007/s11227-008-0208-y
