Lookahead Memory Prefetching for CGRAs Using Partial Loop Unrolling

  • Lukas Johannes Jung
  • Christian Hochberger
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10824)

Abstract

Coarse-Grained Reconfigurable Arrays (CGRAs) have become an established approach to providing high computational performance in various environments. Several researchers have found that the achievable performance depends heavily on the interface between memory and the CGRA. In this contribution, we show that a smart prefetching mechanism can increase the performance of the CGRA while consuming fewer hardware resources and less energy than state-of-the-art prefetching mechanisms.

Keywords

Prefetching, Loop unrolling, CGRA

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Electrical Engineering and Information Technology, Computer Systems Group, TU Darmstadt, Darmstadt, Germany
