Dynamical Self-Reconfigurable Mechanism for Data-Driven Cell Array

Abstract

Both the utilization of computation resources and the reconfiguration time strongly affect the performance of a reconfigurable system. To improve performance, a dynamical self-reconfigurable mechanism for a data-driven cell array is proposed. A cell fires only when the data it needs arrives, and the cell array works in one of two modes: fixed execution and reconfiguration. In reconfiguration mode, cell functions and data flow directions are changed automatically at run time according to the contexts. In addition, by using an H-tree interconnection network and pre-storing multiple application mapping contexts in the reconfiguration buffer, multiple applications can execute concurrently and context switching time is minimized. To verify system performance, several algorithms are mapped onto the proposed structure, and the number of configuration contexts and the execution time are recorded for statistical analysis. The results show that the proposed self-reconfigurable mechanism efficiently reduces the number of contexts and achieves a low computing time.
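The data-driven firing rule and the context-based reconfiguration described above can be illustrated with a minimal software sketch. It is not the authors' hardware design: the `Cell` and `Context` classes, the two operand ports `"a"` and `"b"`, and the direction labels are all assumptions made for illustration only.

```python
import operator
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Context:
    """Hypothetical configuration context: a cell function plus an output direction."""
    func: Callable[[int, int], int]
    out_dir: str  # illustrative label such as "east" or "south"


@dataclass
class Cell:
    """Hypothetical cell holding pre-stored contexts, standing in for a reconfiguration buffer."""
    contexts: List[Context]
    active: int = 0  # index of the context currently in effect
    operands: Dict[str, int] = field(default_factory=dict)

    def receive(self, port: str, value: int) -> Optional[Tuple[str, int]]:
        """Store an operand on port "a" or "b"; fire only when both are present."""
        self.operands[port] = value
        if len(self.operands) < 2:
            return None  # needed data has not all arrived, so the cell stays idle
        ctx = self.contexts[self.active]
        result = ctx.func(self.operands["a"], self.operands["b"])
        self.operands.clear()  # consume the operands after firing
        return ctx.out_dir, result

    def reconfigure(self, index: int) -> None:
        """Switch to another pre-stored context, changing function and direction."""
        self.active = index


cell = Cell([Context(operator.add, "east"), Context(operator.mul, "south")])
first = cell.receive("a", 3)     # None: only one operand so far
fired = cell.receive("b", 4)     # ("east", 7): both operands present, cell fires
cell.reconfigure(1)              # select the second pre-stored context
cell.receive("a", 3)
refired = cell.receive("b", 4)   # ("south", 12): new function and direction
```

Because the contexts are already held by the cell in this sketch, switching between them is a single index update, which loosely mirrors the idea of pre-storing mapping contexts to keep context switching time low.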

Author information

Corresponding author

Correspondence to Rui Shan.

Additional information

Foundation item: the National Natural Science Foundation of China (Nos. 61802304, 61834005, 61772417, 61634004, and 61602377), and the Shaanxi Provincial Co-ordination Innovation Project of Science and Technology (No. 2016KTZDGY02-04-02)

About this article

Cite this article

Shan, R., Jiang, L., Wu, H. et al. Dynamical Self-Reconfigurable Mechanism for Data-Driven Cell Array. J. Shanghai Jiaotong Univ. (Sci.) 26, 511–521 (2021). https://doi.org/10.1007/s12204-021-2319-z

Key words

  • cell array
  • configurable computing
  • self-reconfigurable mechanism
  • data-driven
  • data flow graph

CLC number

  • TP 302
  • TP 391