Coarse/Fine-grained Approaches for Pipelining Computing Stages in FPGA-Based Multicore Architectures

  • Ali Azarian
  • João M. P. Cardoso
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8806)


In recent years, there has been increasing interest in using task-level pipelining to accelerate the overall execution of applications that consist mainly of producer/consumer tasks. This paper presents coarse- and fine-grained data flow synchronization approaches to achieve pipelined execution of producer/consumer tasks in FPGA-based multicore architectures. Our approaches are able to speed up the overall execution of successive, data-dependent tasks by using multiple cores and specific customization features provided by FPGAs. An important component of our approach is the use of customized inter-stage buffer schemes to communicate data and to synchronize the cores associated with the producer/consumer tasks. The experimental results show the feasibility of the approach when dealing with producer/consumer tasks with out-of-order communication, and reveal noticeable performance improvements for a number of benchmarks over a single-core implementation that does not use task-level pipelining.
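The idea of a fine-grained inter-stage buffer can be illustrated with a small software analogy. The sketch below is not the authors' implementation: it uses Python threads in place of FPGA cores, and the names `InterStageBuffer`, `put`, `get`, and `run_pipeline` are hypothetical. It assumes one ready flag per buffer element, so the consumer can start as soon as the element it needs is available, even when the producer writes elements in a different order than the consumer reads them. A coarse-grained variant would instead use a single flag for the whole buffer, so the consumer waits until the producer has finished.

```python
import threading


class InterStageBuffer:
    """Hypothetical inter-stage buffer with one ready flag per element."""

    def __init__(self, size):
        self._data = [None] * size
        self._ready = [False] * size          # per-element synchronization flags
        self._cond = threading.Condition()

    def put(self, index, value):
        """Producer side: store a value and mark that element as ready."""
        with self._cond:
            self._data[index] = value
            self._ready[index] = True
            self._cond.notify_all()

    def get(self, index):
        """Consumer side: block until the requested element is ready."""
        with self._cond:
            self._cond.wait_for(lambda: self._ready[index])
            return self._data[index]


def run_pipeline(n):
    """Run a two-stage pipeline with out-of-order communication."""
    buf = InterStageBuffer(n)
    out = [None] * n

    def producer():
        # Writes elements in reverse index order (out of order for the consumer).
        for i in reversed(range(n)):
            buf.put(i, i * i)

    def consumer():
        # Reads elements in ascending index order, waiting on each flag.
        for i in range(n):
            out[i] = buf.get(i) + 1

    threads = [threading.Thread(target=consumer),
               threading.Thread(target=producer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

Calling `run_pipeline(8)` returns the consumer's results in index order regardless of the order in which the producer filled the buffer, because each read blocks only on the flag of the element it needs.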


Keywords: Multicore architectures, Task-level pipelining, FPGA, Producer/consumer, Data synchronization





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Ali Azarian (1)
  • João M. P. Cardoso (1)
  1. Faculty of Engineering, University of Porto and INESC-TEC, Porto, Portugal
