Architecture and applications for an All-FPGA parallel computer

Rajasekhar, Yamuna; Sass, Ron

doi:10.1007/s10586-013-0278-3

Published: 29 June 2013

Cluster Computing, volume 17, pages 315–325 (2014)

Yamuna Rajasekhar¹ &
Ron Sass¹

Abstract

The Reconfigurable Computing Cluster (RCC) project has been investigating unconventional architectures for high-end computing using a cluster of FPGA devices connected by a high-speed, custom network. Most applications use the FPGAs to realize an embedded System-on-a-Chip (SoC) design augmented with application-specific accelerators to form a message-passing parallel computer. Other applications take a single accelerator core and tessellate it across all of the devices, treating them as one large virtual FPGA. The experimental hardware has also been used for basic computer research by emulating novel architectures. This article discusses the genesis of the overarching project, summarizes the results of the individual investigations that have been completed, and considers how this approach may prove useful in the investigation of future Exascale systems.

Acknowledgements

The authors would like to thank their friends and colleagues (William V. Kritikos, Andrew G. Schmidt, Shan Yuan Gao, Ashwin A. Mendon, and Bin Huang) who advanced many of the projects summarized in this report.

This project was supported in part by the National Science Foundation under NSF Grants CNS 05-51688 (CRI) and CNS 04-10790 (EHS). The opinions expressed are those of the authors and not necessarily those of the Foundation.

Author information

Authors and Affiliations

Reconfigurable Computing Systems Laboratory, University of North Carolina at Charlotte, 9201 University City Blvd, Charlotte, NC, 28223, USA
Yamuna Rajasekhar & Ron Sass

Corresponding author

Correspondence to Yamuna Rajasekhar.

About this article

Cite this article

Rajasekhar, Y., Sass, R. Architecture and applications for an All-FPGA parallel computer. Cluster Comput 17, 315–325 (2014). https://doi.org/10.1007/s10586-013-0278-3

Received: 17 February 2013
Accepted: 13 May 2013
Published: 29 June 2013
Issue Date: June 2014
DOI: https://doi.org/10.1007/s10586-013-0278-3
