Architecture and applications for an All-FPGA parallel computer

Cluster Computing

Abstract

The Reconfigurable Computing Cluster (RCC) project has been investigating unconventional architectures for high-end computing using a cluster of FPGA devices connected by a high-speed, custom network. Most applications use the FPGAs to realize an embedded System-on-a-Chip (SoC) design augmented with application-specific accelerators to form a message-passing parallel computer. Other applications take a single accelerator core and tessellate the core across all of the devices, treating them like a large virtual FPGA. The experimental hardware has also been used for basic computer research by emulating novel architectures. This article discusses the genesis of the overarching project, summarizes the results of the individual investigations that have been completed, and considers how this approach may prove useful in the investigation of future Exascale systems.

Acknowledgements

The authors would like to thank their friends and colleagues William V. Kritikos, Andrew G. Schmidt, Shan Yuan Gao, Ashwin A. Mendon, and Bin Huang, who advanced many of the projects summarized in this report.

This project was supported in part by the National Science Foundation under NSF Grants CNS 05-51688 (CRI) and CNS 04-10790 (EHS). The opinions expressed are those of the authors and not necessarily those of the Foundation.

Corresponding author

Correspondence to Yamuna Rajasekhar.

Cite this article

Rajasekhar, Y., Sass, R. Architecture and applications for an All-FPGA parallel computer. Cluster Comput 17, 315–325 (2014). https://doi.org/10.1007/s10586-013-0278-3

  • DOI: https://doi.org/10.1007/s10586-013-0278-3
