Machine Learning for Agile FPGA Design

Machine Learning Applications in Electronic Design Automation

Abstract

Field-programmable gate arrays (FPGAs) have become a popular means of hardware acceleration because they offer massive parallelism, flexible configurability, and potentially higher performance per watt. However, the heterogeneous architecture of modern FPGAs and the multiple abstractions across design stages present unprecedented challenges to FPGA design tasks such as quality of results (QoR) estimation and design space exploration, necessitating considerable manual effort for design optimization. Recently, machine learning (ML) has been applied extensively to such FPGA design tasks to minimize human supervision. In this chapter, we provide a comprehensive review of ML techniques that hold promise to significantly enhance FPGA design automation. First, we give a brief overview of the FPGA design flow, followed by our insights into applying ML for enhanced agility in FPGA design automation. Then, we discuss representative works that apply ML to FPGA design automation in two different ways: ML as a predictor to improve QoR estimation, and ML as a decision-maker to automate FPGA design space exploration to iteratively improve QoR estimation. Next, we present multiple recent case studies in detail to showcase effective applications of ML in FPGA design optimization tasks. Finally, we highlight additional challenges and future opportunities to motivate more ML-based solutions for fast and accurate FPGA design automation.


References

  1. ABC: A System for Sequential Synthesis and Verification. http://www.eecs.berkeley.edu/alanmi/abc. Accessed December 14, 2022

  2. Abts, D., Ross, J., Sparling, J., Wong-VanHaren, M., Baker, M., Hawkins, T., Bell, A., Thompson, J., Kahsai, T., Kimmell, G., Hwang, J., Leslie-Hurd, R., Bye, M., Creswick, E., Boyd, M., Venigalla, M., Laforge, E., Purdy, J., Kamath, P., Maheshwari, D., Beidler, M., Rosseel, G., Ahmad, O., Gagarin, G., Czekalski, R., Rane, A., Parmar, S., Werner, J., Sproch, J., Macias, A., Kurtz, B.: Think fast: a tensor streaming processor (TSP) for accelerating deep learning workloads. In: International Symposium on Computer Architecture (ISCA) (2020)

    Google Scholar 

  3. Alawieh, M.B., Li, W., Lin, Y., Singhal, L., Iyer, M.A., Pan, D.Z.: High-definition routing congestion prediction for large-scale FPGAs. In: Asia and South Pacific Design Automation Conference (ASP-DAC) (2020)

    Google Scholar 

  4. Al-Hyari, A., Szentimrey, H., Shamli, A., Martin, T., Gréwal, G., Areibi, S.: A deep learning framework to predict routability for FPGA circuit placement. ACM Trans. Reconfig. Technol. Syst. 14(3), (2021)

    Google Scholar 

  5. Al-Khaleel, O., Baktır, S., Küpçü, A.: FPGA Implementation of an ECC processor using Edwards curves and DFT modular multiplication. In: International Conference on Information and Communication Systems (ICICS) (2021)

    Google Scholar 

  6. Amaru, L., Gaillardon, P.E., De Micheli, G.: Majority-inverter graph: a new paradigm for logic optimization. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 35(5), 806–819 (2015)

    Article  Google Scholar 

  7. An In-Depth Look at Google’s First Tensor Processing Unit. https://cloud.google.com/blog/big-data/2017/05/an-in-depth-look-at-googles-first-tensor-processing-unit-tpu. Accessed: December 14, 2022

  8. Ansel, J., Kamil, S., Veeramachaneni, K., Ragan-Kelley, J., Bosboom, J., O’Reilly, U.M., Amarasinghe, S.: OpenTuner: an extensible framework for program autotuning. In: International Conference on Parallel Architectures and Compilation Techniques (PACT) (2014)

    Google Scholar 

  9. Asiatici, M., Ienne, P.: Large-scale graph processing on FPGAs with caches for thousands of simultaneous misses. In: International Symposium on Computer Architecture (ISCA) (2021)

    Google Scholar 

  10. Balupala, H.K., Rahul, K., Yachareni, S.: Galois field arithmetic operations using Xilinx FPGAs in cryptography. In: International IOT, Electronics and Mechatronics Conference (IEMTRONICS) (2021)

    Google Scholar 

  11. Banerjee, K., Karfa, C., Sarkar, D., Mandal, C.: Verification of code motion techniques using value propagation. IEEE Trans. Comput. Aided Design Integ. Circuits Syst. (2014)

    Google Scholar 

  12. Canis, A., Choi, J., Aldham, M., Zhang, V., Kammoona, A., Anderson, J.H., Brown, S., Czajkowski, T.: LegUp: high-level synthesis for FPGA-based processor/accelerator systems. In: International Symposium on Field-Programmable Gate Arrays (FPGA) (2011)

    Google Scholar 

  13. Capligins, F., Litvinenko, A., Aboltins, A., Kolosovs, D.: FPGA Implementation and study of synchronization of modified Chua’s circuit-based chaotic oscillator for high-speed secure communications. In: Workshop on Advances in Information, Electronic and Electrical Engineering (AIEEE) (2021)

    Google Scholar 

  14. Castells-Rufas, D., Marco-Sola, S., Moure, J.C., Aguado, Q., Espinosa, A.: FPGA acceleration of pre-alignment filters for short read mapping with HLS. IEEE Access, 10, 22079–22100 (2022)

    Article  Google Scholar 

  15. Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: International Conference on Knowledge Discovery and Data Mining (KDD) (2016)

    Google Scholar 

  16. Chen, X., Tian, Y.: Learning to perform local rewriting for combinatorial optimization. In: International Conference on Neural Information Processing Systems (NeurIPS) (2019)

    Google Scholar 

  17. Cheng, L., Wong, M.D.: Floorplan design for multimillion gate FPGAs. IEEE Trans. Comput. Aided Design Integr. Circuits Syst. 25(12), 2795–2805 (2006)

    Article  Google Scholar 

  18. Chen, T., Du, Z., Sun, N., Wang, J., Wu, C., Chen, Y., Temam, O.: DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning. In: International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) (2014)

    Google Scholar 

  19. Chen, Y., Luo, T., Liu, S., Zhang, S., He, L., Wang, J., Li, L., Chen, T., Xu, Z., Sun, N., Temam, O.: DaDianNao: a machine-learning supercomputer. IEEE Micro, 609–622 (2014)

    Google Scholar 

  20. Chen, Y.H., Emer, J., Sze, V.: Eyeriss: a spatial architecture for energy-efficient dataflow for convolutional neural networks. In: International Symposium on Computer Architecture (ISCA) (2016)

    Google Scholar 

  21. Chen, X., Cheng, F., Tan, H., Chen, Y., He, B., Wong, W.F., Chen, D.: ThunderGP: resource-efficient graph processing framework on FPGAs with HLS. ACM Trans. Reconfig. Technol. Syst. (2022)

    Google Scholar 

  22. Cong, J., Ding, Y.: FlowMap: an optimal technology mapping algorithm for delay optimization in lookup-table based FPGA designs. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. (1994)

    Google Scholar 

  23. Cong, J., Ding, Y.: On area/depth trade-off in LUT-based FPGA technology mapping. IEEE Trans. Very Large Scale Integr. Syst. 2(2), 137–148 (1994)

    Article  Google Scholar 

  24. Cong, J., Zhang, Z.: An efficient and versatile scheduling algorithm based on SDC formulation. In: Design Automation Conference (DAC) (2006)

    Google Scholar 

  25. Cong, J., Liu, B., Neuendorffer, S., Noguera, J., Vissers, K., Zhang, Z.: High-level synthesis for FPGAs: from prototyping to deployment. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 30(4), 473–491 (2011)

    Article  Google Scholar 

  26. Dai, S., Zhou, Y., Zhang, H., Ustun, E., Young, E.F., Zhang, Z.: Fast and accurate estimation of quality of results in high-level synthesis with machine learning. In: IEEE Symposium on Field Programmable Custom Computing Machines (FCCM) (2018)

    Google Scholar 

  27. Damiani, A., Fiscaletti, G., Bacis, M., Brondolin, R., Santambrogio, M.D.: BlastFunction: a full-stack framework bringing FPGA hardware acceleration to cloud-native applications. ACM Trans. Reconfig. Technol. Syst. 15(2), 1–27 (2022)

    Article  Google Scholar 

  28. De Micheli, G.: Synthesis and Optimization of Digital Circuits. McGraw Hill, New York (1994)

    Google Scholar 

  29. Dennard, R., Gaensslen, F., Yu, H.N., Rideout, V., Bassous, E., LeBlanc, A.: Design of ion-implanted MOSFET’s with very small physical dimensions. IEEE J. Solid State Circuits, 9(5), 256–268 (1974)

    Article  Google Scholar 

  30. Du, Y., Hu, Y., Zhou, Z., Zhang, Z.: High-performance sparse linear algebra on HBM-equipped FPGAs using HLS: a case study on SpMV. In: International Symposium on Field-Programmable Gate Arrays (FPGA) (2022)

    Google Scholar 

  31. Du, Z., Fasthuber, R., Chen, T., Ienne, P., Li, L., Luo, T., Feng, X., Chen, Y., Temam, O.: ShiDianNao: shifting vision processing closer to the sensor. In: International Symposium on Computer Architecture (ISCA) (2015)

    Google Scholar 

  32. Farooq, U., Hasan, N.U., Baig, I., Zghaibeh, M.: Efficient FPGA routing using reinforcement learning. In: International Conference on Information and Communication Systems (ICICS) (2021)

    Google Scholar 

  33. Ferrandi, F., Castellana, V.G., Curzel, S., Fezzardi, P., Fiorito, M., Lattuada, M., Minutoli, M., Pilato, C., Tumeo, A.: Bambu: an open-source research framework for the high-level synthesis of complex applications. Design Automation Conf. (DAC) (2021)

    Google Scholar 

  34. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: International Conference on Neural Information Processing Systems (NeurIPS) (2014)

    Google Scholar 

  35. Gudur, V.Y., Maheshwari, S., Acharyya, A., Shafik, R.: An FPGA based energy-efficient read mapper with parallel filtering and in-situ verification. ACM Trans. Comput. Biol. Bioinformat. 1–1 (2021)

    Google Scholar 

  36. Guo, L., Maidee, P., Zhou, Y., Lavin, C., Wang, J., Chi, Y., Qiao, W., Kaviani, A., Zhang, Z., Cong, J.: RapidStream: parallel physical implementation of FPGA HLS designs. In: International Symposium on Field-Programmable Gate Arrays (FPGA) (2022)

    Google Scholar 

  37. Haghi, A., Marco-Sola, S., Alvarez, L., Diamantopoulos, D., Hagleitner, C., Moreto, M.: An FPGA accelerator of the wavefront algorithm for genomics pairwise alignment. In: International Conference on Field Programmable Logic and Applications (FPL) (2021)

    Google Scholar 

  38. Ham, T.J., Lee, Y., Seo, S.H., Song, U.G., Lee, J.W., Bruns-Smith, D., Sweeney, B., Asanovic, K., Oh, Y.H., Wills, L.W.: Accelerating genomic data analytics with composable hardware acceleration framework. IEEE Micro, 41(3), 42–49 (2021)

    Article  Google Scholar 

  39. Hamilton, W., Ying, Z., Leskovec, J.: Inductive representation learning on large graphs. In: International Conference on Neural Information Processing Systems (NeurIPS) (2017)

    Google Scholar 

  40. Han, S., Liu, X., Mao, H., Pu, J., Pedram, A., Horowitz, M.A., Dally, W.J.: EIE: efficient inference engine on compressed deep neural network. In: International Symposium on Computer Architecture (ISCA) (2016)

    Google Scholar 

  41. Han, S., Kang, J., Mao, H., Hu, Y., Li, X., Li, Y., Xie, D., Luo, H., Yao, S., Wang, Y., Yang, H., Dally, W.B.J.: ESE: efficient speech recognition engine with sparse LSTM on FPGA. In: International Symposium on Field-Programmable Gate Arrays (FPGA) (2017)

    Google Scholar 

  42. Handa, M., Vemuri, R.: An efficient algorithm for finding empty space for online FPGA placement. In: Design Automation Conference (DAC) (2004)

    Google Scholar 

  43. Hara, Y., Tomiyama, H., Honda, S., Takada, H., Ishii, K.: CHStone: a benchmark program suite for practical C-based high-level synthesis. In: International Symposium on Circuits and Systems (ISCAS) (2008)

    Google Scholar 

  44. Hassan, M.W., Athanas, P.M., Hanafy, Y.Y.: Domain-specific modeling and optimization for graph processing on FPGAs. In: International Symposium on Applied Reconfigurable Computing. Architectures (ARC) (2021)

    Google Scholar 

  45. Hegde, K., Asghari-Moghaddam, H., Pellauer, M., Crago, N., Jaleel, A., Solomonik, E., Emer, J., Fletcher, C.W.: ExTensor: an accelerator for sparse tensor algebra. IEEE Micro, 319–333 (2019)

    Google Scholar 

  46. Herklotz, Y., Pollard, J.D., Ramanathan, N., Wickerson, J.: Formal verification of high-level synthesis. In: Intl’l Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA) (2021)

    Google Scholar 

  47. Hornik, K., Stinchcombe, M., White, H.: Multilayer feedforward networks are universal approximators. Neural Netw. 2(5), 359–36 (1989)

    Article  MATH  Google Scholar 

  48. Hosny, A., Hashemi, S., Shalan, M., Reda, S.: Drills: deep reinforcement learning for logic synthesis. In: Asia and South Pacific Design Automation Conference (ASP-DAC) (2020)

    Google Scholar 

  49. Hu, Y., Du, Y., Ustun, E., Zhang, Z.: GraphLily: accelerating graph linear algebra on HBM-equipped FPGAs. In: International Conference on Computer-Aided Design (ICCAD) (2021)

    Google Scholar 

  50. Intel HLS Compiler. https://www.intel.com/content/www/us/en/software/programmable/quartus-prime/hls-compiler.html. Accessed: December 14, 2022

  51. Jia, W., Shaw, K.A., Martonosi, M.: Stargazer: automated regression-based GPU design space exploration. In: International Symposium on Performance Analysis of Systems and Software (ISPASS) (2012)

    Google Scholar 

  52. Kapre, N., Ng, H., Teo, K., Naude, J.: InTime: a machine learning approach for efficient selection of FPGA CAD tool parameters. In: International Symposium on Field-Programmable Gate Arrays (FPGA) (2015)

    Google Scholar 

  53. Karfa, C., Mandal, C., Sarkar, D., Pentakota, S.R., Reade, C.: A formal verification method of scheduling in high-level synthesis. In: International Symposium on Quality Electronic Design (ISQED) (2006)

    Google Scholar 

  54. Kim, J., Kang, J.K., Kim, Y.: A resource efficient integer-arithmetic-only FPGA-based CNN accelerator for real-time facial emotion recognition. IEEE Access, 9, 104367–104381 (2021)

    Article  Google Scholar 

  55. Knag, P., Kim, J.K., Chen, T., Zhang, Z.: A sparse coding neural network ASIC with on-chip learning for feature extraction and encoding. IEEE J. Solid State Circuits, 50(4), 1070–1079 (2015)

    Article  Google Scholar 

  56. Knaust, M., Seiler, E., Reinert, K., Steinke, T.: Co-design for energy efficient and fast genomic search: interleaved bloom filter on FPGA. In: International Symposium on Field-Programmable Gate Arrays (FPGA) (2022)

    Google Scholar 

  57. Kurek, M., Becker, T., Chau, T.C., Luk, W.: Automating optimization of reconfigurable designs. In: IEEE Symposium on Field Programmable Custom Computing Machines (FCCM) (2014)

    Google Scholar 

  58. Kurek, M., Deisenroth, M.P., Luk, W., Todman, T.: Knowledge transfer in automatic optimisation of reconfigurable designs. In: IEEE Symposium on Field Programmable Custom Computing Machines (FCCM) (2016)

    Google Scholar 

  59. Kwon, J., Carloni, L.P.: Transfer learning for design-space exploration with high-level synthesis. In: ACM/IEEE Workshop on Machine Learning for CAD (MLCAD) (2020)

    Google Scholar 

  60. Lai, Y., Ustun, E., Xiang, S., Fang, Z., Rong, H., Zhang, Z.: Programming and synthesis for software-defined FPGA acceleration: status and future prospects. ACM Trans. Reconfig. Technol. Syst. 14(4), 1–39 (2021)

    Article  Google Scholar 

  61. Lee, J., Song, T., He, J., Kandeepan, S., Wang, K.: Recurrent neural network FPGA hardware accelerator for delay-tolerant indoor optical wireless communications. Opt. Express, 29(16), 26165–26182 (2021)

    Article  Google Scholar 

  62. Li, H., Katkoori, S., Mak, W.K.: Power minimization algorithms for LUT-based FPGA technology mapping. ACM Trans. Design Automat. Electron. Syst. 9(1), 33–51 (2004)

    Article  Google Scholar 

  63. Li, D., Yao, S., Liu, Y.H., Wang, S., Sun, X.H.: Efficient design space exploration via statistical sampling and AdaBoost learning. In: Design Automation Conference (DAC) (2016)

    Google Scholar 

  64. Liang, S., Yin, S., Liu, L., Luk, W., Wei, S.: FP-BNN: binarized neural network on FPGA. Neurocomputing, 275(31), 1072–1086 (2018)

    Article  Google Scholar 

  65. Lin, J.Y., Jagannathan, A., Cong, J.: Placement-driven technology mapping for LUT-based FPGAs. In: International Symposium on Field-Programmable Gate Arrays (FPGA) (2003)

    Google Scholar 

  66. Lin, Y., Jiang, Z., Gu, J., Li, W., Dhar, S., Ren, H., Khailany, B., Pan, D.Z.: DREAMPlace: deep learning toolkit-enabled GPU acceleration for modern VLSI placement. IEEE Trans. Comput Aided Design Integr. Circuits Syst. 40(4), 748–761 (2021)

    Article  Google Scholar 

  67. Ling, A., Singh, D.P., Brown, S.D.: FPGA technology mapping: a study of optimality. In: Design Automation Conference (DAC) (2005)

    Google Scholar 

  68. Liu, H.Y., Carloni, L.P.: On learning-based methods for design-space exploration with high-level synthesis. In: Design Automation Conference (DAC) (2013)

    Google Scholar 

  69. Liu, D., Schafer, B.C.: Efficient and reliable high-level synthesis design space explorer for FPGAs. In: International Conference on Field Programmable Logic and Applications (FPL) (2016)

    Google Scholar 

  70. Liu, D., Chen, T., Liu, S., Zhou, J., Zhou, S., Temam, O., Feng, X., Zhou, X., Chen, Y.: PuDianNao: a polyvalent machine learning accelerator. In: International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) (2015)

    Google Scholar 

  71. Lo, C., Chow, P.: Model-based optimization of high-level synthesis directives. In: International Conference on Field Programmable Logic and Applications (FPL) (2016)

    Google Scholar 

  72. Lo, C., Chow, P.: Multi-fidelity optimization for high-level synthesis directives. In: International Conference on Field Programmable Logic and Applications (FPL) (2018)

    Google Scholar 

  73. Lo, C., Chow, P.: Hierarchical modelling of generators in design-space exploration. In: IEEE Symposium on Field Programmable Custom Computing Machines (FCCM) (2020)

    Google Scholar 

  74. Luk, W.: Improving performance estimation for FPGA-based accelerators for convolutional neural networks. In: International Symposium on Applied Reconfigurable Computing. Architectures (ARC) (2020)

    Google Scholar 

  75. Maarouf, D., Alhyari, A., Abuowaimer, Z., Martin, T., Gunter, A., Grewal, G., Areibi, S., Vannelli, A.: Machine-learning based congestion estimation for modern FPGAs. In: International Conference on Field Programmable Logic and Applications (FPL) (2018)

    Google Scholar 

  76. Makrani, H.M., Farahmand, F., Sayadi, H., Bondi, S., Dinakarrao, S.M.P., Homayoun, H., Rafatirad, S.: Pyramid: machine learning framework to estimate the optimal timing and resource usage of a high-level synthesis design. In: International Conference on Field Programmable Logic and Applications (FPL) (2019)

    Google Scholar 

  77. Mametjanov, A., Balaprakash, P., Choudary, C., Hovland, P.D., Wild, S.M., Sabin, G.: Autotuning FPGA design parameters for performance and power. In: IEEE Symposium on Field Programmable Custom Computing Machines (FCCM) (2015)

    Google Scholar 

  78. Manco, A., Castrillo, V.U.: An FPGA scalable software-defined radio platform for UAS communications research. J. Commun. 16(2), 42–51 (2021)

    Article  Google Scholar 

  79. Mason, L., Bartlett, P., Baxter, J., Frean, M.: Boosting algorithms as gradient descent. In: International Conference on Neural Information Processing Systems (NeurIPS) (1999)

    Google Scholar 

  80. Mehrabi, A., Manocha, A., Lee, B.C., Sorin, D.J.: Prospector: synthesizing efficient accelerators via statistical learning. In: Design, Automation, and Test in Europe (DATE) (2020)

    Google Scholar 

  81. Meng, P., Althoff, A., Gautier, Q., Kastner, R.: Adaptive threshold non-pareto elimination: re-thinking machine learning for system-level design space exploration on FPGAs. In: Design, Automation, and Test in Europe (DATE) (2016)

    Google Scholar 

  82. Mirza, M., Osindero, S.: Conditional generative adversarial nets (2014). Preprint. arXiv:1411.1784

    Google Scholar 

  83. Mishchenko, A., Chatterjee, S., Brayton, R.K.: DAG-aware AIG rewriting a fresh look at combinational logic synthesis. In: Design Automation Conference (DAC) (2006)

    Google Scholar 

  84. Mishchenko, A., Chatterjee, S., Brayton, R.K.: Improvements to technology mapping for LUT-based FPGAs. IEEE Trans. Comput. Aided Design Integr. Circuits Syst. (TCAD), 26(2), 240–253 (2007)

    Google Scholar 

  85. Murray, K.E., Petelin, O., Zhong, S., Wang, J.M., Eldafrawy, M., Legault, J.P., Sha, E., Graham, A.G., Wu, J., Walker, M.J., et al.: VTR 8: high-performance CAD and customizable FPGA architecture modelling. ACM Trans. Reconfig. Technol. Syst. 13(2), 1–55 (2020)

    Article  Google Scholar 

  86. Neto, W.L., Moreira, M.T., Amaru, L., Yu, C.: SLAP: a supervised learning approach for priority cuts technology mapping. In: Design Automation Conference (DAC) (2021)

    Google Scholar 

  87. Neto, W.L., Moreira, M.T., Amaru, L., Yu, C., Gaillardon, P.E.: Read your circuit: leveraging word embedding to guide logic optimization. In: Asia and South Pacific Design Automation Conference (ASP-DAC) (2021)

    Google Scholar 

  88. Nurvitadhi, E., Sheffield, D., Sim, J., Mishra, A., Venkatesh, G., Marr, D.: Accelerating binarized neural networks: comparison of FPGA, CPU, GPU, and ASIC. In: International Conference on Field Programmable Technology (FPT) (2016)

    Google Scholar 

  89. Nurvitadhi, E., Sim, J., Sheffield, D., Mishra, A., Krishnan, S., Marr, D.: Accelerating recurrent neural networks in analytics servers: comparison of FPGA, CPU, GPU, and ASIC. In: International Conference on Field Programmable Logic and Applications (FPL) (2016)

    Google Scholar 

  90. Nurvitadhi, E., Cook, J., Mishra, A., Marr, D., Nealis, K., Colangelo, P., Ling, A., Capalija, D., Aydonat, U., Dasu, A., Shumarayev, S.: In-package domain-specific ASICs for Intel Stratix 10 FPGAs: a case study of accelerating deep learning using TensorTile ASIC. Int’l Conf. on Field Programmable Logic and Applications (FPL). (2018)

    Google Scholar 

  91. NVIDIA DGX-1. https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/dgx-1/dgx-1-ai-supercomputer-datasheet-v4.pdf. Accessed: December 14, 2022

  92. NVIDIA Hopper H100. https://nvidianews.nvidia.com/news/nvidia-announces-hopper-architecture-the-next-generation-of-accelerated-computing. Accessed: December 14, 2022

  93. NVIDIA PASCAL GP100. https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf. Accessed: December 14, 2022

  94. NVIDIA Tegra - Parker. https://blogs.nvidia.com/blog/2016/08/22/parker-for-self-driving-cars/. Accessed: December 14, 2022

  95. NVIDIA VOLTA GV100. https://devblogs.nvidia.com/parallelforall/inside-volta/. Accessed: December 14, 2022

  96. Papamichael, M.K., Milder, P., Hoe, J.C.: Nautilus: fast automated IP design space search using guided genetic algorithms. In: Design Automation Conference (DAC) (2015)

    Google Scholar 

  97. Papaphilippou, P., Meng, J., Gebara, N., Luk, W.: Hipernetch: high-performance FPGA network switch. ACM Trans. Reconfig. Technol. Syst. 15(1), 1–31 (2021)

    Article  Google Scholar 

  98. Parashar, A., Rhu, M., Mukkara, A., Puglielli, A., Venkatesan, R., Khailany, B., Emer, J.S., Keckler, S.W., Dally, W.J.: SCNN: an accelerator for compressed-sparse convolutional neural networks. In: International Symposium on Computer Architecture (ISCA) (2017)

    Google Scholar 

  99. Pui, C.W., Chen, G., Ma, Y., Young, E.F., Yu, B.: Clock-aware ultrascale FPGA placement with machine learning routability prediction. In: International Conference on Computer-Aided Design (ICCAD) (2017)

    Google Scholar 

  100. Pundir, N., Rahman, F., Farahmandi, F., Tehranipoor, M.: What is all the FaaS about? – remote exploitation of FPGA-as-a-service platforms. Cryptology ePrint Archive, Report 2021/746 (2021)

    Google Scholar 

  101. Rafii, A., Chow, P., Sun, W.: Pharos: a performance monitor for multi-FPGA systems. In: IEEE Symposium on Field Programmable Custom Computing Machines (FCCM) (2021)

    Google Scholar 

  102. Ramachandra, C.N., Nag, A., Balasubramonion, R., Kalsi, G., Pillai, K., Subramoney, S.: ONT-X: an FPGA approach to real-time portable genomic analysis. In: IEEE Symposium on Field Programmable Custom Computing Machines (FCCM) (2021)

    Google Scholar 

  103. Reagen, B., Adolf, R., Shao, Y.S., Wei, G.Y., Brooks, D.: MachSuite: benchmarks for accelerator design and customized architectures. In: International Symposium on Workload Characterization (IISWC) (2014)

    Google Scholar 

  104. Reagen, B., Whatmough, P., Adolf, R., Rama, S., Lee, H., Lee, S.K., Hernández-Lobato, J.M., Wei, G.Y., Brooks, D.: Minerva: enabling low-power, highly-accurate deep neural network accelerators. In: International Symposium on Computer Architecture (ISCA) (2016)

    Google Scholar 

  105. Ronak, B., Fahmy, S.A.: Mapping for maximum performance on FPGA DSP blocks. IEEE Trans. Comput. Aided Design Integr. Circuits Syst. 35(4), 573–585 (2016)

    Article  Google Scholar 

  106. Schafer, B.C., Mahapatra, A.: S2CBench: synthesizable systemC benchmark suite for high-level synthesis. IEEE Embed. Syst. Lett. 6(3), 53–56 (2014)

    Article  Google Scholar 

  107. Sechen, C.: VLSI Placement and Global Routing using Simulated Annealing, vol. 54. Springer Science & Business Media, Berlin (2012)

    Google Scholar 

  108. Shwartz-Ziv, R., Armon, A.: Tabular data: deep learning is not all you need. Infor. Fusion, 81, 84–90 (2022)

    Article  Google Scholar 

  109. Soeken, M., Amaru, L.G., Gaillardon, P.E., De Micheli, G.: Exact synthesis of majority-inverter graphs and its applications. IEEE Trans. Comput. Aided Design Integr. Circuits Syst. 36(11), 1842–1855 (2017)

    Article  Google Scholar 

  110. Soeken, M., Haaswijk, W., Testa, E., Mishchenko, A., Amarù, L.G., Brayton, R.K., De Micheli, G.: Practical exact synthesis. In: Design, Automation, and Test in Europe (DATE) (2018)

    Google Scholar 

  111. Szentimrey, H., Al-Hyari, A., Foxcroft, J., Martin, T., Noel, D., Grewal, G., Areibi, S.: Machine learning for congestion management and routability prediction within FPGA placement. ACM Trans. Design Automat. Electron. Syst. (TODAES), 25(5), 1–25 (2020)

    Google Scholar 

  112. Tang, X., Giacomin, E., Alacchi, A., Chauviere, B., Gaillardon, P.E.: OpenFPGA: an opensource framework enabling rapid prototyping of customizable FPGAs. In: International Conference on Field Programmable Logic and Applications (FPL) (2019)

    Google Scholar 

  113. Testa, E., Soeken, M., Amarù, L., De Micheli, G.: Reducing the multiplicative complexity in logic networks for cryptography and security applications. In: Design Automation Conference (DAC) (2019)

    Google Scholar 

  114. Ustun, E., Xiang, S., Gui, J., Yu, C., Zhang, Z.: LAMDA: Learning-assisted multi-stage autotuning for FPGA design closure. In: IEEE Symposium on Field Programmable Custom Computing Machines (FCCM) (2019)

    Google Scholar 

  115. Ustun, E., Deng, C., Pal, D., Li, Z., Zhang, Z.: Accurate operation delay prediction for FPGA HLS using graph neural networks. In: International Conference on Computer-Aided Design (ICCAD) (2020)

    Google Scholar 

  116. Wang, Z., Schafer, B.C.: Machine learning to set meta-heuristic specific parameters for high-level synthesis design space exploration. In: Design Automation Conference (DAC) (2020)

    Google Scholar 

  117. Wang, W., Bolic, M., Parri, J.: pvFPGA: accessing an FPGA-based hardware accelerator in a paravirtualized environment. In: Intl’l Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS) (2013)

    Google Scholar 

  118. Wang, Q., Zheng, L., Huang, Y., Yao, P., Gui, C., Liao, X., Jin, H., Jiang, W., Mao, F.: GraSU: a fast graph update library for FPGA-based dynamic graph processing. In: International Symposium on Field-Programmable Gate Arrays (FPGA) (2021)

    Google Scholar 

  119. Wille, R., Soeken, M., Drechsler, R.: Reducing the number of lines in reversible circuits. In: Design Automation Conference (DAC) (2010)

    Google Scholar 

  120. Wu, N., Xie, Y., Hao, C.: IronMan: GNN-assisted design space exploration in high-level synthesis via reinforcement learning. In: Great Lakes Symposium on VLSI (2021)

    Google Scholar 

  121. Wu, Y., Wang, Q., Zheng, L., Liao, X., Jin, H., Jiang, W., Zheng, R., Hu, K.: FDGLib: a communication library for efficient large-scale graph processing in FPGA-accelerated data centers. J. Comput. Sci. Technol. 36, 1051–1070 (2021)

    Article  Google Scholar 

  122. Xie, Z., Huang, Y.H., Fang, G.Q., Ren, H., Fang, S.Y., Chen, Y., Hu, J.: RouteNet: routability prediction for mixed-size designs using convolutional neural network. In: International Conference on Computer-Aided Design (ICCAD) (2018)

    Google Scholar 

  123. Xilinx Inc.: Floorplanning Methodology Guide (2013)

    Google Scholar 

  124. Xilinx Inc.: UltraScale Architecture Configurable Logic Block (2017)

    Google Scholar 

  125. Xilinx Inc.: UltraScale Architecture DSP Slice User Guide (2019)

    Google Scholar 

  126. Xin, G., Zhao, Y., Han, J.: A multi-layer parallel hardware architecture for homomorphic computation in machine learning. In: International Symposium on Circuits and Systems (ISCAS) (2021)

    Google Scholar 

  127. Xu, C., Liu, G., Zhao, R., Yang, S., Luo, G., Zhang, Z.: A parallel bandit-based approach for autotuning FPGA compilation. In: International Symposium on Field-Programmable Gate Arrays (FPGA) (2017)

    Google Scholar 

  128. Xu, P., Zhang, X., Hao, C., Zhao, Y., Zhang, Y., Wang, Y., Li, C., Guan, Z., Chen, D., Lin, Y.: AutoDNNchip: an automated DNN chip predictor and builder for both FPGAs and ASICs. In: International Symposium on Field-Programmable Gate Arrays (FPGA) (2020)

    Google Scholar 

  129. Yang, L., He, Z., Fan, D.: A fully onchip binarized convolutional neural network FPGA implementation with accurate inference. In: International Symposium on Low Power Electronics and Design (ISLPED) (2018)

    Google Scholar 

  130. Yosys Open Synthesis Suite. https://github.com/YosysHQ/yosys. Accessed: December 14, 2022

  131. Yu, C.: FlowTune: practical multi-armed bandits in boolean optimization. In: International Conference on Computer-Aided Design (ICCAD) (2020)

    Google Scholar 

  132. Yu, C., Zhang, Z.: Painting on placement: forecasting routing congestion using conditional generative adversarial nets. In: Design Automation Conference (DAC) (2019)

    Google Scholar 

  133. Yu, C., Zhou, W.: Decision making in synthesis cross technologies using LSTMs and transfer learning. In: ACM/IEEE Workshop on Machine Learning for CAD (MLCAD) (2020)

  134. Yu, C., Choudhury, M., Sullivan, A., Ciesielski, M.J.: Advanced datapath synthesis using graph isomorphism. In: International Conference on Computer-Aided Design (ICCAD) (2017)

  135. Yu, C., Xiao, H., De Micheli, G.: Developing synthesis flows without human knowledge. In: Design Automation Conference (DAC) (2018)

  136. Zhang, Z., Liu, B.: SDC-based modulo scheduling for pipeline synthesis. In: International Conference on Computer-Aided Design (ICCAD) (2013)

  137. Zeng, H., Prasanna, V.: GraphACT: accelerating GCN training on CPU-FPGA heterogeneous platforms. In: International Symposium on Field-Programmable Gate Arrays (FPGA) (2020)

  138. Zhang, S., Du, Z., Zhang, L., Lan, H., Liu, S., Li, L., Guo, Q., Chen, T., Chen, Y.: Cambricon-X: an accelerator for sparse neural networks. IEEE Micro, 1–12 (2016)

  139. Zhang, X., Wang, J., Zhu, C., Lin, Y., Xiong, J., Hwu, W.m., Chen, D.: DNNBuilder: an automated tool for building high-performance DNN hardware accelerators for FPGAs. In: International Conference on Computer-Aided Design (ICCAD) (2018)

  140. Zhang, C., Hu, H., Cao, S., Jiang, Z.: A novel blind detection method and FPGA implementation for energy-efficient sidelink communications. In: Workshop on Signal Processing Systems (SiPS) (2021)

  141. Zhang, Y., Pan, J., Liu, X., Chen, H., Chen, D., Zhang, Z.: FracBNN: accurate and FPGA-efficient binary neural networks with fractional activations. In: International Symposium on Field-Programmable Gate Arrays (FPGA) (2021)

  142. Zhang, Y., Zhang, Z., Lew, L.: PokeBNN: a binary pursuit of lightweight accuracy. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2022)

  143. Zhao, J., Liang, T., Sinha, S., Zhang, W.: Machine learning based routing congestion prediction in FPGA high-level synthesis. In: Design, Automation, and Test in Europe (DATE) (2019)

  144. Zhou, Y., Gupta, U., Dai, S., Zhao, R., Srivastava, N., Jin, H., Featherston, J., Lai, Y.H., Liu, G., Velasquez, G.A., Wang, W., Zhang, Z.: Rosetta: a realistic high-level synthesis benchmark suite for software programmable FPGAs. In: International Symposium on Field-Programmable Gate Arrays (FPGA) (2018)

  145. Zhou, Y., Gupta, U., Dai, S., Zhao, R., Srivastava, N., Jin, H., Featherston, J., Lai, Y.H., Liu, G., Velasquez, G.A., et al.: Rosetta: a realistic high-level synthesis benchmark suite for software programmable FPGAs. In: International Symposium on Field-Programmable Gate Arrays (FPGA) (2018)

  146. Zhou, S., Kannan, R., Prasanna, V.K., Seetharaman, G., Wu, Q.: HitGraph: high-throughput graph processing framework on FPGA. IEEE Trans. Parallel Distrib. Syst. 30(10), 2249–2264 (2019)

  147. Zhu, K., Liu, M., Chen, H., Zhao, Z., Pan, D.Z.: Exploring logic optimizations with reinforcement learning and graph convolutional network. In: ACM/IEEE Workshop on Machine Learning for CAD (MLCAD) (2020)

  148. Zhu, Y., Zhu, M., Yang, B., Zhu, W., Deng, C., Chen, C., Wei, S., Liu, L.: LWRpro: an energy-efficient configurable crypto-processor for Module-LWR. IEEE Trans. Circuits Syst. I 68(3), 1146–1159 (2021)

  149. Ziegler, M.M., Bertran, R., Buyuktosunoglu, A., Bose, P.: Machine learning techniques for taming the complexity of modern hardware design. IBM J. Res. Develop. 61(4/5), 13:1–13:14 (2017)

Author information

Corresponding author

Correspondence to Debjit Pal.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Pal, D., Deng, C., Ustun, E., Yu, C., Zhang, Z. (2022). Machine Learning for Agile FPGA Design. In: Ren, H., Hu, J. (eds) Machine Learning Applications in Electronic Design Automation. Springer, Cham. https://doi.org/10.1007/978-3-031-13074-8_16
