
DGCNN on FPGA: Acceleration of the Point Cloud Classifier Using FPGAs

Published in Circuits, Systems, and Signal Processing

Abstract

Over the last few years, deep learning on irregular 3D data has become one of the active topics in the field, given its wide range of applications. While field programmable gate array (FPGA)-based acceleration of deep learning models has been shown to produce power-efficient designs compared with other platforms such as CPUs and GPUs, only a few studies have addressed models that consume point clouds as their input. Although tailoring a hardware design to a specific network can open up better optimization opportunities, it is also important to keep the reusability of the design in mind, especially for a new and evolving topic such as learning on point clouds. In this work, we aim to achieve reusability by keeping the hardware isolated from the computational graph. Considering the numerous types of layers used in the dynamic graph convolutional neural network (DGCNN) and its popularity, our proposed design targets the thorough acceleration of DGCNN. The challenges involved, including 18 types of tensor operations, achieving burst transfers, dealing with kernel complexity, external memory banks, in-order and out-of-order execution modes, and approaches with multiple processing elements, are explained in detail throughout the paper. Our experiments on a single FPGA with a single bitstream, a DDR4 memory subsystem, and the Float32 data type demonstrate speedups of \(2.73\times \) to \(8.4\times \) compared to a sequential single-threaded implementation on an Intel Core i7-6700HQ.
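
To illustrate the idea of keeping the hardware isolated from the computational graph, the sketch below models a host-side dispatcher that walks a small dependency graph of generic tensor-operation descriptors and issues any task whose dependencies are already satisfied, i.e., an out-of-order execution mode rather than strictly following the listed order. This is a minimal C++ illustration under our own assumptions, not code from the paper or its repository; the names `TensorOp`, `Task`, and `launch_kernel`, and the chosen subset of operations, are hypothetical.

```cpp
// Hypothetical host-side sketch (not the authors' code): the fixed FPGA
// bitstream is driven by generic tensor-operation descriptors, so changing
// the network only changes the descriptor list, not the hardware.
#include <cstdio>
#include <vector>

// A small subset of the 18 tensor-operation types mentioned in the abstract.
enum class TensorOp { MatMul, Concat, ReduceMax, TopK, Gather, ReLU };

struct Task {
    TensorOp op;            // which generic kernel mode to run
    std::vector<int> deps;  // indices of tasks this one waits on
};

// Placeholder for enqueueing one operation on the accelerator.
static void launch_kernel(int id, TensorOp op) {
    std::printf("launch task %d (op %d)\n", id, static_cast<int>(op));
}

// Out-of-order issue: run any task whose dependencies have completed,
// instead of strictly following the graph's textual order.
static void run_out_of_order(const std::vector<Task>& graph) {
    std::vector<bool> done(graph.size(), false);
    bool progress = true;
    while (progress) {
        progress = false;
        for (size_t i = 0; i < graph.size(); ++i) {
            if (done[i]) continue;
            bool ready = true;
            for (int d : graph[i].deps) ready = ready && done[d];
            if (ready) {
                launch_kernel(static_cast<int>(i), graph[i].op);
                done[i] = true;
                progress = true;
            }
        }
    }
}

int main() {
    // Tiny stand-in for an EdgeConv-style subgraph: two independent branches
    // feeding a concat, so the scheduler is free to reorder the branches.
    std::vector<Task> graph = {
        {TensorOp::MatMul,    {}},      // 0
        {TensorOp::TopK,      {}},      // 1
        {TensorOp::Gather,    {1}},     // 2
        {TensorOp::Concat,    {0, 2}},  // 3
        {TensorOp::ReduceMax, {3}},     // 4
    };
    run_out_of_order(graph);
    return 0;
}
```

In an in-order mode, the same descriptors would simply be issued in their listed order; as the abstract describes, the benefit of this separation is that a new network only requires a different descriptor list from the host, while the single bitstream is reused.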


Data Availability

All data included in this study are available in DeepPoint-V2-FPGA’s repository, https://doi.org/10.5281/zenodo.6397222.

Notes

  1. https://github.com/salehjg/MeshToPointcloudFPS.

  2. https://github.com/salehjg/Shapenet2_Preparation.


Author information

Corresponding author

Correspondence to Ghader Karimian.

Ethics declarations

Competing interests

The authors declare that they have no financial or nonfinancial competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Jamali Golzar, S., Karimian, G., Shoaran, M. et al. DGCNN on FPGA: Acceleration of the Point Cloud Classifier Using FPGAs. Circuits Syst Signal Process 42, 748–779 (2023). https://doi.org/10.1007/s00034-022-02179-0

