Abstract
Over the last few years, deep learning on irregular 3D data has become one of the most active topics in the field, given its wide range of applications. While field-programmable gate array (FPGA)-based acceleration of deep learning models has been shown to produce power-efficient designs compared with other platforms such as CPUs and GPUs, only a few studies have addressed models that consume point clouds as input. Although tailoring a hardware design to a specific network can open better optimization opportunities, it is also important to keep the design reusable, especially for a new and evolving topic like learning on point clouds. In this work, we aim to achieve reusability by keeping the hardware isolated from the computational graph. Given the variety of layer types used in the dynamic graph convolutional neural network (DGCNN) and its popularity, our proposed design targets the thorough acceleration of DGCNN. The challenges, including supporting 18 types of tensor operations, achieving burst transfers, dealing with kernel complexities, using external memory banks, supporting in-order and out-of-order execution modes, and employing multiple processing elements, are explained in detail throughout the paper. Our experiments on a single FPGA with a single bitstream, a DDR4 memory subsystem, and the Float32 data type demonstrate speedups of \(2.73\times \) to \(8.4\times \) compared to a sequential single-threaded implementation on an Intel Core i7-6700HQ.
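The reusability idea above, keeping the FPGA hardware isolated from the computational graph so that a new network changes only the graph and not the bitstream, can be illustrated with a minimal host-side sketch. This is not the authors' code; the names (`Node`, `run_graph`, `KERNELS`) and the three sample operations are hypothetical stand-ins for the generic tensor-operation kernels the design implements in hardware.

```python
from dataclasses import dataclass

# Generic "kernels" standing in for the fixed tensor operations the hardware
# exposes; the real design implements 18 such operation types behind one
# interface, so only this dispatch table would be backed by the FPGA.
KERNELS = {
    "add": lambda a, b: [x + y for x, y in zip(a, b)],
    "scale": lambda a, s: [x * s for x in a],
    "relu": lambda a: [max(0.0, x) for x in a],
}

@dataclass
class Node:
    op: str      # kernel name to dispatch to
    args: tuple  # mix of graph/feed identifiers and literal operands

def run_graph(graph, feeds):
    """Execute nodes in declaration order (a simple in-order schedule)."""
    results = dict(feeds)
    for node_id, node in graph.items():
        # Resolve each argument: a known identifier yields a prior result,
        # anything else is passed through as a literal operand.
        args = [results[a] if a in results else a for a in node.args]
        results[node_id] = KERNELS[node.op](*args)
    return results

# A tiny graph: y2 = relu(0.5 * (x + b)); swapping in a different network
# means building a different graph, not regenerating the hardware.
graph = {
    "y0": Node("add", ("x", "b")),
    "y1": Node("scale", ("y0", 0.5)),
    "y2": Node("relu", ("y1",)),
}
out = run_graph(graph, {"x": [-2.0, 4.0], "b": [0.0, 2.0]})
# out["y2"] == [0.0, 3.0]
```

Under this separation, out-of-order execution (mentioned above) would amount to reordering independent nodes in the schedule while the kernel set stays fixed.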
Data Availability
All data included in this study are available in DeepPoint-V2-FPGA’s repository, https://doi.org/10.5281/zenodo.6397222.
Ethics declarations
Competing interests
The authors declare that they have no financial or nonfinancial competing interests.
Cite this article
Jamali Golzar, S., Karimian, G., Shoaran, M. et al. DGCNN on FPGA: Acceleration of the Point Cloud Classifier Using FPGAs. Circuits Syst Signal Process 42, 748–779 (2023). https://doi.org/10.1007/s00034-022-02179-0