SMA-Net: Deep learning-based identification and fitting of CAD models from point clouds

Hu, Sijie; Polette, Arnaud; Pernot, Jean-Philippe

doi:10.1007/s00366-022-01648-z

Original Article
Published: 13 April 2022

Volume 38, pages 5467–5488, (2022)

Engineering with Computers

Abstract

Identification and fitting is an important task in reverse engineering and virtual/augmented reality. Compared with traditional approaches, deep learning-based methods for such tasks still leave much room for exploration. This paper presents SMA-Net (Spatial Merge Attention Network), a novel deep learning-based, end-to-end, bottom-up architecture specifically focused on fast identification and fitting of CAD models from point clouds. The network is composed of three parts whose strengths are clearly highlighted: a voxel-based multi-resolution feature extractor, a spatial merge attention mechanism and a multi-task head. It is trained with both virtually generated point clouds and as-scanned ones created from multiple instances of CAD models, themselves obtained with randomly generated parameter values. Using this data generation pipeline, the proposed approach is validated on two different data sets that have been made publicly available: a robot data set for Industry 4.0 applications and a furniture data set for virtual/augmented reality. Experiments show that this reconstruction strategy achieves compelling and accurate results at very high speed, and that it is very robust on real data obtained, for instance, by laser scanner and Kinect.
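
To make two ideas from the abstract concrete (instances created from randomly generated parameter values, and a voxel-based multi-resolution view of a point cloud), here is a minimal sketch in plain Python. It is not the authors' pipeline or network: the box-shaped "model", the parameter names and ranges, the voxel sizes, and the function names `sample_box_points`, `voxelize` and `multi_resolution_voxels` are all assumptions chosen for illustration, and points are sampled inside the box volume rather than on scanned surfaces to keep the example short.

```python
import random
from collections import defaultdict


def sample_box_points(width, depth, height, n_points, rng):
    """Sample points uniformly inside an axis-aligned box.

    The box is a hypothetical stand-in for one parametric CAD instance.
    """
    return [
        (rng.uniform(0.0, width), rng.uniform(0.0, depth), rng.uniform(0.0, height))
        for _ in range(n_points)
    ]


def voxelize(points, voxel_size):
    """Group points by integer voxel index; return the centroid of each occupied voxel."""
    buckets = defaultdict(list)
    for x, y, z in points:
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        buckets[key].append((x, y, z))
    return {
        key: tuple(sum(coord) / len(pts) for coord in zip(*pts))
        for key, pts in buckets.items()
    }


def multi_resolution_voxels(points, voxel_sizes):
    """Voxelize the same cloud once per voxel size (here listed coarse to fine)."""
    return {size: voxelize(points, size) for size in voxel_sizes}


rng = random.Random(0)
# Randomly generated parameter values for one instance (names and ranges are assumptions).
params = {
    "width": rng.uniform(0.5, 2.0),
    "depth": rng.uniform(0.5, 2.0),
    "height": rng.uniform(0.5, 1.0),
}
cloud = sample_box_points(params["width"], params["depth"], params["height"], 2000, rng)
pyramid = multi_resolution_voxels(cloud, voxel_sizes=(0.4, 0.2, 0.1))
for size, voxels in pyramid.items():
    print(f"voxel size {size}: {len(voxels)} occupied voxels")
```

In a learning setting, the sampled parameter values would serve as regression targets and the per-resolution voxel sets as inputs to a feature extractor; how SMA-Net actually does this is described in the full article, not in this sketch.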

Author information

Authors and Affiliations

LISPEN, Arts et Métiers Institute of Technology, 13617, Aix-en-Provence, France
Sijie Hu, Arnaud Polette & Jean-Philippe Pernot

Corresponding author

Correspondence to Jean-Philippe Pernot.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix 1

See Tables 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 and 16.

Table 5 Parameter name and index relationship of the chairs in the furniture dataset

Table 6 Parameter name and index relationship of the tables in the furniture dataset

Table 7 Result of Chair_1 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 5

Table 8 Result of Chair_2 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 5

Table 9 Result of Chair_3 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 5

Table 10 Result of Chair_4 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 5

Table 11 Result of Table_1 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 6

Table 12 Result of Table_2 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 6

Table 13 Result of Table_3 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 6

Table 14 Result of Table_4 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 6

Table 15 Result of Table_5 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 6

Table 16 Result of Table_6 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 6

About this article

Cite this article

Hu, S., Polette, A. & Pernot, JP. SMA-Net: Deep learning-based identification and fitting of CAD models from point clouds. Engineering with Computers 38, 5467–5488 (2022). https://doi.org/10.1007/s00366-022-01648-z

Received: 30 November 2021
Accepted: 18 March 2022
Published: 13 April 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s00366-022-01648-z
