Abstract
Identification and fitting is an important task in reverse engineering and virtual/augmented reality. Compared to the traditional approaches, carrying out such tasks with a deep learning-based method have much room to exploit. This paper presents SMA-Net (Spatial Merge Attention Network), a novel deep learning-based end-to-end bottom-up architecture, specifically focused on fast identification and fitting of CAD models from point clouds. The network is composed of three parts whose strengths are clearly highlighted: voxel-based multi-resolution feature extractor, spatial merge attention mechanism and multi-task head. It is trained with both virtually-generated point clouds and as-scanned ones created from multiple instances of CAD models, themselves obtained with randomly generated parameter values. Using this data generation pipeline, the proposed approach is validated on two different data sets that have been made publicly available: robot data set for Industry 4.0 applications, and furniture data set for virtual/augmented reality. Experiments show that this reconstruction strategy achieves compelling and accurate results in a very high speed, and that it is very robust on real data obtained for instance by laser scanner and Kinect.
Similar content being viewed by others
References
Armeni I, Sener O, Zamir AR, Jiang H, Brilakis I, Fischer M, Savarese S (2016) 3D semantic parsing of large-scale indoor spaces. In IEEE Conf. on comput. vision and pattern recognition, pp 1534–1543
Avetisyan A, Dahnert M, Dai A, Savva M, Chang AX, Nießner M (2019) Scan2cad: learning cad model alignment in rgb-d scans. In Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, pp 2614–2623
Avetisyan A, Dai A, Nießner M (2019) End-to-end cad model retrieval and 9dof alignment in 3d scans. In: Proceedings of the IEEE/CVF International Conference on computer vision, pp 2551–2560
Ba JL, Kiros JR, Hinton GE (2016) Layer normalization. arXiv preprint arXiv:1607.06450
Bey A, Chaine R, Marc R, Thibault G, Akkouche S (2011) Reconstruction of consistent 3d CAD models from point cloud data using a priori CAD models. ISPRS 3812:289–294
Buonamici F, Carfagni M, Furferi R, Governi L, Lapini A, Volpe Y (2018) Reverse engineering modeling methods and tools: a survey. Comput-Aided Des Appl 15(3):443–464
Buonamici F, Carfagni M, Furferi R, Governi L, Lapini A, Volpe Y (2018) Reverse engineering of mechanical parts: a template-based approach. J Comput Des Eng 5(2):145–159
Charles RQ, Su H, Kaichun M, Guibas LJ (2017) PointNet: deep learning on point sets for 3D classification and segmentation. In: 2017 IEEE Conf. on comput. vision and pattern recognition (CVPR), pp 77–85
Chen X, Ma H, Wan J, Li B, Xia T (2017) Multi-view 3D object detection network for autonomous driving. In: 2017 IEEE Conference on computer vision and pattern recognition (CVPR), pp 6526–6534
Choy C, Dong W, Koltun V (2020) Deep global registration. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 2514–2523
Choy C, Gwak J, Savarese S (2019) 4D spatio-temporal ConvNets: Minkowski convolutional neural networks. In: 2019 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 3070–3079
Choy C, Park J, Koltun V (2019) Fully convolutional geometric features. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp 8958–8966
Chu X, Tian Z, Wang Y, Zhang B, Ren H, Wei X, Xia H, Shen C (2021) Twins: revisiting the design of spatial attention in vision transformers. 1(2):3 arXiv preprint arXiv:2104.13840
Dai A, Chang AX, Savva M, Halber M, Funkhouser T, Nießner M (2017) ScanNet: richly-annotated 3D reconstructions of indoor scenes. In: IEEE Conf. on comput. vision and pattern recognition, pp 2432–2443
Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, et al (2020) An image is worth 16x16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929
Erdös G, Nakano T, Váncza J (2014) Adapting CAD models of complex engineering objects to measured point cloud data. CIRP Ann 63(1):157–160
Ester M, Kriegel H-P, Sander J, Xu X (1996) Density-based spatial clustering of applications with noise. In: Int. Conf. knowledge discovery and data mining, vol 240, p 6
Fayolle P-A, Pasko A (2015) User-assisted reverse modeling with evolutionary algorithms. In: IEEE Congress on evolutionary computation, pp 2176–2183. https://doi.org/10.1109/CEC.2015.7257153
Fischler MA, Bolles RC (1981) Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun ACM 24(6):381–395
Geiger A, Lenz P, Urtasun R (2012) Are we ready for autonomous driving? The kitti vision benchmark suite. In: 2012 IEEE Conference on computer vision and pattern recognition, pp 3354–3361
Gelfand N, Mitra NJ, Guibas LJ, Pottmann H (2005) Robust global registration. In: Symposium on geometry processing, vol 2, no 3, p 5
Glorot X, Bordes A, Bengio Y (2011) Deep sparse rectifier neural networks. In: Proceedings of the fourteenth international conference on artificial intelligence and statistics. JMLR Workshop and Conference Proceedings, pp 315–323
Graham B, Engelcke M, Maaten LVD (2018) 3D semantic segmentation with submanifold sparse convolutional networks. In: 2018 IEEE/CVF Conference on computer vision and pattern recognition, pp 9224–9232
Graham B, Engelcke M, Van Der Maaten L (2018) 3d semantic segmentation with submanifold sparse convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 9224–9232
Graham B, van der Maaten L (2017) Submanifold sparse convolutional networks. arXiv preprint arxiv:1706.01307
Guo R, Zou C, Hoiem D (2015) Predicting complete 3d models of indoor scenes. arxiv:1504.02437
Gupta S, Arbeláez P, Girshick R, Malik J (2015) Aligning 3d models to rgb-d images of cluttered scenes. In: 2015 IEEE Conference on computer vision and pattern recognition (CVPR), pp 4731–4740
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 770–778
Hu Q et al (2021) Learning semantic segmentation of large-scale point clouds with random sampling. In: IEEE transactions on pattern analysis and machine intelligence. https://doi.org/10.1109/TPAMI.2021.3083288
Ishimtsev V, Bokhovkin A, Artemov A, Ignatyev S, Niessner M, Zorin D, Burnaev E (2020) Cad-deform: deformable fitting of cad models to 3d scans. arXiv preprint arXiv:2007.11965
Kang Z, Li Z (2015) Primitive fitting based on the efficient multibaysac algorithm. PLoS One 10(3):e0117341
Katz S, Tal A (2015) On the visibility of point clouds. In: 2015 IEEE International Conference on computer vision (ICCV), pp 1350–1358
Khan S, Naseer M, Hayat M, Zamir SW, Khan FS, Shah M (2021) Transformers in vision: A survey. arXiv preprint arXiv:2101.01169
Kim H, Yeo C, Lee ID, Mun D (2020) Deep-learning-based retrieval of piping component catalogs for plant 3d cad model reconstruction. Comput Ind 123:103320
Kundu A, Yin X, Fathi A, Ross D, Brewington B, Funkhouser T, Pantofaru C (2020) Virtual multi-view fusion for 3d semantic segmentation. In: European Conference on computer vision. Springer, pp 518–535
Li D, Shen X, Yu Y, Guan H, Wang H, Li D (2020) Ggm-net: graph geometric moments convolution neural network for point cloud shape classification. IEEE Access 8:124989–124998
Li Y, Wu X, Chrysanthou Y, Sharf A, Cohen-Or D, Mitra NJ (2011) Globfit: consistently fitting primitives by discovering global relations. ACM Trans Graph 30(4):52:1-52:12
Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, Lin S, Guo B (2021) Swin transformer: hierarchical vision transformer using shifted windows. arXiv preprint arXiv:2103.14030
Lu Y (2017) Industry 4.0: a survey on technologies, applications and open research issues. J Ind Inf Integr 6:1–10
Mo K, Zhu S, Chang AX, Yi L, Tripathi S, Guibas LJ, Su H (2019) PartNet: a large-scale benchmark for fine-grained and hierarchical part-level 3D object understanding. In: 2019 IEEE/CVF Conference on computer vision and pattern recognition (CVPR), pp 909–918
Montlahuc J, Shah GA, Polette A, Pernot J-P (2019) As-scanned point clouds generation for virtual reverse engineering of CAD assembly models. Comput-Aided Des Appl 16(6):1171–1182
Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L et al (2019) Pytorch: an imperative style, high-performance deep learning library. Adv Neural Inf Process Syst 32:8026–8037
Qi CR, Su H, Nießner M, Dai A, Yan M, Guibas LJ (2016) Volumetric and multi-view cnns for object classification on 3d data. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 5648–5656
Qi CR, Yi L, Su H, Guibas LJ (2017) PointNet++: deep hierarchical feature learning on point sets in a metric space. In: Proc. of the 31st Int. Conf. on neural information processing systems, NIPS’17, pp 5105–5114
Robertson C, Fisher RB, Werghi N, Ashbrook AP (2000) Fitting of constrained feature models to poor 3D data. In: Parmee IC (ed) Evolutionary design and manufacture. Springer, London, pp 149–160
Ronneberger O, Fischer P, Brox T (2015) U-Net: convolutional networks for biomedical image segmentation. In: Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Cham, 2015. Springer International Publishing, pp 234–241
Saxena A, Prasad M, Gupta A, Bharill N, Patel OP, Tiwari A, Er MJ, Ding W, Lin C-T (2017) A review of clustering techniques and developments. Neurocomputing 267:664–681
Schnabel R, Wahl R, Klein R (2007) Efficient ransac for point-cloud shape detection. Comput Graphics Forum 26(2):214–226
Sener O, Koltun V (2018) Multi-task learning as multi-objective optimization. Advances in neural information processing systems, vol 31, pp 525–536
Shah GA, Polette A, Pernot JP, Giannini F, Monti M (2021) Simulated annealing-based fitting of CAD models to point clouds of mechanical parts’ assemblies. Eng Comput 37(4):2891–2909
Shi W, Rajkumar R (2020) Point-gnn: graph neural network for 3d object detection in a point cloud. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, pp 1711–1719
Su H, Maji S, Kalogerakis E, Learned-Miller E (2015) Multi-view convolutional neural networks for 3d shape recognition. In: Proceedings of the IEEE International Conference on computer vision, pp 945–953
Tang H, Liu Z, Zhao S, Lin Y, Lin J, Wang H, Han S (2020) Searching efficient 3D architectures with sparse point-voxel convolution. In: European Conference on computer vision (ECCV), pp 685–702
Thomas H, Qi CR, Deschaud J, Marcotegui B, Goulette F, Guibas L (2019) KPConv: flexible and deformable convolution for point clouds. In: IEEE Int. Conf. on computer vision (ICCV), pp 6410–6419
Ulyanov D, Vedaldi A, Lempitsky V (2016) Instance normalization: the missing ingredient for fast stylization. arXiv preprint arXiv:1607.08022
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, vol 30, pp 5998–6008
Wang C, Samari B, Siddiqi K (2018) Local spectral graph convolution for point set feature learning. In: Proceedings of the European Conference on computer vision (ECCV), pp 52–66
Wang L, Huang Y, Hou Y, Zhang S, Shan J (2019) Graph attention convolution for point cloud semantic segmentation. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, pp 10296–10305
Wang S, Suo S, Ma W-C, Pokrovsky A, Urtasun R (2018) Deep parametric continuous convolutional neural networks. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 2589–2597
Willis KD, Pu Y, Luo J, Chu H, Du T, Lambourne JG, Solar-Lezama A, Matusik W (2021) Fusion 360 gallery: a dataset and environment for programmatic cad construction from human design sequences. ACM Trans Graph (TOG) 40(4):1–24
Wu S, Wu T, Lin F, Tian S, Guo G (2021) Fully transformer networks for semantic image segmentation. arXiv preprint arXiv:2106.04108
Wu W, Qi Z, Fuxin L (2019) Pointconv: deep convolutional networks on 3d point clouds. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, pp 9621–9630
Xie Y, Tian J, Zhu XX (2020) Linking points with labels in 3d: a review of point cloud semantic segmentation. IEEE Geosci Remote Sens Mag 8(4):38–59
Xu Y, Fan T, Xu M, Zeng L, Qiao Y (2018) Spidercnn: deep learning on point sets with parameterized convolutional filters. In: Proceedings of the European conference on computer vision (ECCV), pp 87–102
Xu Y, Fan T, Xu M, Zeng L, Qiao Y (2018) Spidercnn: deep learning on point sets with parameterized convolutional filters. In: Proceedings of the European conference on computer vision (ECCV), pp 87–102
Yi L, Kim VG, Ceylan D, Shen IC, Yan M, Su H et al (2016) A scalable active framework for region annotation in 3d shape collections. ACM Trans Graph (ToG) 35(6):1–12
Zhao H, Jiang L, Fu C-W, Jia J (2019) Pointweb: enhancing local neighborhood features for point cloud processing. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition, pp 5565–5573
Zhu B, Jiang Z, Zhou X, Li Z, Yu G (2019) Class-balanced grouping and sampling for point cloud 3d object detection. arXiv preprint arXiv:1908.09492
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Hu, S., Polette, A. & Pernot, JP. SMA-Net: Deep learning-based identification and fitting of CAD models from point clouds. Engineering with Computers 38, 5467–5488 (2022). https://doi.org/10.1007/s00366-022-01648-z
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00366-022-01648-z