SMA-Net: Deep learning-based identification and fitting of CAD models from point clouds

Abstract

Identification and fitting are important tasks in reverse engineering and virtual/augmented reality. Compared to traditional approaches, deep learning-based methods for such tasks still leave much room for exploration. This paper presents SMA-Net (Spatial Merge Attention Network), a novel deep learning-based, end-to-end, bottom-up architecture specifically focused on fast identification and fitting of CAD models from point clouds. The network is composed of three parts, whose strengths are clearly highlighted: a voxel-based multi-resolution feature extractor, a spatial merge attention mechanism and a multi-task head. It is trained with both virtually generated point clouds and as-scanned ones created from multiple instances of CAD models, themselves obtained with randomly generated parameter values. Using this data generation pipeline, the proposed approach is validated on two different data sets that have been made publicly available: a robot data set for Industry 4.0 applications and a furniture data set for virtual/augmented reality. Experiments show that this reconstruction strategy achieves compelling and accurate results at very high speed, and that it is very robust on real data obtained, for instance, by laser scanner and Kinect.
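
To make the notion of a voxel-based multi-resolution representation more concrete, the following is a minimal sketch and not the SMA-Net implementation. It only shows how one point cloud can be grouped into voxels at several grid sizes. The function names `voxelize` and `multi_resolution`, the voxel sizes and the centroid pooling are assumptions chosen for illustration; the feature extractor described in the paper is a learned network whose details are not given in this abstract.

```python
from collections import defaultdict
from math import floor


def voxelize(points, voxel_size):
    """Group 3D points by voxel index; return one centroid per occupied voxel."""
    buckets = defaultdict(list)
    for x, y, z in points:
        # Integer grid coordinates of the voxel that contains the point.
        key = (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))
        buckets[key].append((x, y, z))
    # Centroid pooling is an illustrative choice, not the paper's method.
    return {
        key: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for key, pts in buckets.items()
    }


def multi_resolution(points, voxel_sizes=(0.5, 1.0, 2.0)):
    """Voxelize the same cloud at several (hypothetical) grid sizes, fine to coarse."""
    return {size: voxelize(points, size) for size in voxel_sizes}


# Tiny hand-made cloud, purely for demonstration.
cloud = [(0.1, 0.1, 0.1), (0.4, 0.2, 0.3), (0.9, 0.9, 0.9), (1.6, 0.2, 0.1)]
pyramid = multi_resolution(cloud)
for size, voxels in pyramid.items():
    print(f"voxel size {size}: {len(voxels)} occupied voxels")
```

Coarser grids merge more points into each voxel, so the number of occupied voxels shrinks as the voxel size grows. That is the basic trade-off a multi-resolution extractor exploits between local detail and global context.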

Author information

Corresponding author

Correspondence to Jean-Philippe Pernot.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix 1

See Tables 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 and 16.

Table 5. Parameter name and index relationship of the chairs in the furniture dataset
Table 6. Parameter name and index relationship of the tables in the furniture dataset
Table 7. Result of Chair_1 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 5
Table 8. Result of Chair_2 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 5
Table 9. Result of Chair_3 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 5
Table 10. Result of Chair_4 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 5
Table 11. Result of Table_1 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 6
Table 12. Result of Table_2 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 6
Table 13. Result of Table_3 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 6
Table 14. Result of Table_4 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 6
Table 15. Result of Table_5 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 6
Table 16. Result of Table_6 in the furniture test dataset (\(N_\mathrm{test}^\mathrm{Furniture} = 3000\)), with the name of each parameter \(p_i\) provided in Table 6

About this article

Cite this article

Hu, S., Polette, A. & Pernot, JP. SMA-Net: Deep learning-based identification and fitting of CAD models from point clouds. Engineering with Computers (2022). https://doi.org/10.1007/s00366-022-01648-z

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s00366-022-01648-z

Keywords

  • Identification
  • Fitting
  • Deep learning
  • Transformer
  • Data generation
  • Virtual reality
  • Reverse engineering