3D Recognition: State of the Art and Trends

  • SURVEY ARTICLES
Automation and Remote Control

An Erratum to this article was published on 01 July 2022

This article has been updated

Abstract

We consider the field of three-dimensional technical vision and, in particular, three-dimensional recognition. We identify the main problems of three-dimensional vision and review methods for obtaining and representing three-dimensional data, as well as applications of three-dimensional vision. Deep learning methods for 3D recognition problems are surveyed, and the main current trends in the field are identified. To date, many neural network architectures, convolutional layers, operations for sampling, pooling, and aggregation, and methods for representing and processing three-dimensional input data have been proposed. The field is under active development, with the greatest variety of methods proposed for point clouds.

Funding

This work was supported by the Russian Foundation for Basic Research, project no. 20-37-90039.

Author information

Correspondence to S. R. Orlova or A. V. Lopata.

Additional information

Translated by V. Potapchouck

Cite this article

Orlova, S.R., Lopata, A.V. 3D Recognition: State of the Art and Trends. Autom Remote Control 83, 503–519 (2022). https://doi.org/10.1134/S0005117922040014

  • DOI: https://doi.org/10.1134/S0005117922040014
