DCARN: Deep Context Aware Recurrent Neural Network for Semantic Segmentation of Large Scale Unstructured 3D Point Cloud

Neural Processing Letters

Abstract

Semantic segmentation of large unstructured 3D point clouds is an important problem for 3D object recognition, which in turn is essential for solving more complex tasks such as scene understanding. The problem is highly challenging owing to the large scale of the data, varying point density and localization errors of the 3D points. Nevertheless, following the recent successes of deep neural network architectures in solving complex 2D perceptual problems, several researchers have sought to transfer 2D networks to 3D point cloud segmentation by means of a prior voxelization step that provides an explicit neighborhood representation. However, such a 3D grid representation loses fine details and inherent structure due to quantization artifacts. To address this, the paper proposes an approach to semantic segmentation of 3D point clouds that exploits super-point based graph construction. The proposed architecture is composed of two cascaded modules: a light-weight representation learning module, which uses unsupervised geometric grouping to partition the large-scale unstructured 3D point cloud, and a deep context aware sequential network based on long short-term memory units and graph convolutions with embedding residual learning for semantic segmentation. The proposed model is evaluated on two standard benchmark datasets and achieves performance competitive with existing state-of-the-art methods. The code and the obtained results have been made public at https://github.com/saba155/DCARN.
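
The quantization artifact mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper or its repository: the function names `voxelize` and `voxel_centers`, the toy points and the voxel size of 0.1 are all assumptions chosen for illustration. The sketch shows how a regular grid collapses several distinct nearby points into one cell, so their relative geometry is no longer recoverable from the grid alone.

```python
from collections import defaultdict
from math import floor


def voxelize(points, voxel_size):
    """Group 3D points by the integer index of the voxel that contains them."""
    grid = defaultdict(list)
    for x, y, z in points:
        key = (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))
        grid[key].append((x, y, z))
    return dict(grid)


def voxel_centers(grid, voxel_size):
    """Map every occupied voxel index to the coordinates of its center."""
    return {key: tuple((i + 0.5) * voxel_size for i in key) for key in grid}


# Hypothetical toy cloud: three nearby points and one distant point.
points = [
    (0.02, 0.01, 0.00),
    (0.07, 0.03, 0.01),
    (0.09, 0.08, 0.04),
    (0.52, 0.51, 0.50),
]

grid = voxelize(points, voxel_size=0.1)
centers = voxel_centers(grid, voxel_size=0.1)

# The three nearby points share one voxel, so the grid keeps a single
# center for them and their individual positions are lost.
print(len(points), "points ->", len(grid), "occupied voxels")
```

A smaller voxel size reduces this loss but increases the number of cells that must be stored and processed, which is one reason approaches that operate on the points or on groups of points directly are of interest.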

References

  1. Wenzel K, Rothermel M, Fritsch D, Haala N (2013) Image acquisition and model selection for multi-view stereo. Int Arch Photogramm Remote Sens Spatial Inf Sci 40(5W):251–258

  2. Rottensteiner F, Trinder J, Clode S (2005) Data acquisition for 3d city models from lidar extracting buildings and roads. In: Proceedings 2005 IEEE international geoscience and remote sensing symposium, 2005. IGARSS’05, vol 1, pp 4

  3. Buckley SJ, Howell JA, Enge HD, Kurz TH (2008) Terrestrial laser scanning in geology: data acquisition, processing and accuracy considerations. J Geol Soc 165(3):625–638

  4. Shahzad M, Zhu XX (2015) Automatic detection and reconstruction of 2-d/3-d building shapes from spaceborne tomosar point clouds. IEEE Trans Geosci Remote Sens 54(3):1292–1310

  5. Shahzad M, Zhu XX (2014) Robust reconstruction of building facades for large areas using spaceborne tomosar point clouds. IEEE Trans Geosci Remote Sens 53(2):752–769

  6. Wolf D, Prankl J, Vincze M (2015) Fast semantic segmentation of 3d point clouds using a dense crf with learned parameters. In: 2015 IEEE International conference on robotics and automation (ICRA), pp 4867–4873

  7. Guinard S, Landrieu L (2017) Weakly supervised segmentation-aided classification of urban scenes from 3d lidar point clouds. In: International archives of the photogrammetry, remote sensing and spatial information sciences, vol XLII-1/W1

  8. Hu H, Munoz D, Bagnell JA, Hebert M (2013) Efficient 3-d scene analysis from streaming data. In: 2013 IEEE international conference on robotics and automation, pp 2297–2304

  9. Weinmann M, Hinz S, Weinmann M (2017) A hybrid semantic point cloud classification-segmentation framework based on geometric features and semantic rules. PFG-J Photogramm Remote Sens Geoinf Sci 85(3):183–194

  10. Liu Y, Xu C, Chen Z, Chen C, Zhao H, Jin X (2020) Deep dual-stream network with scale context selection attention module for semantic segmentation. Neural Process Lett 51:2281–2299. https://doi.org/10.1007/s11063-019-10148-z

  11. Zhang R, Candra SA, Vetter K, Zakhor A (2015) Sensor fusion for semantic segmentation of urban scenes. In: 2015 IEEE international conference on robotics and automation (ICRA), pp 1850–1857

  12. Becker C, Häni N, Rosinskaya E, d’Angelo E, Strecha C (2017) Classification of aerial photogrammetric 3d point clouds. arXiv preprint arXiv:1705.08374

  13. Hackel T, Wegner JD, Schindler K (2016) Fast semantic segmentation of 3d point clouds with strongly varying density. ISPRS Ann Photogramm Remote Sens Spatial Inf Sci 3(3):177–184

  14. Munoz D, Vandapel N, Hebert M (2008) Directional associative markov network for 3-d point cloud classification. In: Proceedings of 3DPVT’08 - the fourth international symposium on 3D data processing, visualization and transmission, June 18–20. Georgia Institute of Technology, Atlanta, GA, USA

  15. Vosselman G (2013) Point cloud segmentation for urban scene classification. Int Arch Photogramm Remote Sens Spatial Inf Sci 1:257–262

  16. Maturana D, Scherer S (2015) Voxnet: A 3d convolutional neural network for real-time object recognition. In: 2015 IEEE/RSJ international conference on intelligent robots and systems (IROS), pp 922–928

  17. Tchapmi L, Choy C, Armeni I, Gwak J, Savarese S (2017) Segcloud: semantic segmentation of 3d point clouds. In: 2017 international conference on 3D vision (3DV), pp 537–547

  18. Yang Yu, Qingbiao W, Khan Y, Chen M (2015) An adaptive variation model for point cloud normal computation. Neural Comput Appl 26(6):1451–1460

  19. Li C, Bing L, Zhang Y, Liu H, Yanyun Q (2018) 3d reconstruction of indoor scenes via image registration. Neural Process Lett 48(3):1281–1304

  20. Qi CR, Su H, Mo K, Guibas LJ (2017) Pointnet: deep learning on point sets for 3d classification and segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 652–660

  21. Hua B-S, Tran M-K, Yeung S-K (2018) Pointwise convolutional neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 984–993

  22. Huang Q, Wang W, Neumann U (2018) Recurrent slice networks for 3d segmentation of point clouds. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2626–2635

  23. Qi CR, Yi L, Su H, Guibas LJ (2017) PointNet++: deep hierarchical feature learning on point sets in a metric space. In: 31st conference on neural information processing systems (NIPS 2017). Long Beach, CA, USA

  24. Wang W, Yu R, Huang Q, Neumann U (2018) Sgpn: similarity group proposal network for 3d point cloud instance segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2569–2578

  25. Yi L, Zhao W, Wang H, Sung M, Guibas LJ (2019) Gspn: generative shape proposal network for 3d instance segmentation in point cloud. In: The IEEE conference on computer vision and pattern recognition (CVPR), June

  26. Shi S, Wang X, Li H (2019) Pointrcnn: 3d object proposal generation and detection from point cloud. In: The IEEE conference on computer vision and pattern recognition (CVPR), June

  27. Wang X, Liu S, Shen X, Shen C, Jia J (2019) Associatively segmenting instances and semantics in point clouds. In: The IEEE conference on computer vision and pattern recognition (CVPR), June

  28. Chen C, Li G, Xu R, Chen T, Wang M, Lin L (2019) Clusternet: deep hierarchical cluster network with rigorously rotation-invariant representation for point cloud analysis. In: The IEEE conference on computer vision and pattern recognition (CVPR), June

  29. Pham Q-H, Nguyen T, Hua B-S, Roig G, Yeung S-K (2019) Jsis3d: joint semantic-instance segmentation of 3d point clouds with multi-task pointwise networks and multi-value conditional random fields. In: The IEEE conference on computer vision and pattern recognition (CVPR), June

  30. Lang AH, Vora S, Caesar H, Zhou L, Yang J, Beijbom O (2019) Pointpillars: fast encoders for object detection from point clouds. In: The IEEE conference on computer vision and pattern recognition (CVPR), June

  31. Landrieu L, Simonovsky M (2018) Large-scale point cloud semantic segmentation with superpoint graphs. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4558–4567

  32. Wang Y, Ji R, Chang S-F (2013) Label propagation from imagenet to 3d point clouds. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3135–3142

  33. Tian Z, Liu L, Zhang Z, Fei B (2015) Superpixel-based segmentation for 3d prostate mr images. IEEE Trans Med Imag 35(3):791–801

  34. Xu J, Ishikawa H, Wollstein G, Schuman JS (2011) 3d optical coherence tomography super pixel with machine classifier analysis for glaucoma detection. In: 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp 3395–3398

  35. Yang H, Zhang H (2016) Efficient 3d room shape recovery from a single panorama. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 5422–5430

  36. Landrieu L, Boussaha M (2019) Point cloud oversegmentation with graph-structured deep metric learning. In: The IEEE conference on computer vision and pattern recognition (CVPR) June

  37. Achanta R, Shaji A, Smith K, Lucchi A, Fua P, Süsstrunk S (2012) Slic superpixels compared to state-of-the-art superpixel methods. IEEE Trans Pattern Anal Mach Intel 34(11):2274–2282

  38. Aubry M, Schlickewei U, Cremers D (2011) The wave kernel signature: a quantum mechanical approach to shape analysis. In: 2011 IEEE international conference on computer vision workshops (ICCV workshops), pp 1626–1633

  39. Sun J, Ovsjanikov M, Guibas L (2009) A concise and provably informative multi-scale signature based on heat diffusion. Comput Graph Forum 28:1383–1392

  40. Bronstein MM, Kokkinos I (2010) Scale-invariant heat kernel signatures for non-rigid shape recognition. In: 2010 IEEE computer society conference on computer vision and pattern recognition, pp 1704–1711

  41. Rusu RB, Blodow N, Marton ZC, Beetz M (2008) Aligning point cloud views using persistent feature histograms. In: 2008 IEEE/RSJ international conference on intelligent robots and systems, pp 3384–3391

  42. Rusu RB, Blodow N, Beetz M (2009) Fast point feature histograms (fpfh) for 3d registration. In: 2009 IEEE international conference on robotics and automation, pp 3212–3217

  43. Ling H, Jacobs DW (2007) Shape classification using the inner-distance. IEEE Trans Pattern Anal Mach Intel 29(2):286–299

  44. Chen D-Y, Tian X-P, Shen Y-T, Ouhyoung M (2003) On visual similarity based 3d model retrieval. Comput Graph Forum 22:223–232

  45. Johnson AE, Hebert M (1999) Using spin images for efficient object recognition in cluttered 3d scenes. IEEE Trans Pattern Anal Mach Intel 21(5):433–449

  46. Yu J, Tan M, Zhang H, Tao D, Rui Y (2019) Hierarchical deep click feature prediction for fine-grained image recognition. In: IEEE transactions on pattern analysis and machine intelligence, early access 30 July 2019. https://doi.org/10.1109/TPAMI.2019.2932058

  47. Yu J, Tao D, Wang M, Rui Y (2014) Learning to rank using user clicks and visual features for image retrieval. IEEE Trans Cybern 45(4):767–779

  48. Hong C, Yu J, Wan J, Tao D, Wang M (2015) Multimodal deep autoencoder for human pose recovery. IEEE Trans Image Process 24(12):5659–5670

  49. Yu J, Rui Y, Tao D (2014) Click prediction for web image reranking using multimodal sparse coding. IEEE Trans Image Process 23(5):2019–2032

  50. Engelmann F, Kontogianni T, Hermans A, Leibe B (2017) Exploring spatial context for 3d semantic segmentation of point clouds. In: Proceedings of the IEEE international conference on computer vision, pp 716–724

  51. Arshad S, Shahzad M, Riaz Q, Fraz MM (2019) Dprnet: deep 3d point based residual network for semantic segmentation and classification of 3d point clouds. IEEE Access 7:68892–68904

  52. Simonovsky M, Komodakis N (2017) Dynamic edge-conditioned filters in convolutional neural networks on graphs. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3693–3702

  53. Newell ME (1972) A new approach to the shaded picture problem. In: Proceedings of the ACM National Conference

  54. Golub GH, Van Loan CF (1983) Matrix computations. Johns Hopkins University Press, Baltimore

  55. Fukunaga K, Hostetler L (1975) The estimation of the gradient of a density function, with applications in pattern recognition. IEEE Trans Inf Theory 21(1):32–40

  56. Cheng Y (1995) Mean shift, mode seeking, and clustering. IEEE Trans Pattern Anal Mach Intel 17(8):790–799

  57. Ester M, Kriegel H-P, Sander J, Xu X et al (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. KDD 96:226–231

  58. Campello RJ, Moulavi D, Sander J (2013) Density-based clustering based on hierarchical density estimates. In: Pacific-Asia conference on knowledge discovery and data mining, pp 160–172

  59. Demantké J, Mallet C, David N, Vallet B (2011) Dimensionality based scale selection in 3D lidar point clouds. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol 38-5/W12. Calgary, Canada, pp 97–102

  60. Landrieu L, Obozinski G (2017) Cut pursuit: Fast algorithms to learn piecewise constant functions on general weighted graphs. SIAM J Imag Sci 10(4):1724–1766

  61. Jaderberg M, Simonyan K, Zisserman A, et al. (2015) Spatial transformer networks. Adv Neural Information Processing Systems, pp 2017–2025

  62. Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167

  63. Liang X, Lin L, Shen X, Feng J, Yan S, Xing EP (2017) Interpretable structure-evolving lstm. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1010–1019

  64. Yue B, Fu J, Liang J (2018) Residual recurrent neural networks for learning sequential representations. Information 9(3):56

  65. Wang X, Liu S, Shen X, Shen C, Jia J (2019) Associatively segmenting instances and semantics in point clouds. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4096–4105

  66. Armeni I, Sener O, Zamir AR, Jiang H, Brilakis I, Fischer M, Savarese S (2016) 3d semantic parsing of large-scale indoor spaces. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1534–1543

  67. Lawin FJ, Danelljan M, Tosteberg P, Bhat G, Khan FS, Felsberg M (2017) Deep projective 3d semantic segmentation. In: International conference on computer analysis of images and patterns, pp 95–107

  68. Boulch A, Le Saux B, Audebert N (2017) Unstructured point cloud semantic labeling using deep segmentation networks. 3DOR 2:7

  69. Contreras J, Denzler J (2019) Edge-convolution point net for semantic segmentation of large-scale point clouds. In: IGARSS 2019-2019 IEEE international geoscience and remote sensing symposium, pp 5236–5239

  70. Wang F, Zhuang Y, Gu H, Hu H (2019) Octreenet: a novel sparse 3-d convolutional neural network for real-time 3-d outdoor scene analysis. IEEE Trans Autom Sci Eng 17(2):735–747

  71. Thomas H, Goulette F, Deschaud J-E, Marcotegui B, LeGall Y (2018) Semantic classification of 3d point clouds with multiscale spherical neighborhoods. In: 2018 International conference on 3D vision (3DV), pp 390–398

  72. Hackel T, Savinov N, Ladicky L, Wegner JD, Schindler K, Pollefeys M (2017) Semantic3d.net: a new large-scale point cloud classification benchmark. arXiv preprint arXiv:1704.03847

  73. Xie J, Girshick R, Farhadi A (2016) Unsupervised deep embedding for clustering analysis. In: International conference on machine learning, pp 478–487

  74. Hoppe H, DeRose T, Duchamp T, McDonald J, Stuetzle W (1992) Surface reconstruction from unorganized points. In: Proceedings of the 19th annual conference on Computer graphics and interactive techniques, pp 71–78

  75. Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4700–4708

Author information

Corresponding author

Correspondence to Muhammad Moazam Fraz.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Fig. 7

Example visualizations on the S3DIS [66] dataset

Fig. 8

Example visualizations on the S3DIS [66] dataset

We present further visualizations of segmentation results on the S3DIS dataset in Figs. 7 and 8, along with the input point clouds, geometric grouping and ground truth. Ceilings are hidden for a clearer view of the point clouds. Points belonging to different classes are color coded: floor in blue, column in blue green, wall in brown, window in light green, beam in salmon, table in dark green, chair in dark blue, sofa in purple, board in grey, bookcase in red and clutter in light grey.

About this article

Cite this article

Mehmood, S., Shahzad, M. & Fraz, M.M. DCARN: Deep Context Aware Recurrent Neural Network for Semantic Segmentation of Large Scale Unstructured 3D Point Cloud. Neural Process Lett 55, 881–904 (2023). https://doi.org/10.1007/s11063-020-10368-8

  • DOI: https://doi.org/10.1007/s11063-020-10368-8
