Skip to main content
Log in

Semantic segmentation of large-scale point clouds with neighborhood uncertainty

  • Published:
Multimedia Tools and Applications Aims and scope Submit manuscript

Abstract

Large-scale point cloud segmentation is one of the important research directions in the field of computer vision, aiming at segmenting 3D point cloud data into parts with semantic meaning, which is widely used in the fields of robot perception, automated driving, and virtual reality. In practical applications, intelligences often face various uncertainties such as sensor noise, missing data, and uncertain model parameter estimation. However, many current research works do not consider the effects of these uncertainties, which can cause the model to overfit the noisy data and thus affect the model performance. In this paper, we propose a point cloud segmentation method with domain uncertainty that can greatly improve the robustness of the model to noise. Specifically, we first compute the neighborhood uncertainty, which is more reflective of the semantics of a local region than the prediction of a single point, which will reduce the impact of noise. Next, we fuse the uncertainty into the objective function, which allows the model to focus more on relatively deterministic data. Finally, we validate on the large-scale datasets S3DIS and Toronto3D, and the segmentation performance is substantially improved in both cases.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5

Similar content being viewed by others

Data Availability

The S3DIS dataset used in the article can be obtained at http://buildingparser.stanford.edu/dataset.html. The Toronto3D dataset used in the article can be obtained at https://opendatalab.org.cn/Toronto-3D/download. The Semantic3D dataset used in the article can be obtained at http://www.semantic3d.net/view_dbase.php?chl=1. The Scannet-v2 dataset used in the article can be obtained at https://github.com/ScanNet/ScanNet.

References

  1. Chen X, Ma H, Wan J, Li B, Xia T (2016) Multi-view 3d object detection network for autonomous driving. 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 6526–6534

  2. Blanc T, Beheiry ME, Caporal C, Masson J-B, Hajj B (2020) Genuage: visualize and analyze multidimensional single-molecule point cloud data in virtual reality. Nature Methods 1–3

  3. Mildenhall B, Srinivasan PP, Tancik M, Barron JT, Ramamoorthi R, Ng R (2020) Nerf: representing scenes as neural radiance fields for view synthesis. arXiv:2003.08934

  4. Qi C, Su H, Mo K, Guibas LJ (2016) Pointnet: deep learning on point sets for 3d classification and segmentation. 2017 IEEE conference on computer vision and pattern recognition (CVPR), pp 77–85

  5. Qi C, Yi L, Su H, Guibas LJ (2017) Pointnet++: deep hierarchical feature learning on point sets in a metric space. In: NIPS

  6. Hu Q, Yang B, Xie L, Rosa S, Guo Y, Wang Z, Trigoni A, Markham A (2019) Randla-net: efficient semantic segmentation of large-scale point clouds. 2020 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 11105–11114

  7. Wang Y, Sun Y, Liu Z, Sarma SE, Bronstein MM, Solomon JM (2018) Dynamic graph cnn for learning on point clouds. ACM Trans Graph (TOG) 38:1–12

    Google Scholar 

  8. Lin Y, Yan Z, Huang H, Du D, Liu L, Cui S, Han X (2020) Fpconv: learning local flattening for point convolution. 2020 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 4292–4301

  9. Thomas H, Qi C, Deschaud J-E, Marcotegui B, Goulette F, Guibas LJ (2019) Kpconv: flexible and deformable convolution for point clouds. 2019 IEEE/CVF international conference on computer vision (ICCV), pp 6410–6419

  10. Choy CB, Gwak J, Savarese S (2019) 4d spatio-temporal convnets: Minkowski convolutional neural networks. 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 3070–3079

  11. Graham B, Engelcke M, Maaten L (2017) 3d semantic segmentation with submanifold sparse convolutional networks. 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 9224–9232

  12. Huang S-S, Ma Z-Y, Mu T-J, Fu H, Hu S (2021) Supervoxel convolution for online 3d semantic segmentation. ACM Trans Graph (TOG) 40:1–15

    Google Scholar 

  13. Zhao H, Jiang L, Jia J, Torr PHS, Koltun V (2020) Point transformer. 2021 IEEE/CVF international conference on computer vision (ICCV), pp 16239–16248

  14. Wu X, Lao Y, Jiang L, Liu X, Zhao H (2022) Point transformer v2: grouped vector attention and partition-based pooling. arXiv:2210.05666

  15. Sirohi K, Marvi S, Büscher D, Burgard W (2022) Uncertainty-aware lidar panoptic segmentation. 2023 IEEE international conference on robotics and automation (ICRA), pp 8277–8283

  16. Liu D, Cui Y, Tan W, Chen Y (2021) Sg-net: spatial granularity network for one-stage video instance segmentation. 2021 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 9811–9820

  17. Wang W, Liang J, Liu D (2022) Learning equivariant segmentation with instance-unique querying. arXiv:2210.00911

  18. Qin Z, Lu X, Nie X, Liu D, Yin Y, Wang W (2023) Coarse-to-fine video instance segmentation with factorized conditional appearance flows. IEEE/CAA J Automatica Sinica 10:1192–1208

    Article  Google Scholar 

  19. Cui Y, Yan L, Cao Z, Liu D (2021) Tf-blender: temporal feature blender for video object detection. 2021 IEEE/CVF international conference on computer vision (ICCV), pp 8118–8127

  20. Liang J, Zhou T, Liu D, Wang, W (2023) Clustseg: clustering for universal segmentation. arXiv:2305.02187

  21. Cui Y, Liu X, Liu H, Zhang J, Zare A, Fan B (2021) Geometric attentional dynamic graph convolutional neural networks for point cloud analysis. Neurocomputing 432:300–310

    Article  Google Scholar 

  22. Xu C, Wu B, Wang Z, Zhan W, Vajda P, Keutzer K, Tomizuka M (2020) Squeezesegv3: spatially-adaptive convolution for efficient point-cloud segmentation. In: European conference on computer vision. https://api.semanticscholar.org/CorpusID:214802232

  23. Cui Y, Ruan L, Dong H, Li Q, Wu Z, Zeng T, Fan F (2023) Cloud-rain: point cloud analysis with reflectional invariance. arXiv:2305.07814

  24. Uy MA, Lee GH (2018) Pointnetvlad: deep point cloud based retrieval for large-scale place recognition. 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 4470–4479

  25. Liu D, Cui Y, Yan L, Mousas C, Yang B, Chen Y (2020) Densernet: weakly supervised visual localization using multi-scale feature aggregation. In: AAAI conference on artificial intelligence. https://api.semanticscholar.org/CorpusID:227305257

  26. Yan L, Cui Y, Chen Y, Liu D (2021) Hierarchical attention fusion for geo-localization. ICASSP 2021 - 2021 IEEE international conference on acoustics, speech and signal processing (ICASSP), pp 2220–2224

  27. Cheng Z, Choi H, Liang J, Feng S, Tao G, Liu D, Zuzak M, Zhang X (2023) Fusion is not enough: single-modal attacks to compromise fusion models in autonomous driving. arXiv:2304.14614

  28. Armeni I, Sax S, Zamir AR, Savarese S (2017) Joint 2d-3d-semantic data for indoor scene understanding

  29. Tan W, Qin N, Ma L, Li Y, Du J, Cai G, Yang K, Li J (2020) Toronto-3d: a large-scale mobile lidar dataset for semantic segmentation of urban roadways. In: 2020 IEEE/CVF conference on computer vision and pattern recognition workshops (CVPRW), pp 797–806. https://doi.org/10.1109/CVPRW50498.2020.00109

  30. Li Y, Bu R, Sun M, Chen B (2018) Pointcnn. arXiv:1801.07791

  31. Landrieu L, Simonovsky M (2017) Large-scale point cloud semantic segmentation with superpoint graphs. 2018 IEEE/CVF conference on computer vision and pattern recognition, pp 4558–4567

  32. Lei H, Akhtar N, Mian AS (2019) Spherical kernel for efficient graph convolution on 3d point clouds. IEEE Trans Patt Anal Mach Intell 43:3664–3680

    Article  Google Scholar 

  33. Zhao H, Jiang L, Fu C-W, Jia J (2019) Pointweb: enhancing local neighborhood features for point cloud processing. 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR), pp 5560–5568

  34. Hackel T, Savinov N, Ladicky L, Wegner JD, Schindler K, Pollefeys M (2017) Semantic3d.net: a new large-scale point cloud classification benchmark. ISPRS annals of photogrammetry, remote sensing and spatial information sciences

  35. Dai A, Chang AX, Savva M, Halber M, Funkhouser T, Nießner M (2017) Scannet: richly-annotated 3d reconstructions of indoor scenes. In: Proc. computer vision and pattern recognition (CVPR). IEEE

Download references

Acknowledgements

This research was funded by the Natural Science Project of China for Young and middle-aged(No.2022KY1786).

Author information

Authors and Affiliations

Authors

Contributions

Yong Bao: Conceptualisation, Methodology, Software, Validation, Formal analysis, Investigation, Data Curation, Writing original draft. Wen Haibiao: Conceptualisation, Methodology. Zhang Bao Qing: Validation, Software.

Corresponding author

Correspondence to Haibiao Wen.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflicts of interest.

Ethical standard

Data used in the present study are publicly available, and ethical approval and informed consent were obtained in each original study.

Competing interests

The authors have no relevant financial or nonfinancial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Bao, Y., Wen, H. & Zhang, B. Semantic segmentation of large-scale point clouds with neighborhood uncertainty. Multimed Tools Appl (2023). https://doi.org/10.1007/s11042-023-17814-4

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s11042-023-17814-4

Keywords

Navigation