Semantic segmentation-assisted instance feature fusion for multi-level 3D part instance segmentation

Sun, Chun-Yu; Tong, Xin; Liu, Yang

doi:10.1007/s41095-022-0300-x

Research Article
Open access
Published: 30 June 2023

Volume 9, pages 699–715, (2023)

Computational Visual Media

Abstract

Recognizing 3D part instances from a 3D point cloud is crucial for 3D structure and scene understanding. Several learning-based approaches use semantic segmentation and instance center prediction as training tasks but fail to further exploit the inherent relationship between shape semantics and part instances. In this paper, we present a new method for 3D part instance segmentation. Our method exploits semantic segmentation to fuse nonlocal instance features, such as center prediction, and further enhances the fusion scheme in a multi- and cross-level way. We also propose a semantic region center prediction task for training and leverage its predictions to improve the clustering of instance points. Our method outperforms existing methods by a large margin on the PartNet benchmark. We also demonstrate that our feature fusion scheme can be applied to other existing methods to improve their performance in indoor scene instance segmentation tasks.
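
The abstract does not state the fusion formula, so the following is only a minimal sketch of one plausible reading of "exploiting semantic segmentation to fuse nonlocal instance features". It assumes per-point instance features (for example, predicted center offsets) are pooled over soft semantic regions and the pooled result is concatenated back onto each point. The function name `fuse_instance_features`, the arguments `inst_feat` and `sem_prob`, and the probability-weighted averaging are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fuse_instance_features(inst_feat, sem_prob):
    """Hypothetical sketch: pool instance features over soft semantic regions.

    inst_feat: (N, C) per-point instance features (assumed input).
    sem_prob:  (N, K) per-point semantic probabilities, rows summing to 1.
    Returns an (N, 2C) array: each point's own feature concatenated with a
    semantic-weighted average of the features of all points.
    """
    inst_feat = np.asarray(inst_feat, dtype=float)
    sem_prob = np.asarray(sem_prob, dtype=float)

    # Per-class mean feature, weighted by each point's class probability: (K, C).
    class_mass = sem_prob.sum(axis=0)[:, None]
    class_feat = (sem_prob.T @ inst_feat) / np.maximum(class_mass, 1e-8)

    # Send the class-level (nonlocal) features back to the points: (N, C).
    nonlocal_feat = sem_prob @ class_feat

    # Keep the local feature and append the nonlocal one.
    return np.concatenate([inst_feat, nonlocal_feat], axis=1)


# Toy input: two points in semantic class 0, one point in class 1.
features = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 5.0]])
probs = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
fused = fuse_instance_features(features, probs)
```

With hard (one-hot) semantic predictions, as in the toy input, the appended half of each row is simply the mean feature of the points sharing that semantic label. Soft probabilities blend the class means, which is one way such a scheme could tolerate uncertain semantic predictions.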

Availability of data and materials

PartNet, ScanNet, and S3DIS are all publicly released datasets.

Author information

Authors and Affiliations

Institute for Advanced Study, Tsinghua University, Beijing, 100084, China
Chun-Yu Sun
Microsoft Research Asia, Beijing, 100080, China
Xin Tong & Yang Liu

Contributions

Chun-Yu Sun proposed and implemented the key idea, conducted the main experiments, and contributed to paper writing. Xin Tong supervised the findings of this work and verified the key concept. Yang Liu led the project and contributed to the key concept, experimental design, and paper writing.

Corresponding author

Correspondence to Yang Liu.

Ethics declarations

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Chun-Yu Sun received his bachelor's degree in computer science and technology from Xidian University in 2015. He is currently a Ph.D. student at the Institute for Advanced Study, Tsinghua University. His research interests include computer graphics and 3D vision.

Xin Tong is a principal research manager with Microsoft Research Asia, where he leads the Internet Graphics Group. He received his Ph.D. degree from Tsinghua University in 1999. His research interests include computer graphics and computer vision, including texture synthesis, appearance modeling, light transport simulation and acquisition, 3D facial animation, and data-driven geometric processing. He was on the editorial boards of IEEE Transactions on Visualization and Computer Graphics, ACM Transactions on Graphics, and Computer Graphics Forum.

Yang Liu is a principal researcher at Microsoft Research Asia. He received his Ph.D. degree from The University of Hong Kong in 2008, and his master's and bachelor's degrees in computational mathematics from the University of Science and Technology of China in 2003 and 2000, respectively. His recent research focuses on geometry processing and 3D learning. He is on the editorial boards of IEEE Transactions on Visualization and Computer Graphics and ACM Transactions on Graphics.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095.

To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.

About this article

Cite this article

Sun, CY., Tong, X. & Liu, Y. Semantic segmentation-assisted instance feature fusion for multi-level 3D part instance segmentation. Comp. Visual Media 9, 699–715 (2023). https://doi.org/10.1007/s41095-022-0300-x

Received: 08 April 2022
Accepted: 03 June 2022
Published: 30 June 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s41095-022-0300-x