A point cloud self-learning network based on contrastive learning for classification and segmentation

Zhou, Haoran; Wang, Wenju; Chen, Gang; Wang, Xiaolin

doi:10.1007/s00371-023-03248-4

Original article
Published: 23 January 2024

The Visual Computer (2024)

Haoran Zhou¹, Wenju Wang¹ (ORCID: orcid.org/0000-0002-8549-4710), Gang Chen¹ & Xiaolin Wang¹

Abstract

In point cloud representation learning, many self-supervised methods aim to reduce the heavy reliance of conventional supervised methods on labeled data. In recent years, contrastive learning-based methods in particular have gained increasing popularity. However, most current contrastive learning methods rely solely on conventional random augmentation, which limits the effectiveness of representation learning. Moreover, to prevent model collapse, they construct positive and negative sample pairs or explicit clustering centers, which complicates data preprocessing. To address these challenges and achieve accurate point cloud classification and segmentation, we propose PointSL, a self-learning network for point clouds based on contrastive learning. PointSL incorporates a learnable point cloud augmentation (LPA) module, which transforms samples with high precision and significantly improves the augmentation effect. To further enhance feature discrimination, PointSL introduces a self-learning process together with a refined feature predictor (FFP). This approach uses the attention mechanism to perform mutual feature prediction between pairs of point clouds, thereby continuously improving discriminative performance. In addition, the network uses a simple yet effective self-adaptive loss function that optimizes the entire network through gradient feedback. This benefits pretraining, yielding encoders with better generalization and higher accuracy. We evaluate PointSL on the ModelNet40, Sydney Urban Objects and ShapeNetPart benchmark datasets. Experimental results show that PointSL outperforms state-of-the-art self-supervised methods and supervised counterparts in classification and segmentation tasks. Notably, on the Sydney Urban Objects dataset, PointSL achieves an OA of 80.6% and an AA of 69.9%; on the ModelNet40 dataset, it achieves an OA of 94.2% and an AA of 91.4%. On the ShapeNetPart dataset, PointSL achieves an Inst.mIoU of 86.3% and a Cls.mIoU of 85.1%.
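The abstract reports OA and AA without defining them. As a minimal sketch, assuming OA denotes overall accuracy and AA the average of per-class accuracies (a common convention in point cloud classification, not confirmed by the abstract), the two metrics could be computed as follows. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def average_class_accuracy(y_true, y_pred):
    """Mean of the per-class accuracies, so every class counts equally."""
    total = defaultdict(int)  # number of samples per true class
    hits = defaultdict(int)   # correctly classified samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / total[c] for c in total) / len(total)


# Toy, made-up labels (not data from the paper): class 0 is frequent, class 1 is rare.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]

oa = overall_accuracy(y_true, y_pred)
aa = average_class_accuracy(y_true, y_pred)
print(f"OA = {oa:.3f}, AA = {aa:.3f}")
```

Under these definitions a class-imbalanced dataset can show a noticeably lower AA than OA, because errors on rare classes weigh more heavily in the per-class average.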

Availability of data and materials

The ShapeNetCore, ShapeNetPart, ModelNet40 and Sydney Urban Objects datasets used in this study were obtained from public sources and are available online at https://shapenet.org, https://www.shapenet.org/download/parts, https://modelnet.cs.princeton.edu and https://www.acfr.usyd.edu.au/papers/SydneyUrbanObjectsDataset.shtml (accessed on 20 February 2023).

Acknowledgements

The authors would like to thank the relevant researchers from Princeton University, Stanford University and TTIC for providing the ShapeNetCore, ShapeNetPart and ModelNet40 datasets, and the relevant researchers from the University of Sydney for providing the Sydney Urban Objects dataset.

Funding

This work was sponsored by the Natural Science Foundation of Shanghai under Grant No. 19ZR1435900.

Author information

Authors and Affiliations

College of Communication and Art Design, University of Shanghai for Science and Technology, Shanghai, 200093, China
Haoran Zhou, Wenju Wang, Gang Chen & Xiaolin Wang

Contributions

HZ contributed to conceptualization, methodology, resources and software; HZ, GC and XW performed data curation; WW carried out formal analysis and funding acquisition; WW, GC and XW performed supervision and validation; GC and XW contributed to visualization; HZ performed writing—original draft; HZ and WW performed writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Wenju Wang.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhou, H., Wang, W., Chen, G. et al. A point cloud self-learning network based on contrastive learning for classification and segmentation. Vis Comput (2024). https://doi.org/10.1007/s00371-023-03248-4

Accepted: 19 December 2023
Published: 23 January 2024
DOI: https://doi.org/10.1007/s00371-023-03248-4
