Abstract
Point cloud representation must extract sufficient semantic information while preserving the spatial structure of sparse point clouds. Benefiting from Transformer networks, recent studies have advanced point cloud representation by extracting refined attention features from global context. However, undesired semantic information is still lost in the feature extraction stage. Hence, this paper proposes a novel architecture for 3D point cloud representation, the Relation-Shape Transformer Network (RS-TNet), which addresses this problem while retaining the merits of the relation-shape embedding mechanism, thereby generating rich and robust local semantic features. Specifically, RS-TNet achieves coarse-to-fine-grained semantic coverage by integrating global multi-head self-attention with a local Relation-Feature extraction module. Moreover, theoretical analysis demonstrates that RS-TNet explicitly introduces the spatial relations of points by learning underlying shapes; the extracted features are therefore more shape-aware and robust. As a result, the proposed RS-TNet achieves 90.9% class accuracy on ModelNet40 and 85.6% Intersection-over-Union on ShapeNet. Further, ablation experiments verify the effectiveness of RS-TNet in point cloud classification and part segmentation tasks.
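To make the global-attention component concrete, below is a minimal NumPy sketch of generic multi-head self-attention applied to per-point features. This is an illustrative assumption, not the paper's actual RS-TNet implementation: the projection matrices `wq`, `wk`, `wv` are random stand-ins for learned weights, and the local Relation-Feature module is omitted.

```python
import numpy as np

def multi_head_self_attention(points, num_heads=4, seed=0):
    """Global multi-head self-attention over per-point features.

    points: (N, D) array of point features; D must divide evenly by num_heads.
    Projection weights are random here purely for illustration.
    """
    n, d = points.shape
    assert d % num_heads == 0, "feature dim must be divisible by num_heads"
    dh = d // num_heads
    rng = np.random.default_rng(seed)
    # Stand-ins for learned query/key/value projections.
    wq, wk, wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

    # Project, then split the feature dim into heads: (H, N, dh).
    q = (points @ wq).reshape(n, num_heads, dh).transpose(1, 0, 2)
    k = (points @ wk).reshape(n, num_heads, dh).transpose(1, 0, 2)
    v = (points @ wv).reshape(n, num_heads, dh).transpose(1, 0, 2)

    # Scaled dot-product attention per head: (H, N, N).
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(dh)
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)

    # Weighted sum of values, then merge heads back to (N, D).
    out = attn @ v
    return out.transpose(1, 0, 2).reshape(n, d)

# Toy input: 128 points with 64-dim features.
feats = np.random.default_rng(1).standard_normal((128, 64))
out = multi_head_self_attention(feats, num_heads=4)
```

Because every point attends to every other point, the attention map is `(N, N)` per head; this global receptive field is what the local relation-feature branch complements with neighborhood-level shape cues.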
Data availability
Enquiries about data availability should be directed to the authors.
Funding
This work was supported by the National Natural Science Foundation of China under Grant No. 61972030.
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by XW, YJ, YZ, YC, BL and SW. The first draft of the manuscript was written by XW and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest.
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, X., Zeng, Y., Jin, Y. et al. RS-TNet: point cloud transformer with relation-shape awareness for fine-grained 3D visual processing. Soft Comput 27, 1005–1013 (2023). https://doi.org/10.1007/s00500-022-07543-5