Abstract
The key challenges in processing point clouds are the inherent lack of ordering and the irregularity of 3D points. By relying on per-point multi-layer perceptrons (MLPs), most existing point-based approaches address only the first issue and ignore the second. Directly convolving kernels with irregular points results in a loss of shape information. This paper introduces a novel point-based bidirectional learning network (BLNet) to analyze irregular 3D points. BLNet optimizes the learning of 3D points through two iterative operations, feature-guided point shifting and feature learning from shifted points, so as to minimize intra-class variance, leading to a more regular distribution. In addition, explicitly modeling point positions leads to a new feature encoding with increased structure-awareness. An attention pooling unit then selectively combines important features. This bidirectional learning alternately regularizes the point cloud and learns its geometric features, with the two procedures iteratively promoting each other for more effective feature learning. Experiments show that BLNet learns deep point features robustly and efficiently, and outperforms the prior state of the art on multiple challenging tasks.
![](http://media.springernature.com/lw685/springer-static/image/art%3A10.1007%2Fs41095-021-0260-6/MediaObjects/41095_2021_260_Fig1_HTML.jpg)
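The alternation described in the abstract (shift points under feature guidance, learn features from the shifted points, then combine features by attention pooling) can be illustrated with a minimal sketch. This is not the authors' architecture: the function names (`shift_points`, `learn_features`, `attention_pool`, `blnet_sketch`), the single linear layers, the weight shapes, and the per-channel softmax are all assumptions made for illustration, and a real implementation would use learned multi-layer networks over local neighborhoods.

```python
import math


def linear(x, w):
    """Apply weight matrix w (one row per output) to vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def shift_points(points, feats, w_shift):
    """Feature-guided shifting (assumed form): each point moves by an
    offset predicted from its own feature. w_shift has shape 3 x C."""
    return [[p + d for p, d in zip(pt, linear(f, w_shift))]
            for pt, f in zip(points, feats)]


def learn_features(points, feats, w_feat):
    """Feature learning from shifted points (assumed form): encode the
    concatenation [xyz, feature] with one shared layer and a ReLU.
    w_feat has shape C x (3 + C)."""
    return [[max(0.0, v) for v in linear(pt + f, w_feat)]
            for pt, f in zip(points, feats)]


def attention_pool(feats, w_att):
    """Attention pooling (assumed form): per-channel softmax over the
    points, followed by a weighted sum. w_att has shape C x C."""
    scores = [linear(f, w_att) for f in feats]
    pooled = []
    for c in range(len(feats[0])):
        col = [s[c] for s in scores]
        m = max(col)  # subtract the max for numerical stability
        exps = [math.exp(v - m) for v in col]
        z = sum(exps)
        pooled.append(sum(e / z * f[c] for e, f in zip(exps, feats)))
    return pooled


def blnet_sketch(points, feats, w_shift, w_feat, w_att, iterations=2):
    """Alternate the two operations, then pool to one global feature."""
    for _ in range(iterations):
        points = shift_points(points, feats, w_shift)
        feats = learn_features(points, feats, w_feat)
    return points, attention_pool(feats, w_att)
```

With all-zero attention weights every point receives the same score, so the pooling reduces to a plain average; non-zero weights let some points contribute more than others, which is the selective behavior the abstract attributes to the attention pooling unit.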
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant No. 62171393), and National Key R&D Program of China (Grant No. 2021YFF0704600).
Ethics declarations
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
Wenkai Han received his B.Eng. degree in computer science from Xiamen University, China, in 2018. He is currently working towards his M.Sc. degree in the School of Information Science and Engineering, Xiamen University. His research interests include computer vision, machine learning, and 3D point cloud processing.
Hai Wu received his B.Eng. degree in information computing and science from Sichuan University of Science and Technology, China, in 2018. He is currently working towards his M.Sc. degree in the School of Informatics, Xiamen University. His research interests include computer vision, machine learning, and 3D point cloud processing.
Chenglu Wen received her Ph.D. degree in mechanical engineering from China Agricultural University, Beijing, China, in 2009. She is currently an associate professor with the School of Informatics, Xiamen University. Her current research interests include 3D vision, 3D point cloud processing, and intelligent robots. She has coauthored about 80 research papers and is currently an associate editor of IEEE-GRSL and IEEE T-ITS.
Cheng Wang received his Ph.D. degree in information and communication engineering from the National University of Defense Technology, Changsha, China in 2002. He is currently a professor and associate dean of the School of Informatics, and director of Fujian Key Laboratory of Sensing and Computing for Smart Cities, Xiamen University. His current research interests include 3D vision, LiDAR data analysis, and multisensor fusion. He chaired the ISPRS Working Group I/6 on Multi-sensor Integration and Fusion (2016–2020), and is a council member of the China Society of Image and Graphics. He has coauthored over 150 papers.
Xin Li received his B.S. degree in computer science from the University of Science and Technology of China in 2003, and his M.S. and Ph.D. degrees in computer science from Stony Brook University (SUNY) in 2008. He is currently a professor with the School of Electrical Engineering and Computer Science and the Center for Computation and Technology, Louisiana State University (LSU), USA. He leads the Geometric and Visual Computing Laboratory at LSU. His research interests include geometric and visual data processing and analysis, computer graphics, and computer vision.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
About this article
Cite this article
Han, W., Wu, H., Wen, C. et al. BLNet: Bidirectional learning network for point clouds. Comp. Visual Media 8, 585–596 (2022). https://doi.org/10.1007/s41095-021-0260-6
DOI: https://doi.org/10.1007/s41095-021-0260-6