Deep Positional and Relational Feature Learning for Rotation-Invariant Point Cloud Analysis

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12355)


In this paper, we propose a rotation-invariant deep network for point cloud analysis. Point-based deep networks are commonly designed to recognize roughly aligned 3D shapes from point coordinates, but their performance drops when shapes are rotated. Geometric features such as inter-point distances and angles are rotation-invariant network inputs, but they discard the positional information of points. In this work, we propose a novel deep network for point clouds that takes positional information of points as input while remaining rotation-invariant. The network is hierarchical and relies on two modules: a positional feature embedding block and a relational feature embedding block. Both modules, and hence the whole network, are proven to be rotation-invariant when processing point clouds. Experiments show state-of-the-art classification and segmentation performance on benchmark datasets, and ablation studies demonstrate the effectiveness of the network design.
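The contrast drawn above can be checked directly: pairwise distances (and the angles derived from them) are unchanged by a global rotation, while raw coordinates are not. The sketch below is purely illustrative and not the paper's method; all function names are ours.

```python
import numpy as np

def pairwise_distances(points):
    """points: (N, 3) array -> (N, N) matrix of Euclidean distances."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def random_rotation(rng):
    """Draw a random 3x3 rotation matrix via QR decomposition."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q *= np.sign(np.diag(r))   # fix column signs to make the factorization unique
    if np.linalg.det(q) < 0:   # ensure a proper rotation (det = +1)
        q[:, 0] *= -1
    return q

rng = np.random.default_rng(0)
cloud = rng.standard_normal((32, 3))        # a toy point cloud
rotated = cloud @ random_rotation(rng).T    # the same cloud, rotated

# Coordinates change under rotation, but the distance matrix (a
# rotation-invariant geometric feature) does not -- and note that it
# discards absolute positions, which is the trade-off the paper addresses.
assert not np.allclose(cloud, rotated)
assert np.allclose(pairwise_distances(cloud), pairwise_distances(rotated))
```

This is exactly why purely distance/angle-based inputs are robust to rotation yet lose positional information: the distance matrix above is identical for the original and rotated clouds, but cannot tell you where the shape sits in space.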


Keywords: Rotation-invariance · Point cloud · Deep feature learning



This work was supported by NSFC (11971373, 11690011, U1811461, 61721002) and National Key R&D Program 2018AAA0102201.

Supplementary material

Supplementary material 1 (PDF, 224 KB)



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Xi’an Jiaotong University, Xi’an, China
  2. Technical University of Munich, Munich, Germany
  3. Google, Zürich, Switzerland
