
RegGeoNet: Learning Regular Representations for Large-Scale 3D Point Clouds

Published in: International Journal of Computer Vision

Abstract

Deep learning has proven to be an effective tool for 3D point cloud processing. However, most current deep set architectures are developed for sparse inputs (typically a few thousand points), whose low resolution cannot provide sufficient structural statistics and semantic cues. At the same time, these architectures suffer from unacceptable computational and memory costs when consuming dense inputs, even though real-world applications urgently need to handle large-scale 3D point clouds. To bridge this gap, this paper presents a novel unsupervised neural architecture called RegGeoNet that parameterizes an unstructured point set into a completely regular image structure dubbed deep geometry image (DeepGI), such that the spatial coordinates of unordered points are recorded in three-channel grid pixels. Intuitively, our goal is to embed irregular 3D surface points onto uniform 2D lattice grids while preserving local neighborhood consistency. Functionally, DeepGI serves as a generic representation modality for raw point cloud data and can be conveniently integrated into mature image processing pipelines. Driven by its unique structural characteristics, we customize a set of efficient feature extractors that operate directly on DeepGIs for a rich variety of downstream tasks. To demonstrate the potential and universality of the learning paradigms built upon DeepGIs for large-scale point cloud processing, we conduct extensive experiments on various downstream tasks, including shape classification, object part segmentation, scene semantic segmentation, normal estimation, and geometry compression, where our frameworks achieve highly competitive performance compared with state-of-the-art methods. The source code will be publicly available at https://github.com/keeganhk/RegGeoNet.
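To make the DeepGI idea concrete, the following sketch shows the data layout the abstract describes: packing an unordered (N, 3) point set into an (H, W, 3) "geometry image" whose three channels hold xyz coordinates, after which regular-grid adjacency (as used by ordinary image operators) becomes available. Note that RegGeoNet learns the neighborhood-preserving parameterization with an unsupervised network; the crude lexicographic sort below is only a hypothetical stand-in to illustrate the representation, not the authors' method.

```python
import numpy as np

def points_to_grid(points, grid_size):
    """Pack an (N, 3) point set into an (H, W, 3) geometry image.

    A simple z-then-y-then-x lexicographic sort is used here as a toy
    stand-in for RegGeoNet's learned, neighborhood-preserving embedding.
    """
    h = w = grid_size
    assert points.shape == (h * w, 3)
    order = np.lexsort((points[:, 0], points[:, 1], points[:, 2]))
    return points[order].reshape(h, w, 3)

def grid_neighbors(gi, r, c):
    """Fetch the 3x3 spatial neighborhood of pixel (r, c): the regular-grid
    adjacency that lets image-style feature extractors run on point data."""
    h, w, _ = gi.shape
    rs = slice(max(r - 1, 0), min(r + 2, h))
    cs = slice(max(c - 1, 0), min(c + 2, w))
    return gi[rs, cs].reshape(-1, 3)

rng = np.random.default_rng(0)
pts = rng.random((16 * 16, 3))   # 256 unordered 3D points
gi = points_to_grid(pts, 16)     # (16, 16, 3) three-channel grid image
nbrs = grid_neighbors(gi, 5, 5)  # interior pixel -> 9 neighboring xyz values
print(gi.shape, nbrs.shape)
```

The point of the exercise: once coordinates live on a lattice, a neighborhood query is a constant-time array slice rather than a k-nearest-neighbor search, which is what makes image-style pipelines applicable to large-scale point clouds.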




Author information


Corresponding author

Correspondence to Junhui Hou.

Additional information

Communicated by Yasutaka Furukawa.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This project was supported in part by the Hong Kong Research Grants Council under Grants 11219422, 11202320 and 11218121, and in part by the Natural Science Foundation of China under Grant 61871342.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhang, Q., Hou, J., Qian, Y. et al. RegGeoNet: Learning Regular Representations for Large-Scale 3D Point Clouds. Int J Comput Vis 130, 3100–3122 (2022). https://doi.org/10.1007/s11263-022-01682-w

