Abstract
The problems of shape classification and part segmentation from 3D point clouds have garnered increasing attention in the last few years. Both of these problems, however, suffer from relatively small training sets, creating the need for statistically efficient methods to learn 3D shape representations. In this paper, we investigate the use of Approximate Convex Decompositions (ACD) as a self-supervisory signal for label-efficient learning of point cloud representations. We show that using ACD to approximate ground truth segmentation provides excellent self-supervision for learning 3D point cloud representations that are highly effective on downstream tasks. We report improvements over the state-of-the-art for unsupervised representation learning on the ModelNet40 shape classification dataset and significant gains in few-shot part segmentation on the ShapeNetPart dataset. Our source code is publicly available (https://github.com/matheusgadelha/PointCloudLearningACD).
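The abstract does not state the training objective, so the following is only a minimal sketch of how ACD components could act as a self-supervisory signal. It assumes a pairwise contrastive loss on per-point embeddings, with the ACD component index of each point used as a pseudo-label. The function name `pairwise_contrastive_loss`, the `margin` value, and the toy inputs are all hypothetical, and the component indices are assumed to have been precomputed by some ACD tool.

```python
import numpy as np


def pairwise_contrastive_loss(embeddings, component_ids, margin=0.5):
    """Hypothetical loss: points in the same ACD component are pulled
    together, points in different components are pushed at least
    `margin` apart. Not taken from the paper; an illustrative assumption."""
    # Normalize each embedding to unit length so distances are bounded.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    z = embeddings / np.maximum(norms, 1e-12)

    # Pairwise Euclidean distances between all points, shape (N, N).
    diff = z[:, None, :] - z[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Boolean masks over ordered pairs (i, j), excluding i == j for positives.
    same = component_ids[:, None] == component_ids[None, :]
    off_diag = ~np.eye(len(component_ids), dtype=bool)
    pos = same & off_diag
    neg = ~same

    # Squared distance for positives, squared hinge on the margin for negatives.
    pull = (dist[pos] ** 2).sum()
    push = (np.maximum(0.0, margin - dist[neg]) ** 2).sum()
    n_pairs = int(pos.sum() + neg.sum())
    return (pull + push) / max(n_pairs, 1)


# Toy example: four points whose embeddings form two tight groups.
emb = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
ids_consistent = np.array([0, 0, 1, 1])    # pseudo-labels agree with the groups
ids_inconsistent = np.array([0, 1, 0, 1])  # pseudo-labels cut across the groups

loss_consistent = pairwise_contrastive_loss(emb, ids_consistent)
loss_inconsistent = pairwise_contrastive_loss(emb, ids_inconsistent)
```

In this toy setting the loss is lower when the embedding groups match the ACD pseudo-labels, which is the behaviour a self-supervised objective of this kind would reward. A real pipeline would compute the embeddings with a point cloud network and sample pairs rather than enumerating all of them.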
M. Gadelha and A. RoyChowdhury—Equal contribution.
A. RoyChowdhury—Now at Amazon, work done prior to joining.
Acknowledgements
The project is supported in part by the National Science Foundation (NSF) through grants #1908669, #1749833, #1617333. Our experiments used the UMass GPU cluster obtained under the Collaborative Fund managed by the Massachusetts Technology Collaborative.
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Gadelha, M. et al. (2020). Label-Efficient Learning on Point Clouds Using Approximate Convex Decompositions. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds) Computer Vision – ECCV 2020. Lecture Notes in Computer Science, vol 12355. Springer, Cham. https://doi.org/10.1007/978-3-030-58607-2_28
DOI: https://doi.org/10.1007/978-3-030-58607-2_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58606-5
Online ISBN: 978-3-030-58607-2
eBook Packages: Computer Science (R0)