Label-Efficient Learning on Point Clouds Using Approximate Convex Decompositions

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12355)

Abstract

The problems of shape classification and part segmentation from 3D point clouds have garnered increasing attention in the last few years. Both of these problems, however, suffer from relatively small training sets, creating the need for statistically efficient methods to learn 3D shape representations. In this paper, we investigate the use of Approximate Convex Decompositions (ACD) as a self-supervisory signal for label-efficient learning of point cloud representations. We show that using ACD to approximate ground truth segmentation provides excellent self-supervision for learning 3D point cloud representations that are highly effective on downstream tasks. We report improvements over the state-of-the-art for unsupervised representation learning on the ModelNet40 shape classification dataset and significant gains in few-shot part segmentation on the ShapeNetPart dataset. Our source code is publicly available (https://github.com/matheusgadelha/PointCloudLearningACD).
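As a rough illustration of how an ACD could act as a self-supervisory signal, the sketch below treats the convex component index of each point as a pseudo-label and scores point embeddings with a pairwise contrastive loss. The abstract does not specify the loss, so the function name `pairwise_contrastive_loss`, the cosine-similarity formulation, and the `margin` value are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def pairwise_contrastive_loss(embeddings, component_ids, margin=0.5):
    """Hypothetical loss using ACD component ids as pseudo-labels.

    embeddings:    (N, D) array of per-point features.
    component_ids: (N,) array with the ACD component index of each point.
    margin:        assumed similarity threshold for points in different components.
    """
    # L2-normalize so that dot products are cosine similarities.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T

    # same[i, j] is True when points i and j fall in the same component.
    same = component_ids[:, None] == component_ids[None, :]
    off_diagonal = ~np.eye(len(component_ids), dtype=bool)

    # Pull together points in the same component (ignoring self-pairs).
    positive_terms = (1.0 - sim)[same & off_diagonal]
    # Push apart points in different components, up to the margin.
    negative_terms = np.maximum(0.0, sim - margin)[~same]

    terms = np.concatenate([positive_terms, negative_terms])
    return float(terms.mean()) if terms.size else 0.0


# Toy example: four points with two components each.
features = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, 3.0]])
aligned = pairwise_contrastive_loss(features, np.array([0, 0, 1, 1]))
misaligned = pairwise_contrastive_loss(features, np.array([0, 1, 0, 1]))
```

In this toy case the loss is lower when the embedding clusters agree with the component assignment (`aligned`) than when they do not (`misaligned`), which is the behaviour a pretraining objective of this kind would rely on.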

Notes

Acknowledgements

The project is supported in part by the National Science Foundation (NSF) through grants #1908669, #1749833, #1617333. Our experiments used the UMass GPU cluster obtained under the Collaborative Fund managed by the Massachusetts Technology Collaborative.

Supplementary material

Supplementary material 1: 504449_1_En_28_MOESM1_ESM.pdf (PDF, 3738 KB)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. University of Massachusetts Amherst, Amherst, USA