
Mapping in a Cycle: Sinkhorn Regularized Unsupervised Learning for Point Cloud Shapes

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12355)

Abstract

We propose an unsupervised learning framework whose pretext task is to find dense correspondences between point cloud shapes from the same category, based on a cycle-consistency formulation. To learn discriminative pointwise features from point cloud data, we incorporate into the formulation a regularization term based on Sinkhorn normalization, which encourages the learned pointwise mappings to be as bijective as possible. In addition, a random rigid transform of the source shape is introduced to form a triplet cycle, improving the model's robustness against perturbations. Comprehensive experiments demonstrate that the pointwise features learned through our framework benefit various point cloud analysis tasks, e.g., partial shape registration and keypoint transfer. We also show that the learned pointwise features can be leveraged by supervised methods to improve part segmentation performance with either the full training dataset or just a small portion of it.
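
As a rough illustration of the Sinkhorn normalization mentioned above, the following minimal sketch alternately rescales the rows and columns of a positive similarity matrix until it is approximately doubly stochastic, which is what pushes a soft point-to-point mapping toward a bijection. It is not the authors' implementation: the function name `sinkhorn_normalize`, the use of cosine similarity between random stand-in features, the temperature, and the iteration count are all assumptions made for the example.

```python
import numpy as np


def sinkhorn_normalize(scores, n_iters=50, temperature=1.0):
    """Alternately normalize rows and columns of exp(scores / temperature).

    Hypothetical helper for illustration; returns a strictly positive matrix
    whose rows and columns each sum to approximately 1.
    """
    logits = scores / temperature
    # Shift by the maximum so the exponential cannot overflow.
    p = np.exp(logits - logits.max())
    for _ in range(n_iters):
        p = p / p.sum(axis=1, keepdims=True)  # each row sums to 1
        p = p / p.sum(axis=0, keepdims=True)  # each column sums to 1
    return p


# Stand-in pointwise features for two shapes with 8 points each (assumed data).
rng = np.random.default_rng(0)
feat_src = rng.normal(size=(8, 16))
feat_tgt = rng.normal(size=(8, 16))

# Unit-normalize so the scores are cosine similarities in [-1, 1].
feat_src /= np.linalg.norm(feat_src, axis=1, keepdims=True)
feat_tgt /= np.linalg.norm(feat_tgt, axis=1, keepdims=True)

scores = feat_src @ feat_tgt.T   # (8, 8) pairwise similarity
P = sinkhorn_normalize(scores)   # soft, approximately bijective mapping

print("row sums:", P.sum(axis=1))
print("column sums:", P.sum(axis=0))
```

A plain row-wise softmax of `scores` would let several source points map to the same target point. Comparing such a mapping against its Sinkhorn-normalized counterpart is one plausible way to penalize many-to-one assignments, although the exact form of the regularization term is not given in this abstract.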

Keywords

Point cloud · Unsupervised learning · Dense correspondence · Cycle-consistency

Notes

Acknowledgement

We acknowledge valuable comments from anonymous reviewers. Our work is supported in part by Hong Kong Innovation and Technology Fund ITS/457/17FP.

Supplementary material

Supplementary material 1: 504449_1_En_27_MOESM1_ESM.pdf (PDF, 2719 KB)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Computer Science, The University of Hong Kong, Hong Kong, China
  2. College of Mathematics and Computer Science, Fuzhou University, Fuzhou, China
