Learning local shape descriptors for computing non-rigid dense correspondence

Abstract

Discriminative local shape descriptors play an important role in various applications. In this paper, we present a novel deep learning framework that derives discriminative local descriptors for deformable 3D shapes. We use local “geometry images” to encode the multi-scale local features of a point, via an intrinsic parameterization method based on geodesic polar coordinates. This new parameterization provides robust geometry images even for badly-shaped triangular meshes. A triplet network with shared architecture and parameters then performs deep metric learning; its aim is to distinguish between similar and dissimilar pairs of points. Additionally, a newly designed triplet loss function is minimized to train the triplet network more accurately. To solve the dense correspondence problem, an efficient sampling approach is used to achieve a good compromise between training performance and descriptor quality. During testing, given a geometry image of a point of interest, our network outputs a discriminative local descriptor for it. Extensive testing of non-rigid dense shape matching on a variety of benchmarks demonstrates the superiority of the proposed descriptors over the state-of-the-art alternatives.
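
To make the metric learning step concrete, the following is a minimal sketch of a triplet setup in which one set of weights is shared by all three branches. It uses the standard hinge-style triplet margin loss, not the newly designed loss of this paper, whose exact form is not given in the abstract. The names `embed`, `triplet_margin_loss`, and `SHARED_WEIGHTS`, the toy linear embedding, and the margin value are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def embed(patch, weights):
    """Toy stand-in for the network: one linear layer on a flattened patch.

    The same `weights` are used for anchor, positive, and negative inputs,
    which is what "shared architecture and parameters" means in practice.
    """
    return [sum(w * x for w, x in zip(row, patch)) for row in weights]


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss (an assumption, not the paper's new loss).

    It is zero once the negative is farther from the anchor than the
    positive by at least `margin`.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical shared weights mapping a 3-value patch to a 2-D descriptor.
SHARED_WEIGHTS = [[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]]

anchor = embed([0.0, 0.0, 7.0], SHARED_WEIGHTS)
positive = embed([0.3, 0.4, 2.0], SHARED_WEIGHTS)   # corresponding point
negative = embed([3.0, 4.0, 1.0], SHARED_WEIGHTS)   # non-corresponding point

# A well-separated triplet incurs no loss; swapping the roles of the
# positive and negative samples produces a large loss.
easy_loss = triplet_margin_loss(anchor, positive, negative)
hard_loss = triplet_margin_loss(anchor, negative, positive)
print(easy_loss, hard_loss)
```

In a real pipeline the linear map would be replaced by a convolutional network applied to geometry images, and the loss would be averaged over sampled triplets.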

Acknowledgements

This work was partially funded by the National Key R&D Program of China (2018YFB2100602), the National Natural Science Foundation of China (61802406, 61772523, 61702488), Beijing Natural Science Foundation (L182059), the CCF–Tencent Open Research Fund, Shenzhen Basic Research Program (JCYJ20180507182222355), and the Open Project Program of the State Key Lab of CAD&CG (A2004), Zhejiang University.

Author information

Corresponding authors

Correspondence to Zhanglin Cheng or Dong-Ming Yan.

Additional information

Jianwei Guo is an associate professor in the National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA). He received his Ph.D. degree in computer science from CASIA in 2016, and his bachelor's degree from Shandong University in 2011. His research interests include computer graphics and geometry processing.

Hanyu Wang is working toward M.S. and Ph.D. degrees in computer science at the University of Maryland, College Park. In 2017–2018, he was an intern at CASIA. He obtained his bachelor's degree from Xi’an Jiaotong University in 2018. His research interests include 3D computer vision and generative models.

Zhanglin Cheng received his Ph.D. degree from CASIA in 2008. He is currently an associate professor with the Shenzhen VisuCA Key Lab, Shenzhen Institutes of Advanced Technology (SIAT), CAS. His research interests include computer graphics and visualization.

Xiaopeng Zhang is a professor in NLPR at CASIA. He received his Ph.D. degree in computer science from the Institute of Software, CAS, in 1999. He received the National Scientific and Technological Progress Prize (second class) in 2004. His main research interests include computer graphics and image processing.

Dong-Ming Yan is a professor in NLPR at CASIA. He received his Ph.D. degree in computer science from Hong Kong University in 2010, and his master's and bachelor's degrees in computer science and technology from Tsinghua University in 2005 and 2002, respectively. His research interests include computer graphics, geometric processing, and visualization.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.

About this article

Cite this article

Guo, J., Wang, H., Cheng, Z. et al. Learning local shape descriptors for computing non-rigid dense correspondence. Comp. Visual Media 6, 95–112 (2020). https://doi.org/10.1007/s41095-020-0163-y

Keywords

  • local feature descriptor
  • triplet CNN
  • dense correspondence
  • geometry image
  • non-rigid shape