Abstract
Identifying pixel correspondences between two images is a fundamental task in computer vision and has been widely used in 3D reconstruction, image morphing, and image retrieval. Neural best buddies (NBB) finds sparse correspondences between cross-domain images that share semantically related local structures yet may differ considerably in both semantics and appearance. This paper presents a new method for cross-domain image correspondence, called GroWNBB, which incorporates Gromov–Wasserstein learning into the NBB framework. Specifically, we use NBB as the backbone to search for feature matches at the deepest layer and propagate them to shallower layers. At each layer, we modify the NBB strategy: the matching pairs obtained by NBB, within and across images, are mapped into graphs; matching is then formulated as optimal transport between these graphs, and Gromov–Wasserstein learning establishes correspondences between them. Consequently, our approach accounts for relationships both between and within images, which makes the correspondences more stable. Our experiments demonstrate that GroWNBB achieves state-of-the-art performance on cross-domain correspondence and outperforms other popular methods on intra-class and same-object correspondence estimation. Our code is available at https://github.com/NolanInLowland/GroWNBB.
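The core idea summarized above, treating matching as optimal transport between two graphs whose intra-graph structure must be preserved, can be illustrated with a minimal sketch of entropic Gromov–Wasserstein coupling. This is a generic illustration, not the paper's implementation: the function name, the uniform node masses, and the fixed iteration counts are assumptions made here for clarity, and only the two intra-graph distance matrices are taken as given.

```python
import numpy as np

def entropic_gromov_wasserstein(C1, C2, eps=0.1, n_iter=50, sinkhorn_iter=100):
    """Soft correspondence T (n x m) between two graphs described by
    symmetric intra-graph distance matrices C1 (n x n) and C2 (m x m),
    minimizing the entropic Gromov-Wasserstein objective with square loss."""
    n, m = C1.shape[0], C2.shape[0]
    p, q = np.full(n, 1.0 / n), np.full(m, 1.0 / m)   # uniform node masses (assumption)
    T = np.outer(p, q)                                # independent initial coupling
    # Constant part of the square-loss tensor product
    const = ((C1 ** 2) @ p)[:, None] + (q @ (C2 ** 2).T)[None, :]
    for _ in range(n_iter):
        grad = const - 2.0 * C1 @ T @ C2.T            # gradient of the GW objective at T
        K = np.exp(-grad / eps)                       # Gibbs kernel for the Sinkhorn step
        u = np.ones(n)
        for _ in range(sinkhorn_iter):                # project back onto the marginals p, q
            u = p / (K @ (q / (K.T @ u)))
        v = q / (K.T @ u)
        T = u[:, None] * K * v[None, :]               # updated coupling
    return T
```

Because the objective compares distances within each graph rather than features across graphs, the resulting coupling respects relative structure, which is what makes this formulation natural for cross-domain matches where appearance is unreliable. Distances should be normalized (e.g. to [0, 1]) before calling this sketch, since the Gibbs kernel underflows for large costs.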
Acknowledgements
The authors would like to thank the editors and the anonymous reviewers for their constructive comments and suggestions. This paper is supported by the National Natural Science Foundation of China (Grant Nos. 61972264, 62072312, 62372302) and Natural Science Foundation of Shenzhen (Grant No. 20200807165235002).
Author information
Contributions
Ruolan Tang contributed to the conceptualization, methodology, and writing (original draft); Weiwei Wang contributed to the conceptualization, writing (review and editing), and supervision; Yu Han contributed to the writing (review and editing); Xiangchu Feng contributed to the conceptualization.
Ethics declarations
Conflicts of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Tang, R., Wang, W., Han, Y. et al.: GroWNBB: Gromov–Wasserstein learning of neural best buddies for cross-domain correspondence. Vis Comput (2024). https://doi.org/10.1007/s00371-023-03251-9