Ensemble Quadratic Assignment Network for Graph Matching

Published in: International Journal of Computer Vision

Abstract

Graph matching is a commonly used technique in computer vision and pattern recognition. Recent data-driven approaches have improved graph matching accuracy remarkably, whereas some traditional algorithm-based methods are more robust to feature noise, outlier nodes, and global transformations (e.g., rotation). In this paper, we propose a graph neural network (GNN) based approach that combines the advantages of data-driven and traditional methods. In the GNN framework, we recast traditional graph matching solvers as single-channel GNNs on the association graph and extend the single-channel architecture to a multi-channel network. The proposed model can be seen as an ensemble method that fuses multiple algorithms at every iteration. Instead of averaging the estimates only at the end of the ensemble, in our approach the otherwise independent iterations of the ensembled algorithms exchange information after each iteration via a \(1\,\times \,1\) channel-wise convolution layer. Experiments show that our model improves the performance of traditional algorithms significantly. In addition, we propose a random sampling strategy to reduce the computational complexity and GPU memory usage, making the model applicable to matching graphs with thousands of nodes. We evaluate our method on three tasks: geometric graph matching, semantic feature matching, and few-shot 3D shape classification. The proposed model performs comparably to or outperforms the best existing GNN-based methods.


References

  • Alan, Y., & Anand, R. (2003). The concave-convex procedure. Neural Computation, 15(4), 915–936.

  • Bai, S., Bai, X., Zhou, Z., Zhang, Z., & Longin, J. L. (2016). Gift: A real-time and scalable 3d shape search engine. In Proceedings of the IEEE conference on computer vision and pattern recognition.

  • Bengio, Y., Léonard, N., & Courville, A. C. (2013). Estimating or propagating gradients through stochastic neurons for conditional computation. CoRR abs/1308.3432

  • Berg, A., Berg, T., & Malik, J.(2005). Shape matching and object recognition using low distortion correspondences. In Proceedings of the IEEE conference on computer vision and pattern recognition.

  • Bout, V., & Miller, T. K. (1990). Graph partitioning using annealed neural networks. IEEE Transactions on Neural Networks, 1(2), 192–203.

  • Brendel, W., & Todorovic, S. (2011). Learning spatio-temporal graphs of human activities. In IEEE international conference on computer vision.

  • Caelli, T., & Caetano, T. (2005). Graphical models for graph matching: Approximate models and optimal algorithms. Pattern Recognition Letters, 26(3), 339–346.

  • Carcassoni, M., & Hancock, E. R. (2003). Spectral correspondence for point pattern matching. Pattern Recognition, 36(1), 193–204.

  • Qi, C. R., Su, H., Mo, K., & Guibas, L. J. (2017). Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition.

  • Chen, D.-Y., Tian, X.-P., Shen, Y.-T., & Ouhyoung, M. (2003). On visual similarity based 3d model retrieval. Computer Graphics Forum, 22(3), 223–232.

  • Chen, H.-T., Lin, H.-H., & Liu, T.-L. (2001). Multi-object tracking using dynamical graph matching. In Proceedings of the IEEE conference on computer vision and pattern recognition.

  • Cho, M., Alahari, K., & Ponce, J. (2013). Learning graphs to match. In Proceedings of the IEEE conference on computer vision and pattern recognition.

  • Cho, M., Lee, J., & Mu, K. (2010). Reweighted random walks for graph matching. In European conference on computer vision.

  • Chopin, J., Fasquel, J.-B., Mouchére, H., Dahyot, R., & Bloch, I. (2020). Semantic image segmentation based on spatial relationships and inexact graph matching. In Tenth international conference on image processing theory, tools and applications (IPTA).

  • Cour, T., Srinivasan, P., & Jianbo, S. (2007). Balanced graph matching. In Advances in neural information processing systems (Vol. 19).

  • Egozi, A., Keller, Y., & Guterman, H. (2013). A probabilistic approach to spectral graph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1), 18–27.

  • Emtiyaz, K., Mohammad, P., Baque, F. F., & Fua, P. (2015). Kullback-leibler proximal variational inference. In Advances in neural information processing systems (Vol. 28).

  • Everingham, M., Van Gool, L., Williams, C. K. I., Winn, J., & Zisserman, A. (2010). The pascal visual object classes (voc) challenge. International Journal of Computer Vision, 88(2), 303–338.

  • Fey, M., Lenssen, J. E., Morris, C., Masci, J., & Kriege, N. M. (2020). Deep graph matching consensus. In International conference on learning representations.

  • Fey, M., Lenssen, J. E., Weichert, F., & Müller, H. (2018). Splinecnn: Fast geometric deep learning with continuous b-spline kernels. In Proceedings of the IEEE conference on computer vision and pattern recognition.

  • Fiori, M., Sprechmann, P., Vogelstein, J., Muse, P. S., & Guillermo, R. (2013). Multimodal graph matching: Sparse coding meets graph matching. In Advances in neural information processing systems.

  • Foggia, P., Percannella, G., & Vento, M. (2014). Graph matching and learning in pattern recognition in the last 10 years. International Journal of Pattern Recognition and Artificial Intelligence, 33(1).

  • Scarselli, F., Gori, M., Tsoi, A. C., Hagenbuchner, M., & Monfardini, G. (2009). The graph neural network model. IEEE Transactions on Neural Networks, 20(1), 61–80.

  • Fu, K., Liu, S., Luo, X., & Wang, M. (2021). Robust point cloud registration framework based on deep graph matching. In Proceedings of the IEEE conference on computer vision and pattern recognition.

  • Gao, Q., Wang, F., Xue, N., Yu, J.-G., & Xia, G.-S. (2021). Deep graph matching under quadratic constraint. In Proceedings of the IEEE conference on computer vision and pattern recognition.

  • Gaur, U., Zhu, Y., Song, B., & Roy-Chowdhury, A. K. (2011). “String of feature graphs” model for recognition of complex activities in natural videos. In IEEE international conference on computer vision.

  • Gold, S. N., & Anand, R. (1996). A graduated assignment algorithm for graph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(4), 377–388.

  • Gomila, C., & Meyer, F. (2003). Graph-based object tracking. In Proceedings international conference on image processing.

  • He, J., Huang, Z., Wang, N., & Zhang, Z. (2021). Learnable graph matching: Incorporating graph partitioning with deep feature learning for multiple object tracking. In Proceedings of the IEEE conference on computer vision and pattern recognition.

  • He, J., Zhang, T., Zheng, Y., Mingliang, X., Zhang, Y., & Feng, W. (2021). Consistency graph modeling for semantic correspondence. IEEE Transactions on Image Processing, 30, 4932–4946.

  • He, T., Liu, W., Gong, C., Yan, N., & Junchi, Z. (2021) Music plagiarism detection via bipartite graph matching. arXiv:2107.09889

  • Huet, B., & Hancock, E. R. (1999). Shape recognition from large image libraries by inexact graph matching. Pattern Recognition Letters, 20(11), 1259–1269.

  • Jiang, B., Sun, P., Tang, J., & Luo, B. (2019). Glmnet: Graph learning-matching networks for feature matching. arXiv:1911.07681

  • Jiang, B., Sun, P., Zhang, Z., Tang, J., & Luo, B. (2021). Gamnet: robust feature matching via graph adversarial-matching network. In MM ’21: ACM multimedia conference 2021 (pp. 5419–5426). ACM.

  • Justin, S., Gabriel, P., Vladimir, K., & Suvrit, S. (2016). Entropic metric alignment for correspondence problems. ACM Transactions on Graphics, 35(4).

  • Kazhdan, M., Funkhouser, T., & Rusinkiewicz, S. (2003). Rotation invariant spherical harmonic representation of 3D shape descriptors. In Symposium on geometry processing.

  • Kingma, D. P., & Ba, J. (2015). Adam: A method for stochastic optimization. In International conference on learning representations.

  • Koopmans, T. C., & Beckmann, M. (1957). Assignment problems and the location of economic activities. Econometrica, 25(1), 53–76.

  • Kuhn, H. W. (1955). The Hungarian method for the assignment problem. Naval Research Logistics Quarterly.

  • Leordeanu, M., & Hebert, M., (2005). A spectral technique for correspondence problems using pairwise constraints. In IEEE international conference on computer vision.

  • Li, H., Kadav, A., Durdanovic, I., Samet, H., & Graf, H. P. (2017). Pruning filters for efficient convnets. In International conference on learning representations.

  • Liao, C.-S., Lu, K., Baym, M., Singh, R., & Berger, B. (2009). Isorankn: Spectral methods for global alignment of multiple protein networks. In Bioinformatics.

  • Liu, C., Wang, R., Jiang, Z., & Yan, J. (2020). Deep reinforcement learning of graph matching. arxiv preprint arXiv:2012.08950

  • Liu, L., Hughes, M. C., Hassoun, S., & Liu, L. (2021). Stochastic iterative graph matching. In Proceedings of the 38th international conference on machine learning.

  • Liu, Z., Sun, M., Zhou, T., Huang, G., & Darrell, T. (2019). Rethinking the value of network pruning. In International conference on learning representations.

  • Leordeanu, M., Hebert, M., & Sukthankar, R. (2009). An integer projected fixed point method for graph matching and map inference. In Advances in neural information processing systems.

  • Menke, J., & Yang, A. (2020). Graduated assignment graph matching for realtime matching of image wireframes. In 2020 IEEE/RSJ international conference on intelligent robots and systems.

  • Min, J., Lee, J., Jean, P., & Minsu, C. (2019). Spair-71k: A large-scale benchmark for semantic correspondence. arXiv:1908.10543

  • Nie, J., Ning, X., Zhou, M., Yan, G., & Wei, Z. (2020). 3d model classification based on few-shot learning. Neurocomputing, 398, 539–546.

  • Nie, W., Liu, A., Yuting, S., Luan, H., Yang, Z., Cao, L., & Ji, R. (2014). Single/cross-camera multiple-person tracking by graph matching. Neurocomputing, 139(2), 220–232.

  • Parikh, N., & Boyd, S. (2014). Proximal algorithms. Foundations and Trends in Optimization, 1(3), 127–239.

  • Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., Devito, Z., Lin, Z., Desmaison, A., Antiga, L., Lerer, A., et al. (2017). Automatic differentiation in pytorch. In Advances in neural information processing systems workshop.

  • Perronnin, F., & Dance, C. (2007). Fisher kernels on visual vocabularies for image categorization. In Proceedings of the IEEE conference on computer vision and pattern recognition.

  • Puri, R., Zakhor, A., & Puri, R. (2020). Few shot learning for point cloud data using model agnostic meta learning. In International conference on image processing.

  • Ren, Q., Bao, Q., Wang, R., & Yan, J. (2022). Appearance and structure aware robust deep visual graph matching: Attack, defense and beyond. In 2022 IEEE/CVF conference on computer vision and pattern recognition (CVPR) (pp. 15242–15251).

  • Richard, S., & Paul, K. (1967). Concerning nonnegative matrices and doubly stochastic matrices. Pacific Journal of Mathematics, 21(2), 343–348.

  • Richard, W., & Edwin, H. (1997). Structural matching by discrete relaxation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(6), 634–648.

  • Rolínek, M., Swoboda, P., Zietlow, D., Paulus, A., Musil, V., & Martius, G. (2020) Deep graph matching via blackbox differentiation of combinatorial solvers. In European conference on computer vision.

  • Shen, D., & Davatzikos, C. (2002). Hierarchical attribute matching mechanism for elastic registration. IEEE Transactions on Medical Imaging, 21(11), 1421–1439.

  • Shen, Y., Lin, W., Yan, J., Xu, M., Wu, J., & Wang, J. (2015). Person re-identification with correspondence structure learning. In IEEE international conference on computer vision.

  • Siddiqi, K., Shokoufandeh, A., Dickenson, S., & Steven, Z. (1998). Shock graphs and shape matching. In IEEE international conference on computer vision.

  • Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In International conference on learning representations.

  • Su, H., Maji, S., Kalogerakis, E., & Learned-Miller, E. (2015). Multi-view convolutional neural networks for 3d shape recognition. In IEEE international conference on computer vision.

  • Tan, H.-R., Wang, C., Wu, S.-T., Wang, T.-Q., Zhang, X.-Y., & Liu, C.-L. (2021) Proxy graph matching with proximal matching networks. In Proceedings of the AAAI conference on artificial intelligence.

  • Vento, M. (2015). A long trip in the charming world of graphs for pattern recognition. Pattern Recognition, 48(2), 291–301.

  • Wang, F.-D., Xue, N., Zhang, Y., Xia, G.-S., & Pelillo, M. (2020). A functional representation for graph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(11), 2737–2754.

  • Wang, H., & Hancock, E. (2004). A kernel view of spectral point pattern matching. In Structural, syntactic, and statistical pattern recognition (pp. 361–369).

  • Wang, H. X., Chen, X., & Liu, C. (2021). Pose-guided part matching network via shrinking and reweighting for occluded person re-identification. Image and Vision Computing, 111, 104186.

  • Wang, R., Yan, J., & Yang, X. (2019). Learning combinatorial embedding networks for deep graph matching. In IEEE international conference on computer vision.

  • Wang, R., Yan, J., & Yang, X. (2019). Neural graph matching network: Learning lawler’s quadratic assignment problem with extension to hypergraph and multiple-graph matching. arXiv:1911.11308

  • Wang, R., Yan, J., & Yang, X. (2020). Combinatorial learning of robust deep graph matching: An embedding based approach. IEEE Transactions on Pattern Analysis and Machine Intelligence.

  • Wang, R., Yan, J., & Yang, X. (2020). Graduated assignment for joint multi-graph matching and clustering with application to unsupervised graph matching network learning. In Advances in neural information processing systems.

  • Wang, T., & Ling, H. (2018). Gracker: A graph-based planar object tracker. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(6), 1494–1501.

  • Wang, T., Liu, H., Li, Y., Jin, Y., Hou, X., & Ling, H. (2020). Learning combinatorial solver for graph matching. In Proceedings of the IEEE conference on computer vision and pattern recognition.

  • Wang, W., Lin, W., Chen, Y., Wu, J., Wang, J., & Sheng, B. (2014). Finding coherent motions and semantic regions in crowd scenes: a diffusion and clustering approach. In European conference on computer vision.

  • Wu, Z., Song, S., Khosla, A., Yu, F., Zhang, L., Tang, X., & Xiao, J. (2015). 3d shapenets: A deep representation for volumetric shapes. In Proceedings of the IEEE conference on computer vision and pattern recognition.

  • Yan, J., Yin, X., Lin, W., Deng, C., Zha, H., & Yang, X. (2016). A short survey of recent advances in graph matching. In The annual ACM international conference on multimedia retrieval.

  • Yao, B., & Li, F. (2012). Action recognition with exemplar based 2.5 d graph matching. In European conference on computer vision.

  • Yin, P., Lyu, J., Zhang, S., Osher, S., Qi, Y., & Xin, J. (2019). Understanding straight-through estimator in training activation quantized neural nets. In International conference on learning representations.

  • Yu, T., Wang, R., Yan, J., & Li, B. (2020). Learning deep graph matching with channel-independent embedding and hungarian attention. In International conference on learning representations.

  • Yu, T., Wang, R., Yan, J., & Li, B. (2021). Deep latent graph matching. In Proceedings of the 38th international conference on machine learning.

  • Yu, H., Ye, W., Feng, Y., Bao, H., & Zhang, G. (2020). Learning bipartite graph matching for robust visual localization. In IEEE international symposium on mixed and augmented reality

  • Yue, W., Xiao, Z., Liu, S., Miao, Q., Ma, W., Gong, M., Xie, F., & Zhang, Y. (2021). A two-step method for remote sensing images registration based on local and global constraints. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 14, 5194–5206.

  • Zanfir, A., & Sminchisescu, C. (2018) Deep learning of graph matching. In Proceedings of the IEEE conference on computer vision and pattern recognition.

  • Zeng, S., Liu, Z., & Xu, Y. (2021). Supervised learning for parameterized Koopmans-Beckmann’s graph matching. Pattern Recognition Letters, 143, 8–13.

  • Zhang, Z., & Lee, W. S. (2019). Deep graphical feature learning for the feature matching problem. In Proceedings of the IEEE conference on computer vision and pattern recognition.

  • Zhang, Z., Xiang, Y., Wu, L., Xue, B., & Nehorai, A. K. (2019). Kernelized graph matching. In Advances in neural information processing systems.

  • Zhao, J., Han, R., Gan, Y., Wan, L., Feng, W., & Wang, S. (2020). Human identification and interaction detection in cross-view multi-person videos with wearable cameras. In Proceedings of the ACM international conference on multimedia.

  • Zhao, K., Tu, S., & Xu, L. (2021). Ia-gm: A deep bidirectional learning method for graph matching. In Proceedings of the AAAI conference on artificial intelligence.

  • Zhou, F., & De la Torre, F. (2012). Factorized graph matching. In Proceedings of the IEEE conference on computer vision and pattern recognition.

Acknowledgements

This work has been supported by the Major Project for New Generation of AI under Grant No. 2018AAA0100400, the National Natural Science Foundation of China (NSFC) grants U20A20223, 61836014, 61721004, the Youth Innovation Promotion Association of CAS under Grant 2019141, and the Pioneer Hundred Talents Program of CAS under Grant Y9S9MS08, Y9J9MS08 and E2S40101.

Author information

Correspondence to Chuang Wang.

Additional information

Communicated by Laurent Najman.


Appendices

Appendix: Convergence Analysis and Derivation of the Differentiable Proximal Graph Matching Algorithm

In this section, we show that, under a mild assumption, the proximal graph matching algorithm converges to a stationary point within a reasonable number of iterations. The main technique in this analysis follows Emtiyaz et al. (2015), who studied a general variational inference problem using the proximal-gradient method.

Before presenting the main proposition, we first state a technical lemma.

Lemma 1

There exists a constant \(\alpha >0\) such that for all \(\varvec{z}_{t+1}\), \(\varvec{z}_t\) generated during the forward pass of the solver, we have

$$\begin{aligned} (\varvec{z}_{t+1} - \varvec{z}_{t})^\top \nabla _{\varvec{z}_{t+1}} D(\varvec{z}_{t+1}, \varvec{z}_{t}) \ge \alpha \Vert \varvec{z}_{t+1} - \varvec{z}_t\Vert ^2, \end{aligned}$$

where D is the KL-divergence.

We note that the largest valid \(\alpha \) satisfying Lemma 1 is 1/2. The proofs of this lemma and of the proposition below are given in Sections 1.1 and 1.2. Next, we show the main result of our convergence analysis.
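As a numerical illustration (our addition, not part of the original analysis), the inequality of Lemma 1 with \(\alpha = 1/2\) can be checked directly for random points on the probability simplex, using \(\nabla _{\varvec{z}} D(\varvec{z}, \varvec{z}_t) = \log (\varvec{z}/\varvec{z}_t) + \varvec{1}\):

```python
import numpy as np

def kl_grad(z, z0):
    # Gradient of D(z, z0) = sum_i z_i * log(z_i / z0_i) with respect to z.
    return np.log(z / z0) + 1.0

rng = np.random.default_rng(0)
violations = 0
for _ in range(1000):
    # Random interior points of the probability simplex (entries <= 1).
    z0 = rng.random(8) + 1e-3
    z0 /= z0.sum()
    z1 = rng.random(8) + 1e-3
    z1 /= z1.sum()
    d = z1 - z0
    # Lemma 1 with alpha = 1/2: d^T grad D(z1, z0) >= 0.5 * ||d||^2.
    if d @ kl_grad(z1, z0) < 0.5 * (d @ d) - 1e-12:
        violations += 1
print(violations)  # → 0
```

The bound holds here because the negative entropy is 1-strongly convex on vectors with entries at most one, which is exactly the argument used in the proof below.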

Proposition 1

Let L be the Lipschitz constant of the gradient \(\nabla g(\varvec{z})\), where \(g(\varvec{z})\) is defined in (3), and let \(\alpha \) be the constant used in Lemma 1. If we choose a constant step-size \(\beta < 2\alpha /L\) for all \(t\in \{0, 1, \ldots , T \} \), then

$$\begin{aligned} \mathbb {E}_{t \sim \text {uniform} \{0, 1, \ldots , T \} } \Vert \varvec{z}_{t+1} - \varvec{z}_t\Vert ^2 \le \frac{C_0}{ T(\alpha - L \beta /2)}, \end{aligned}$$

where \(C_0 = |\mathcal {L}^*- \mathcal {L}(\varvec{z}_0)| \) is the objective gap of (2) between the initial guess \(\mathcal {L}(\varvec{z}_0)\) and the optimal value \(\mathcal {L}^*\).

Proposition 1 guarantees that the iterative process converges to a stationary point within a reasonable number of iterations. In particular, the average squared difference \(\Vert \varvec{z}_{t+1} - \varvec{z}_t\Vert ^2\) between consecutive iterates decays at a rate of 1/T.

1.1 Proof of Lemma 1

Because the proximal function \(D(\varvec{z}, \varvec{z}_t)\) is convex in its first argument, the following inequality always holds:

$$\begin{aligned} \begin{aligned} D(\varvec{z}_t,\varvec{z}_t) \ge D(\varvec{z}_{t+1},\varvec{z}_t) + [\nabla _{\varvec{z}= \varvec{z}_{t+1}} D(\varvec{z},\varvec{z}_t)]^\top (\varvec{z}_t - \varvec{z}_{t+1}). \end{aligned} \end{aligned}$$

By using \(D(\varvec{z}_t,\varvec{z}_t) = 0\), we obtain:

$$\begin{aligned} \begin{aligned} D(\varvec{z}_{t+1},\varvec{z}_t) \le (\varvec{z}_{t+1} - \varvec{z}_t)^\top [\nabla _{\varvec{z}= \varvec{z}_{t+1}} D(\varvec{z},\varvec{z}_t)]. \end{aligned} \end{aligned}$$

Let \(l(\varvec{z}) = \varvec{z}^\top \textrm{log} \varvec{z}\). We can write \(D(\varvec{z}_{t+1},\varvec{z}_t)\) as the Bregman divergence of l:

$$\begin{aligned} \begin{aligned} D(\varvec{z}_{t+1},\varvec{z}_t) = l(\varvec{z}_{t+1}) - l(\varvec{z}_{t}) - \nabla l(\varvec{z}_{t})^\top (\varvec{z}_{t+1} - \varvec{z}_{t}). \end{aligned}\end{aligned}$$

Moreover, due to the strong convexity of \(l(\varvec{z})\),

$$\begin{aligned} \begin{aligned} l(\varvec{z}_{t+1}) - l(\varvec{z}_t) - \nabla l(\varvec{z}_t)^\top (\varvec{z}_{t+1} - \varvec{z}_{t}) \ge \frac{1}{2}||\varvec{z}_{t+1} - \varvec{z}_{t}||^2, \end{aligned} \end{aligned}$$

Combining the above inequalities, we get:

$$\begin{aligned} \begin{aligned} \frac{1}{2}\Vert \varvec{z}_{t+1} - \varvec{z}_{t}\Vert ^2 \le D(\varvec{z}_{t+1}, \varvec{z}_t) \le (\varvec{z}_{t+1} - \varvec{z}_{t})^\top \nabla D(\varvec{z}_{t+1},\varvec{z}_{t}), \end{aligned} \end{aligned}$$

which proves Lemma 1 and shows that the largest valid \(\alpha \) is \(\frac{1}{2}\).

1.2 Proof of Proposition 1

Before proving Proposition 1, we present another technical lemma.

Lemma 2

For any real-valued vector \(\varvec{g}\) of the same dimension as \(\varvec{z}\) and any \(\beta >0\), consider the convex problem

$$\begin{aligned} \begin{aligned} \varvec{z}_{t+1} =&\mathop \textrm{argmin}\limits _{{\textbf {C}}\varvec{z}= {\textbf {1}}}:~ \{ \varvec{z}^\top {\textbf {g}} - h(\varvec{z}) + \frac{1}{\beta } D(\varvec{z}, \varvec{z}_t) \}, \end{aligned} \end{aligned}$$
(21)

where \(h(\varvec{z}) = -\lambda \varvec{z}^\top \textrm{log }(\varvec{z})\) and D is the KL-divergence, the following inequality always holds:

$$\begin{aligned} \begin{aligned} {\textbf {g}}^\top (\varvec{z}_{t} - \varvec{z}_{t+1}) \ge \frac{\alpha }{\beta }||\varvec{z}_{t} - \varvec{z}_{t+1}||^2 - [h(\varvec{z}_{t+1}) - h(\varvec{z}_{t})]. \end{aligned} \end{aligned}$$

By the convexity of the objective in the sub-problem, if \(\varvec{z}^{*}\) is the optimal solution, the following holds:

$$\begin{aligned} \begin{aligned} (\varvec{z}^{*} - \varvec{z}_t)^\top ({\textbf {g}} - \nabla h(\varvec{z}^{*}) + \frac{1}{\beta } \nabla D(\varvec{z}^{*}, \varvec{z}_{t})) \le 0. \end{aligned} \end{aligned}$$

Setting \(\varvec{z}^{*} = \varvec{z}_{t+1}\) and using Lemma 1, we obtain

$$\begin{aligned} {\textbf {g}}^\top (\varvec{z}_t - \varvec{z}_{t+1}) - (\varvec{z}_t - \varvec{z}_{t+1})^\top \nabla h(\varvec{z}_{t+1}) - \frac{\alpha }{\beta }\Vert \varvec{z}_{t+1} - \varvec{z}_{t} \Vert ^2 \ge 0. \end{aligned}$$
(22)

Because the entropic function \(h(\varvec{z})\) is concave,

$$\begin{aligned} \begin{aligned} (\varvec{z}_t - \varvec{z}_{t+1})^\top \nabla h(\varvec{z}_{t+1}) \ge h(\varvec{z}_{t}) - h(\varvec{z}_{t+1}). \end{aligned} \end{aligned}$$
(23)

We can derive the following inequality from (22) and (23):

$$\begin{aligned} {\textbf {g}}^\top (\varvec{z}_t - \varvec{z}_{t+1}) - [ h(\varvec{z}_{t}) - h(\varvec{z}_{t+1}) ] - \frac{\alpha }{\beta }\Vert \varvec{z}_{t+1} - \varvec{z}_{t} \Vert ^2 \ge 0, \end{aligned}$$

which proves Lemma 2.

Now, we start to prove Proposition 1.

Proof

Because the gradient \(\nabla g(\varvec{z})\) is L-Lipschitz continuous, we have

$$\begin{aligned} g(\varvec{z}_{t+1}) \le g(\varvec{z}_{t}) + \nabla g(\varvec{z}_{t})^\top (\varvec{z}_{t+1} - \varvec{z}_{t}) + \frac{L}{2} \Vert \varvec{z}_{t+1} - \varvec{z}_{t}\Vert ^2. \end{aligned}$$
(24)

Let \(\varvec{g}= \nabla g(\varvec{z}_{t})\). By using Lemma 2, we bound the right side of (24) by

$$\begin{aligned} \begin{aligned} g(\varvec{z}_{t+1})&\le g(\varvec{z}_{t}) - \frac{\alpha }{\beta }||\varvec{z}_t - \varvec{z}_{t+1}||^2 - [h(\varvec{z}_t) - h(\varvec{z}_{t+1})]\\&+ \frac{L}{2} ||\varvec{z}_{t+1} - \varvec{z}_{t}||^2. \end{aligned} \end{aligned}$$
(25)

Rearranging the terms in (25) and noting that \(g(\varvec{z}) - h(\varvec{z}) = \mathcal {L}(\varvec{z})\), we get

$$\begin{aligned} \begin{aligned} (\frac{\alpha }{\beta } - \frac{L}{2}) ||\varvec{z}_{t+1} - \varvec{z}_{t}||^2 \le \mathcal {L}(\varvec{z}_{t}) - \mathcal {L}(\varvec{z}_{t+1}). \end{aligned} \end{aligned}$$

Next, we choose a constant step-size \(\beta < 2\alpha /L\) for all \(t\in \{0, 1, \ldots , T \} \). Summing both sides over t from 0 to T and dividing by T, we have

$$\begin{aligned} \begin{aligned} \frac{1}{T}\sum _{t=0}^{T}(\frac{\alpha }{\beta } - \frac{L}{2}) ||\varvec{z}_{t+1} - \varvec{z}_{t}||^2 \le \frac{\mathcal {L}(\varvec{z}_{0}) - \mathcal {L}(\varvec{z}_{T})}{T} \\ \le \frac{\mathcal {L}(\varvec{z}_{0}) - \mathcal {L}^*}{T}, \end{aligned} \end{aligned}$$

and finally arrive at

$$\begin{aligned} \begin{aligned} \mathbb {E}_{t \sim \text {uniform} \{0, 1, \ldots , T \} } \Vert \varvec{z}_{t+1} - \varvec{z}_t\Vert ^2 \le \frac{ \mathcal {L}(\varvec{z}_{0}) - \mathcal {L}^*}{ T(\alpha - L \beta /2) }. \end{aligned} \end{aligned}$$

\(\square \)

1.3 Derivation of DPGM Algorithm

Our proximal graph matching method solves a sequence of convex optimization problems:

$$\begin{aligned} \begin{aligned} \varvec{z}_{t+1} =&\mathop \textrm{argmin}\limits _{\varvec{z}\in \mathbb {R}_{+}^{n^2\times 1}} - \varvec{z}^\top (\varvec{u}+ \varvec{P}\varvec{z}_t)+ \tfrac{1+ \beta _t}{\beta _t} \varvec{z}^\top \log (\varvec{z}) - \tfrac{1}{\beta _t} \varvec{z}^\top \log (\varvec{z}_t) \\ \text {s.t. }\quad&\varvec{C}\varvec{z}= \varvec{1}, \end{aligned} \end{aligned}$$

where the binary matrix \(\varvec{C}\in \{0,1\}^{n^2 \times n^2}\) encodes \(n^2\) linear constraints ensuring that \(\sum _{i\in V_1} x_{ij^\prime } = 1 \) and \(\sum _{j \in V_2} x_{i^\prime j}=1\) for all \(i^\prime \in V_1\) and \(j^\prime \in V_2\). We set

$$\begin{aligned} \begin{aligned} E_t(\varvec{z}) = - \varvec{z}^\top (\varvec{u}+ \varvec{P}\varvec{z}_t)+ \tfrac{1+\beta _t}{\beta _t} \varvec{z}^\top \log (\varvec{z}) - \tfrac{1}{\beta _t} \varvec{z}^\top \log (\varvec{z}_t). \end{aligned} \end{aligned}$$

The matching objective with Lagrange multipliers for each sub-problem is

$$\begin{aligned} \begin{aligned} \mathcal {L}(\varvec{z}, \varvec{\mu }, \varvec{\nu }) = E_t(\varvec{z}) + \varvec{\mu }^\top (\varvec{Z}\varvec{1} - \varvec{1}) + \varvec{{\nu }}^\top (\varvec{Z}^\top \varvec{1} - \varvec{1}), \end{aligned} \end{aligned}$$

where \(\varvec{\mu }\) and \(\varvec{\nu }\) are Lagrange multipliers, and \(\varvec{Z}\) is the matrix form of the vector \(\varvec{z}\), which should be a doubly stochastic matrix. Setting the derivatives \(\frac{\partial {\mathcal {L}}}{\partial \varvec{z}}\), \(\frac{\partial {\mathcal {L}}}{\partial \varvec{\mu }}\), \(\frac{\partial {\mathcal {L}}}{\partial \varvec{\nu }}\) to zero, we get

$$\begin{aligned}&[\varvec{z}]_{ij} = \exp \Big [ \tfrac{\beta _t}{ 1+ \beta _t} [ \varvec{u}+ \varvec{P}\varvec{z}_t]_{ij} + \tfrac{1}{1+\beta _t} \log ([\varvec{z}_t]_{ij})\\&\quad + 1 +\tfrac{\beta _t}{ 1+ \beta _t}\big ( [\varvec{\mu }]_j + [\varvec{\nu }]_i \big )\Big ]\\&\quad \sum _{i^\prime = 1}^n [\varvec{z}]_{i^\prime j} = 1, \quad \sum _{j^\prime = 1}^n [\varvec{z}]_{ij^\prime } = 1, \end{aligned}$$

for all \( i,j \in \{1,2,\ldots , n\}\). Solving the above equations yields the following update rule (Bout & Miller, 1990; Alan & Anand, 2003)

$$\begin{aligned} {\widetilde{\varvec{z}}}_{t+1}&= \exp \Big [ \tfrac{\beta _t}{ 1+ \beta _t} ( \varvec{u}+ \varvec{P}\varvec{z}_t) + \tfrac{1}{1+\beta _t} \log (\varvec{z}_t) \Big ] \end{aligned}$$
(26)
$$\begin{aligned} \varvec{z}_{t+1}&= \text {Sinkhorn}({\widetilde{\varvec{z}}}_{t+1}), \end{aligned}$$
(27)

where \(\text {Sinkhorn}(\varvec{z})\) is the Sinkhorn-Knopp transform (Richard & Paul, 1967) that maps a nonnegative matrix of size \(n\times n\) to a doubly stochastic matrix. Here, the input and output variables are \(n^2\)-dimensional vectors: before applying the Sinkhorn-Knopp transform, we reshape the input into an \(n\times n\) matrix, and afterwards we flatten the output back into an \(n^2\)-dimensional vector.
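To make the update concrete, here is a minimal NumPy sketch of one proximal iteration, Eqs. (26) and (27). The names `dpgm_step` and `sinkhorn` are ours, and the Sinkhorn loop uses a fixed number of sweeps rather than a convergence test:

```python
import numpy as np

def sinkhorn(F, iters=50):
    # Alternate row/column normalization toward a doubly stochastic matrix.
    for _ in range(iters):
        F = F / F.sum(axis=1, keepdims=True)
        F = F / F.sum(axis=0, keepdims=True)
    return F

def dpgm_step(z, u, P, beta):
    # One proximal update, Eqs. (26)-(27); z and u are n^2-dimensional
    # vectors, P is the n^2 x n^2 pairwise-affinity matrix.
    n = int(round(np.sqrt(z.size)))
    z_tilde = np.exp(beta / (1 + beta) * (u + P @ z)
                     + np.log(z) / (1 + beta))
    Z = sinkhorn(z_tilde.reshape(n, n))  # reshape to n x n for Sinkhorn
    return Z.reshape(-1)                 # flatten back to an n^2 vector
```

Iterating `dpgm_step` from a uniform, strictly positive initialization keeps the (reshaped) iterate doubly stochastic, matching the feasible set of the sub-problem above.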

Given a nonnegative matrix \(\varvec{F} \in \mathbb {R}_{+}^{n\times n}\), the Sinkhorn algorithm proceeds iteratively. In each iteration, it first normalizes all rows:

$$\begin{aligned} \begin{aligned} \varvec{F}^{'}_{ij} = \varvec{F}_{ij}/\left( \sum _{k=1}^{n} \varvec{F}_{ik}\right) , \end{aligned} \end{aligned}$$

and then normalizes all columns:

$$\begin{aligned} \begin{aligned} \varvec{F}_{ij} = \varvec{F}^{'}_{ij}/\left( \sum _{k=1}^{n} \varvec{F}^{'}_{kj}\right) . \end{aligned} \end{aligned}$$

Iterating these two steps until convergence transforms the original matrix \(\varvec{F}\) into a doubly stochastic matrix. Equations (26) and (27) lead to the proximal iterations (4) and (5).
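The two normalization steps can be sketched as follows (a minimal implementation under our naming; production code would also guard against zero rows or columns):

```python
import numpy as np

def sinkhorn_knopp(F, tol=1e-9, max_iter=1000):
    # Alternately normalize rows and columns of a positive matrix F
    # until it is (numerically) doubly stochastic.
    F = np.array(F, dtype=float)
    for _ in range(max_iter):
        F /= F.sum(axis=1, keepdims=True)  # row step
        F /= F.sum(axis=0, keepdims=True)  # column step
        if np.abs(F.sum(axis=1) - 1.0).max() < tol:
            break  # column sums are already exact after the column step
    return F

S = sinkhorn_knopp(np.random.default_rng(0).random((5, 5)) + 0.1)
```

For strictly positive inputs the iteration converges, so both the row and column sums of `S` end up equal to one up to the tolerance.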

Additional Pseudo-Code of QAP Solvers

For reference, we present the pseudocode of two related classical matching algorithms, graduated assignment (GAGM) and the spectral method (SM), in Algorithms 3 and 4, respectively.

Algorithm 3: Graduated assignment (GAGM)

Algorithm 4: Power-method based spectral method (SM)
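The core of Algorithm 4 is a power iteration on the affinity matrix; a minimal sketch follows (our naming, omitting the discretization step of the full algorithm):

```python
import numpy as np

def spectral_matching(M, iters=200):
    # Power iteration for the leading eigenvector of a nonnegative,
    # symmetric affinity matrix M; the entries of v score candidate matches.
    v = np.ones(M.shape[0]) / np.sqrt(M.shape[0])
    for _ in range(iters):
        v = M @ v
        v /= np.linalg.norm(v)
    return v
```

By the Perron-Frobenius theorem, the limit vector is entrywise nonnegative, so it can be read as a soft assignment before discretization.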


About this article

Cite this article

Tan, H., Wang, C., Wu, S. et al. Ensemble Quadratic Assignment Network for Graph Matching. Int J Comput Vis (2024). https://doi.org/10.1007/s11263-024-02040-8
