Rationalizing Efficient Compositional Image Alignment

Muñoz, Enrique; Márquez-Neila, Pablo; Baumela, Luis

doi:10.1007/s11263-014-0769-6

The Constant Jacobian Gauss-Newton Optimization Algorithm

Published: 04 October 2014

Volume 112, pages 354–372, (2015)
International Journal of Computer Vision

Enrique Muñoz¹,
Pablo Márquez-Neila¹ &
Luis Baumela¹

Abstract

We study the issue of computational efficiency for Gauss-Newton (GN) non-linear least-squares optimization in the context of image alignment. We introduce the Constant Jacobian Gauss-Newton (CJGN) optimization, a GN scheme with constant Jacobian and Hessian matrices, and the equivalence and independence conditions as the necessary requirements that any function of residuals must satisfy to be optimized with this efficient approach. We prove that the Inverse Compositional (IC) image alignment algorithm is an instance of a CJGN scheme and formally derive the compositional and extended brightness constancy assumptions as the necessary requirements that must be satisfied by any image alignment problem so it can be solved with an efficient compositional scheme. Moreover, in contradiction with previous results, we also prove that the forward and inverse compositional algorithms are not equivalent. They are equivalent, however, when the extended brightness constancy assumption is satisfied. To analyze the impact of the satisfaction of these requirements we introduce a new image alignment evaluation framework and the concepts of short- and wide-baseline Jacobian. In wide-baseline Jacobian problems the optimization will diverge if the requirements are not satisfied. However, with a good initialization, a short-baseline Jacobian problem may converge even if the requirements are not satisfied.
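To make the idea of a Gauss-Newton scheme with constant Jacobian and Hessian concrete, the following is a minimal one-dimensional sketch of an inverse compositional translation estimate. It is an illustration only and not the authors' implementation: the Gaussian template, the sampling grid, `TRUE_SHIFT`, and all function names are assumptions chosen for the example, and the image is built so that brightness constancy holds exactly.

```python
import math

def template(x):
    """Hypothetical 1D template T[x] (a Gaussian bump)."""
    return math.exp(-x * x)

def template_gradient(x):
    """Analytic derivative of the template with respect to x."""
    return -2.0 * x * math.exp(-x * x)

TRUE_SHIFT = 0.4  # assumed ground-truth translation, for illustration only

def image(y):
    """Image built so that I[x + TRUE_SHIFT] = T[x] (brightness constancy)."""
    return template(y - TRUE_SHIFT)

# Support set: a symmetric grid of sample points on the template.
support = [-4.0 + 0.05 * k for k in range(161)]

# The Jacobian is the template gradient, so it and the (scalar)
# Gauss-Newton Hessian are computed once, outside the loop.
jacobian = [template_gradient(x) for x in support]
hessian = sum(j * j for j in jacobian)

mu = 0.0  # initial translation estimate
for _ in range(20):
    # Residuals between the warped image and the template.
    residuals = [image(x + mu) - template(x) for x in support]
    # Gauss-Newton step using the precomputed Jacobian and Hessian.
    delta = sum(j * r for j, r in zip(jacobian, residuals)) / hessian
    # Compose with the inverse of the incremental warp (a subtraction
    # for pure translation).
    mu -= delta
```

Only the residuals are recomputed inside the loop, which is where the efficiency of a constant-Jacobian scheme comes from. In this toy problem `mu` approaches `TRUE_SHIFT`.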

Acknowledgments

The authors are grateful to Pascal Fua for interesting discussions about this work. They also thank the anonymous reviewers for their comments. This research was funded by the Ministerio de Economía y Competitividad of Spain under contract TIN2013-47630-C2-2-R.

Author information

Enrique Muñoz
Present address: Pattern Analysis and Computer Vision, Istituto Italiano di Tecnologia, via Morego, 30, 16163, Genoa, Italy

Authors and Affiliations

Departamento de Inteligencia Artificial ETSI Informáticos, Universidad Politécnica de Madrid, Campus Montegancedo s/n, 28660, Boadilla del Monte, Madrid, Spain
Enrique Muñoz, Pablo Márquez-Neila & Luis Baumela

Corresponding author

Correspondence to Luis Baumela.

Additional information

Communicated by M. Hebert.

Appendices

Appendix 1: Derivative of Inverse Warps

Let $f({\varvec{x}},\varvec{\phi })$ be a warp function and $f^{-1}({\varvec{x}}, \varvec{\phi })$ its inverse, such that $f(f^{-1}({\varvec{x}}, \varvec{\phi }), \varvec{\phi }) = {\varvec{x}}$, where $\varvec{\phi }$ is a small disturbance around the identity warp, $\varvec{\phi }_0$. The derivative of this expression with respect to $\varvec{\phi }$ is

$$\begin{aligned} \left. \dfrac{\partial f(f^{-1}({\varvec{x}}, \varvec{\phi }), \varvec{\phi })}{\partial \varvec{\phi }} \right| _{\varvec{\phi }= \varvec{\phi }_0} = \dfrac{\partial {\varvec{x}}}{\partial \varvec{\phi }} = \mathbf 0, \end{aligned}$$

which can be expanded using the chain rule:

$$\begin{aligned}&\left. \dfrac{\partial f({\varvec{x}},\varvec{\phi })}{\partial \varvec{\phi }}\right| _{\varvec{\phi }=\varvec{\phi }_0} + \left. \dfrac{\partial f({\varvec{x}}',\varvec{\phi }_0)}{\partial {\varvec{x}}'}\right| _{{\varvec{x}}'={\varvec{x}}}\cdot \left. \dfrac{\partial f^{-1}({\varvec{x}}, \varvec{\phi })}{\partial \varvec{\phi }} \right| _{\varvec{\phi }=\varvec{\phi }_0}\\&= \mathbf 0. \end{aligned}$$

As $f({\varvec{x}}, \varvec{\phi }_0)$ is the identity warp, its derivative with respect to ${\varvec{x}}$ is the identity matrix, so

$$\begin{aligned} \left. \dfrac{\partial f({\varvec{x}},\varvec{\phi })}{\partial \varvec{\phi }}\right| _{\varvec{\phi }=\varvec{\phi }_0} + \left. \dfrac{\partial f^{-1}({\varvec{x}}, \varvec{\phi })}{\partial \varvec{\phi }} \right| _{\varvec{\phi }=\varvec{\phi }_0} = \mathbf 0. \end{aligned}$$

Finally,

$$\begin{aligned} \left. \dfrac{\partial f({\varvec{x}},\varvec{\phi })}{\partial \varvec{\phi }}\right| _{\varvec{\phi }=\varvec{\phi }_0} = - \left. \dfrac{\partial f^{-1}({\varvec{x}}, \varvec{\phi })}{\partial \varvec{\phi }} \right| _{\varvec{\phi }=\varvec{\phi }_0}. \end{aligned}$$
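The identity above can be checked numerically. The sketch below assumes a six-parameter 2D affine warp whose identity is $\varvec{\phi }_0 = \mathbf 0$; this parametrization, the test point, and the function names are illustrative assumptions and do not come from the paper. It compares central-difference Jacobians of the warp and of its inverse at the identity.

```python
def warp(x, phi):
    """Affine warp f(x, phi); phi = 0 gives the identity warp."""
    a11, a12, a21, a22, tx, ty = phi
    return ((1 + a11) * x[0] + a12 * x[1] + tx,
            a21 * x[0] + (1 + a22) * x[1] + ty)

def inverse_warp(x, phi):
    """Closed-form inverse f^{-1}(x, phi) of the affine warp above."""
    a11, a12, a21, a22, tx, ty = phi
    m11, m12, m21, m22 = 1 + a11, a12, a21, 1 + a22
    det = m11 * m22 - m12 * m21
    px, py = x[0] - tx, x[1] - ty
    return ((m22 * px - m12 * py) / det,
            (-m21 * px + m11 * py) / det)

def jacobian_at_identity(fn, x, n_params=6, eps=1e-6):
    """Central-difference Jacobian of fn(x, phi) w.r.t. phi at phi = 0.

    Returns one column (a 2-tuple) per parameter.
    """
    columns = []
    for k in range(n_params):
        plus = [0.0] * n_params
        minus = [0.0] * n_params
        plus[k] = eps
        minus[k] = -eps
        f_plus = fn(x, plus)
        f_minus = fn(x, minus)
        columns.append(tuple((p - m) / (2 * eps)
                             for p, m in zip(f_plus, f_minus)))
    return columns

x = (0.7, -1.3)  # arbitrary test point
J_f = jacobian_at_identity(warp, x)
J_inv = jacobian_at_identity(inverse_warp, x)

# The appendix result predicts J_f = -J_inv, so the sum should vanish
# up to finite-difference error.
max_err = max(abs(a + b)
              for col_f, col_inv in zip(J_f, J_inv)
              for a, b in zip(col_f, col_inv))
```

Here `max_err` should be at the level of finite-difference noise, consistent with the two derivatives being opposite at the identity.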

Appendix 2: In-plane Translation

We consider the case of a plane $\varvec{\pi }$ that moves perpendicular to its normal $\mathbf {n}$ at a distance $d$ from the origin. The set of points of the plane is $\mathcal {V} = \{{\varvec{x}}\in \mathbb {R}^3 : \mathbf {n}^\top {\varvec{x}}+d=0\}$, which is a two-dimensional surface embedded in $\mathbb {R}^3$ and is therefore a closed set. The support set is a finite subset of $\mathcal {V}$, $\mathcal {X}\subset \mathcal {V}$.

The in-plane translation has two degrees of freedom. Thus, the pair of warps $\mathbf{f}$ and $\mathbf{g}$ are parametrized respectively by ${\varvec{\mu }}\in \mathbb {R}^3$ and $\varDelta {\varvec{\phi }}\in \mathbb {R}^2$. The two warps are $\mathbf{f}({\varvec{x}},{\varvec{\mu }})={\varvec{x}}+ {\varvec{\mu }}$ and $\mathbf{g}({\varvec{x}},\varDelta {\varvec{\phi }}) = {\varvec{x}}+ [{\varvec{u}}\ {\varvec{v}}]\cdot \varDelta {\varvec{\phi }}$, where ${\varvec{u}},{\varvec{v}}\in \mathbb {R}^3$ are two independent vectors perpendicular to $\mathbf {n}$.

We will prove that this system satisfies both the CA and the EBCA. The CA states that, for any ${\varvec{\mu }}$ and $\varDelta {\varvec{\phi }}$, there exists a ${\varvec{\mu }}'$ such that $\mathbf{f}({\varvec{x}},{\varvec{\mu }}') = \mathbf{f}(\mathbf{g}({\varvec{x}},\varDelta {\varvec{\phi }}),{\varvec{\mu }})$. This is trivially proved by taking ${\varvec{\mu }}' = {\varvec{\mu }}+ [{\varvec{u}}\ {\varvec{v}}]\cdot \varDelta {\varvec{\phi }}$. The identity $\mathbf{g}$-warp is obtained for $\varDelta {\varvec{\phi }}_0=[0\ 0]^\top$.
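The composition argument can be illustrated with a small numerical example. The normal, offset, in-plane basis vectors, and parameter values below are arbitrary choices made for the sketch and are not taken from the paper. The code checks that composing the $\mathbf{g}$-warp with the $\mathbf{f}$-warp equals a single $\mathbf{f}$-warp with ${\varvec{\mu }}' = {\varvec{\mu }}+ [{\varvec{u}}\ {\varvec{v}}]\cdot \varDelta {\varvec{\phi }}$, and that the $\mathbf{g}$-warp keeps points on the plane.

```python
def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def f(x, mu):
    """f-warp: translation of x by mu in R^3."""
    return tuple(xi + mi for xi, mi in zip(x, mu))

def in_plane_offset(u, v, dphi):
    """The product [u v] . dphi, a 3-vector lying in the plane."""
    return tuple(ui * dphi[0] + vi * dphi[1] for ui, vi in zip(u, v))

def g(x, dphi, u, v):
    """g-warp: in-plane translation with two degrees of freedom."""
    return f(x, in_plane_offset(u, v, dphi))

# Illustrative plane n^T x + d = 0 and two independent in-plane vectors.
n = (1.0, 2.0, 2.0)
d = -3.0
u = (2.0, -1.0, 0.0)   # n . u = 0
v = (2.0, 0.0, -1.0)   # n . v = 0

x = (1.0, 1.0, 0.0)     # a point on the plane: 1 + 2 + 0 - 3 = 0
mu = (0.5, -0.25, 1.0)  # arbitrary f-warp parameters
dphi = (0.3, -0.8)      # arbitrary g-warp parameters

# Left-hand side: f(g(x, dphi), mu).
composed = f(g(x, dphi, u, v), mu)

# Right-hand side: f(x, mu') with mu' = mu + [u v] . dphi.
mu_prime = f(mu, in_plane_offset(u, v, dphi))
direct = f(x, mu_prime)

ca_error = max(abs(a - b) for a, b in zip(composed, direct))
plane_residual = abs(dot(n, g(x, dphi, u, v)) + d)
```

Both `ca_error` and `plane_residual` should be zero up to floating-point rounding.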

To prove the EBCA, we write the expression for Requirement 2:

$$\begin{aligned}&\left. \dfrac{\partial I[{\varvec{x}}, t]}{\partial {\varvec{x}}} \right| _{{\varvec{x}}=\mathbf{f}(\mathcal {X}, {\varvec{\mu }})} \cdot \left. \dfrac{\partial \mathbf{f}({\varvec{x}},{\varvec{\mu }})}{\partial {\varvec{x}}}\right| _{{\varvec{x}}=\mathcal {X}} \cdot \left. \dfrac{\partial \mathbf{g}(\mathcal {X},\varvec{\phi })}{\partial \varvec{\phi }}\right| _{\varvec{\phi }=\varvec{\phi }_0}\\&= \left. \dfrac{\partial T[{\varvec{x}}]}{\partial {\varvec{x}}} \right| _{{\varvec{x}}=\mathcal {X}} \cdot \left. \dfrac{\partial \mathbf{g}(\mathcal {X},\varvec{\phi })}{\partial \varvec{\phi }}\right| _{\varvec{\phi }=\varvec{\phi }_0}. \end{aligned} \tag{22}$$

The derivative of the $\mathbf{f}$-warp with respect to ${\varvec{x}}$ is the $3\times 3$ identity. Also, the derivative of the $\mathbf{g}$-warp with respect to $\varDelta {\varvec{\phi }}$ is $[{\varvec{u}}\ {\varvec{v}}]$. Therefore, (22) becomes

$$\begin{aligned} \left. \dfrac{\partial I[{\varvec{x}}, t]}{\partial {\varvec{x}}} \right| _{{\varvec{x}}=\mathbf{f}(\mathcal {X}, {\varvec{\mu }})} \cdot [{\varvec{u}}\ {\varvec{v}}] = \left. \dfrac{\partial T[{\varvec{x}}]}{\partial {\varvec{x}}} \right| _{{\varvec{x}}=\mathcal {X}} \cdot [{\varvec{u}}\ {\varvec{v}}]. \end{aligned} \tag{23}$$

We know neither $I$ nor $T$ but, thanks to the brightness constancy assumption, we know a relation between them: $I[\mathbf{f}({\varvec{x}},{\varvec{\mu }}),t] = T[{\varvec{x}}],\ \forall {\varvec{x}}\in \mathcal {V}$. The partial derivatives of two functions that are equal on a closed subset $\mathcal {V}$ of their domain are not, in general, equal on that subset, because the functions may differ off $\mathcal {V}$. However, the partial derivatives projected onto $\mathcal {V}$ are equal. Thus, given a projection matrix $\mathbf {\Pi }$ onto the plane $\mathcal {V}$, we have that:

$$\begin{aligned} \left. \dfrac{\partial I[{\varvec{x}}, t]}{\partial {\varvec{x}}} \right| _{{\varvec{x}}=\mathbf{f}(\mathcal {X}, {\varvec{\mu }})} \cdot \mathbf {\Pi } = \left. \dfrac{\partial T[{\varvec{x}}]}{\partial {\varvec{x}}} \right| _{{\varvec{x}}=\mathcal {X}} \cdot \mathbf {\Pi }. \end{aligned}$$

Since we can choose $\mathbf {\Pi }=[{\varvec{u}}\ {\varvec{v}}]$, expression (23) is true. This proves the EBCA.
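The distinction between full and projected gradients can be seen in a small numerical experiment. In the sketch below, the template, the plane, and the off-plane term added to the image are all hypothetical constructions chosen for illustration: the image agrees with the translated template only on the plane, so the full gradients differ while their projections onto $[{\varvec{u}}\ {\varvec{v}}]$ coincide.

```python
import math

N = (1.0, 2.0, 2.0)     # illustrative plane normal
D = -3.0                # plane offset: N . x + D = 0
U = (2.0, -1.0, 0.0)    # in-plane vector, N . U = 0
V = (2.0, 0.0, -1.0)    # in-plane vector, N . V = 0
MU = (0.5, -0.25, 1.0)  # arbitrary f-warp translation

def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def template(x):
    """Hypothetical smooth template T[x] defined on R^3."""
    return math.sin(x[0]) + x[1] * x[2] + 0.5 * x[0] * x[1]

def image(y):
    """Image with I[x + MU] = T[x] only for x on the plane.

    The extra term vanishes on the plane and changes I off it.
    """
    x = tuple(yi - mi for yi, mi in zip(y, MU))
    return template(x) + 2.0 * (dot(N, x) + D)

def gradient(fn, p, eps=1e-5):
    """Central-difference gradient of fn at the 3D point p."""
    grad = []
    for k in range(3):
        hi = list(p)
        lo = list(p)
        hi[k] += eps
        lo[k] -= eps
        grad.append((fn(hi) - fn(lo)) / (2 * eps))
    return tuple(grad)

x = (1.0, 1.0, 0.0)  # a point on the plane
grad_T = gradient(template, x)
grad_I = gradient(image, tuple(xi + mi for xi, mi in zip(x, MU)))

# Full gradients disagree because I and T differ off the plane ...
full_gap = max(abs(a - b) for a, b in zip(grad_I, grad_T))
# ... but their projections onto the in-plane directions agree.
proj_gap = max(abs(dot(grad_I, w) - dot(grad_T, w)) for w in (U, V))
```

In this construction `full_gap` is clearly nonzero, whereas `proj_gap` is at the level of finite-difference error, which is the behaviour expression (23) relies on.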

About this article

Cite this article

Muñoz, E., Márquez-Neila, P. & Baumela, L. Rationalizing Efficient Compositional Image Alignment. Int J Comput Vis 112, 354–372 (2015). https://doi.org/10.1007/s11263-014-0769-6

Received: 22 October 2013
Accepted: 15 September 2014
Published: 04 October 2014
Issue Date: May 2015
DOI: https://doi.org/10.1007/s11263-014-0769-6
