Abstract
We revisit the popular Procrustes matching procedure of landmark shape analysis and consider the situation where the landmark coordinates have a completely general covariance matrix, extending previous approaches based on factored covariance structures. Procrustes matching is used to compute the Riemannian metric in shape space, and more widely to carry out inference such as estimation of mean shape and covariance structure. Rather than matching using the Euclidean distance, we consider a general Mahalanobis distance. This approach allows us to model different variances at each landmark, covariance structure between the landmark coordinates, and more general covariance structures. Explicit expressions are given for the optimal translation and rotation in two dimensions, and numerical procedures are used in higher dimensions. Simultaneous estimation of both mean shape and covariance structure is difficult due to the inherent non-identifiability, so the method requires the specification of constraints to carry out inference; we discuss some practical choices. We illustrate the methodology using data from fish silhouettes and mouse vertebra images.
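As an illustrative sketch of the criterion described above (not the authors' implementation; the simulated data, the random covariance and all variable names are assumptions), the Mahalanobis matching distance can be evaluated with numpy. The translation is profiled out in closed form, and the optimal two-dimensional rotation is found here by a crude grid search rather than by the explicit expressions derived in the chapter:

```python
import numpy as np

rng = np.random.default_rng(0)
k, m = 5, 2                                     # landmarks, dimensions

# Hypothetical data: a template mu and a rotated, shifted, noisy copy X.
mu = rng.standard_normal((k, m))
theta_true = 0.7
G = np.array([[np.cos(theta_true), -np.sin(theta_true)],
              [np.sin(theta_true),  np.cos(theta_true)]])
X = (mu + 0.05 * rng.standard_normal((k, m))) @ G.T + np.array([1.0, -2.0])

# An assumed general positive-definite covariance for vec(mu - X Gamma).
A = rng.standard_normal((k * m, k * m))
Sigma_inv = np.linalg.inv(A @ A.T + k * m * np.eye(k * m))

T = np.kron(np.eye(m), np.ones((k, 1)))         # (I_m kron 1_k)

def d2(theta):
    """Mahalanobis criterion with the translation profiled out in closed form."""
    Gam = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    v = (mu - X @ Gam).flatten(order="F")       # vec(mu - X Gamma), columns stacked
    gam = np.linalg.solve(T.T @ Sigma_inv @ T, T.T @ Sigma_inv @ v)
    r = v - T @ gam
    return r @ Sigma_inv @ r

# Crude grid search for the rotation angle.
thetas = np.linspace(-np.pi, np.pi, 2000)
best = thetas[np.argmin([d2(t) for t in thetas])]
```

For standard unweighted Procrustes analysis, the R package shapes cited in the references provides established routines.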
References
Bookstein FL (1986) Size and shape spaces for landmark data in two dimensions (with discussion). Stat Sci 1:181–242
Brignell CJ (2007) Shape analysis and statistical modelling in brain imaging. Ph.D. thesis, University of Nottingham
Davies RH, Twining CJ, Taylor CJ (2008) Statistical models of shape: optimisation and evaluation. Springer, Heidelberg. http://www.springer.com/computer/computer+imaging/book/978-1-84800-137-4
Dryden IL (1989) The statistical analysis of shape data. Ph.D. thesis, University of Leeds
Dryden IL (2014) Shapes: statistical shape analysis. R package version 1.1-10. http://CRAN.R-project.org/package=shapes
Dryden IL, Mardia KV (1998) Statistical shape analysis. Wiley, Chichester
Dutilleul P (1999) The MLE algorithm for the matrix normal distribution. J Stat Comput Simul 64:105–123
Goodall CR (1991) Procrustes methods in the statistical analysis of shape (with discussion). J R Stat Soc Ser B 53:285–339
Gower JC (1975) Generalized Procrustes analysis. Psychometrika 40:33–50
Kendall DG (1984) Shape manifolds, Procrustean metrics and complex projective spaces. Bull Lond Math Soc 16:81–121
Kendall DG (1989) A survey of the statistical theory of shape (with discussion). Stat Sci 4:87–120
Kendall DG, Barden D, Carne TK, Le H (1999) Shape and shape theory. Wiley, Chichester
Koschat M, Swayne D (1991) A weighted Procrustes criterion. Psychometrika 56(2):229–239. doi:10.1007/BF02294460
Krim H, Yezzi AJ (2006) Statistics and analysis of shapes. Springer, Berlin
Lele S (1993) Euclidean distance matrix analysis (EDMA): estimation of mean form and mean form difference. Math Geol 25(5):573–602. doi:10.1007/BF00890247
Mardia KV, Dryden IL (1989) The statistical analysis of shape data. Biometrika 76:271–282
Sharvit D, Chan J, Tek H, Kimia BB (1998) Symmetry-based indexing of image databases. J Vis Commun Image Represent 9(4):366–380
Srivastava A, Klassen E, Joshi SH, Jermyn IH (2011) Shape analysis of elastic curves in Euclidean spaces. IEEE Trans Pattern Anal Mach Intell 33(7):1415–1428. http://doi.ieeecomputersociety.org/10.1109/TPAMI.2010.184
Srivastava A, Turaga PK, Kurtek S (2012) On advances in differential-geometric approaches for 2D and 3D shape analyses and activity recognition. Image Vis Comput 30(6–7):398–416
Theobald DL, Wuttke DS (2006) Empirical Bayes hierarchical models for regularizing maximum likelihood estimation in the matrix Gaussian Procrustes problem. Proc Natl Acad Sci 103(49):18521–18527. doi:10.1073/pnas.0508445103
Theobald DL, Wuttke DS (2008) Accurate structural correlations from maximum likelihood superpositions. PLoS Comput Biol 4(2):e43
Younes L (2010) Shapes and diffeomorphisms. Applied mathematical sciences, vol 171. Springer, Berlin. doi:10.1007/978-3-642-12055-8
Acknowledgements
We acknowledge the support of a Royal Society Wolfson Research Merit Award and EPSRC grant EP/K022547/1.
Appendix
Proof of Result 2.1.
Let \(v = \text{vec}(\mu -X\varGamma )\), then \(D_{\mathrm{pCWP}}^{2}(X,\mu ;\varSigma ) =\{ v - (I_{m} \otimes 1_{k})\gamma \}^{T}\varSigma ^{-1}\{v - (I_{m} \otimes 1_{k})\gamma \}.\)
The minimising translation is found by setting the first derivative equal to zero: \(\frac{\partial D_{\mathrm{pCWP}}^{2}}{\partial \gamma } = -2(I_{m} \otimes 1_{k})^{T}\varSigma ^{-1}\{v - (I_{m} \otimes 1_{k})\gamma \} = 0.\)
The matrix of second derivatives, \(2(I_{m} \otimes 1_{k})^{T}\varSigma ^{-1}(I_{m} \otimes 1_{k})\), is positive definite because \(\varSigma ^{-1}\) is positive definite. Therefore, \(D_{\mathrm{pCWP}}^{2}\) is minimised when \(\gamma = [(I_{m} \otimes 1_{k})^{T}\varSigma ^{-1}(I_{m} \otimes 1_{k})]^{-1}(I_{m} \otimes 1_{k})^{T}\varSigma ^{-1}v\). \(\square \)
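The closed-form translation can be checked numerically; this is a sketch with randomly generated stand-ins for \(v\) and \(\varSigma\). Writing \(\varSigma ^{-1} = LL^{T}\), the generalised least squares estimate equals ordinary least squares on the whitened system:

```python
import numpy as np

rng = np.random.default_rng(1)
k, m = 4, 2
v = rng.standard_normal(k * m)                  # stands in for vec(mu - X Gamma)
A = rng.standard_normal((k * m, k * m))
Sigma_inv = np.linalg.inv(A @ A.T + k * m * np.eye(k * m))
T = np.kron(np.eye(m), np.ones((k, 1)))         # (I_m kron 1_k)

# Closed-form minimiser from the result above.
gam_hat = np.linalg.solve(T.T @ Sigma_inv @ T, T.T @ Sigma_inv @ v)

# Same answer as ordinary least squares after whitening:
# Sigma_inv = L L^T, so the criterion is ||L^T (v - T gam)||^2.
L = np.linalg.cholesky(Sigma_inv)
gam_ols, *_ = np.linalg.lstsq(L.T @ T, L.T @ v, rcond=None)
```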
Proof of Result 2.2.
From Eq. (9.2) the minimising translation is \(\hat{\gamma }= A\text{vec}(\mu -X\varGamma )\), so for m = 2,
Therefore,
and \(D_{\mathrm{pCWP}}^{2}(X,\mu;\varSigma ) = C + P\cos ^{2}\theta + Q\sin ^{2}\theta + R\cos \theta \sin \theta + S\cos \theta + T\sin \theta\) where
Let \(\lambda\) be the real Lagrange multiplier enforcing the constraint \(\cos ^{2}\theta +\sin ^{2}\theta = 1\) and let \(L = D_{\mathrm{pCWP}}^{2}(X,\mu;\varSigma ) +\lambda (1 -\cos ^{2}\theta -\sin ^{2}\theta )\). Then,
Solving the first two equations simultaneously and substituting the solutions in the third gives the expressions for \(\cos \theta\), \(\sin \theta\) and the quartic equation, respectively. To show this is a minimum of \(D_{\mathrm{pCWP}}^{2}\), consider the matrix of second derivatives,
Let \(\xi _{1} \geq \xi _{2}\) be the eigenvalues of \(S^{\ast }\). Then, \(\vert S^{{\ast}}-\xi _{i}I\vert = (\xi _{i} + 2\lambda )^{2} - 2(P + Q)(\xi _{i} + 2\lambda ) + 4PQ - R^{2} = 0\), so \((\xi _{i} + 2\lambda ) = P + Q \pm \sqrt{(P - Q)^{2 } + R^{2}}\). Since \(\varSigma ^{-1}\) is positive definite, P > 0 and Q > 0, and \(\xi _{2}\) is strictly positive if \(P + Q - 2\lambda -\sqrt{(P - Q)^{2 } + R^{2}} > 0\), which holds when the constraint on \(\lambda\) is satisfied. \(\square \)
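A numerical check of this constrained minimisation (the coefficient values below are assumptions; in the chapter P, Q, R, S and T are functions of X, \(\mu\) and \(\varSigma\)): a fine grid minimum of \(P\cos ^{2}\theta + Q\sin ^{2}\theta + R\cos \theta \sin \theta + S\cos \theta + T\sin \theta\) satisfies the stationarity condition obtained by differentiating with respect to \(\theta\), which is equivalent to the Lagrangian equations on the unit circle:

```python
import numpy as np

# Assumed example coefficients; in the chapter they come from X, mu and Sigma.
P, Q, R, S, T = 3.0, 1.0, 0.8, -2.0, 0.5

def f(theta):
    c, s = np.cos(theta), np.sin(theta)
    return P * c**2 + Q * s**2 + R * c * s + S * c + T * s

def fprime(theta):
    # d/dtheta of f, term by term.
    c, s = np.cos(theta), np.sin(theta)
    return 2 * (Q - P) * c * s + R * (c * c - s * s) - S * s + T * c

# Fine grid minimum of the criterion over the rotation angle.
thetas = np.linspace(-np.pi, np.pi, 400001)
theta_star = thetas[np.argmin(f(thetas))]
```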
Proof of Result 2.3.
Let \(v = \text{vec}(\mu )\) and \(\xi = \text{vec}(X\varGamma )\), then
This implies
Therefore, the minimum is at the solution of
The matrix of second derivatives is clearly positive definite because \(\varSigma ^{-1}\) is positive definite. \(\square \)
Proof of Result 2.4.
Replacing \(\cos \theta\) with \(\beta \cos \theta\) and \(\sin \theta\) with \(\beta \sin \theta\) in the proof of Result 2.2 gives \(D_{\mathrm{CWP}}^{2}(X,\mu;\varSigma ) = C + P\beta ^{2}\cos ^{2}\theta + Q\beta ^{2}\sin ^{2}\theta + R\beta ^{2}\cos \theta \sin \theta + S\beta \cos \theta + T\beta \sin \theta\). Let \(\psi _{1} =\beta \cos \theta\) and \(\psi _{2} =\beta \sin \theta\), then
Setting these expressions equal to zero and solving them simultaneously gives the required expressions for \(\psi _{1}\) and \(\psi _{2}\). Solving \(\psi _{1} =\beta \cos \theta\) and \(\psi _{2} =\beta \sin \theta\) subject to the constraint that \(\cos ^{2}\theta +\sin ^{2}\theta = 1\) gives the rotation and scale parameters. Given these, the translation is obtained by letting \(v = \text{vec}(\mu -\beta X\varGamma )\) in the proof of Result 2.1. \(\square \)
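The substitution makes the criterion an unconstrained quadratic in \((\psi _{1},\psi _{2})\), so the minimiser is a single 2 × 2 linear solve, after which \(\beta\) and \(\theta\) are recovered; a sketch with assumed coefficient values:

```python
import numpy as np

# Assumed example coefficients, as in the rotation-only case.
P, Q, R, S, T = 3.0, 1.0, 0.8, -2.0, 0.5

# Gradient of C + P*psi1^2 + Q*psi2^2 + R*psi1*psi2 + S*psi1 + T*psi2
# set to zero gives a 2x2 linear system for (psi1, psi2).
M = np.array([[2 * P, R],
              [R, 2 * Q]])
psi = np.linalg.solve(M, -np.array([S, T]))

beta = np.hypot(psi[0], psi[1])        # scale: beta = sqrt(psi1^2 + psi2^2)
theta = np.arctan2(psi[1], psi[0])     # rotation angle theta
```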
Proof of Result 2.5.
If \(\varSigma = I_{m} \otimes \varSigma _{k}\), then the similarity transformation estimates of Result 2.4 can be simplified. For the translation,
Therefore, \(\hat{\gamma }^{T} = (1_{k}^{T}\varSigma _{k}^{-1}1_{k})^{-1}1_{k}^{T}\varSigma _{k}^{-1}(\mu -\beta X\varGamma )\), which is zero given \(1_{k}^{T}\varSigma _{k}^{-1}X = 0 = 1_{k}^{T}\varSigma _{k}^{-1}\mu\). Referring to the notation of Result 2.2, if \(\varSigma = I_{m} \otimes \varSigma _{k}\) then \(A_{11} = A_{22} = (1_{k}^{T}\varSigma _{k}^{-1}1_{k})^{-1}1_{k}^{T}\varSigma _{k}^{-1}\) and \(A_{12} = A_{21} = 0_{k}^{T}\). Then, from Eq. (9.4), if X and \(\mu\) are located such that \(1_{k}^{T}\varSigma _{k}^{-1}X = 0 = 1_{k}^{T}\varSigma _{k}^{-1}\mu\), then \(\alpha _{i} =\delta _{i} =\zeta _{i} = 0\), for i = 1, 2, and P, Q, R, S and T simplify to
The minimising rotation and scaling can then be obtained and have been derived by Brignell [2]. \(\square \)
Proof of Result 3.1.
Therefore, the minimum of \(G_{\mathrm{CWP}}\) is attained when \(\theta\) is a solution of this last equation. \(\square \)
Proof of Result 3.2.
The log-likelihood of the multivariate normal model, \(\text{vec}(X_{i}) \sim N_{km}(\text{vec}(\mu ),\varSigma )\), where the \(X_{i}\) are shapes invariant under Euclidean similarity transformations, is
Therefore, the MLE of the mean shape is the solution of
Hence, \(\hat{\mu }=\bar{ X} = \frac{1} {n}\sum _{i=1}^{n}(\beta _{ i}X_{i}\varGamma _{i} + 1_{k}\gamma _{i}^{T})\) and
Therefore, minimising \(G_{\mathrm{CWP}}\) is equivalent to maximising \(L(X_{1},\ldots,X_{n};\mu,\varSigma )\). \(\square \)
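This equivalence motivates the usual alternating algorithm: fit each shape to the current mean estimate, then set the mean to the average of the fitted configurations, as in the expression for \(\hat{\mu }\) above. A minimal sketch for the unweighted (identity covariance) case with rotations and translations only (no scaling; the simulated data and all names are assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
k, m, n = 6, 2, 20

def rot(t):
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

# Simulated sample: rotated, shifted, noisy copies of an assumed true mean.
mu_true = rng.standard_normal((k, m))
mu_true -= mu_true.mean(axis=0)
X = [(mu_true + 0.02 * rng.standard_normal((k, m))) @ rot(rng.uniform(-np.pi, np.pi)).T
     + rng.standard_normal(m)
     for _ in range(n)]

# Alternate (i) fitting each shape to the current mean and (ii) averaging the
# fitted configurations, mirroring mu_hat = mean of the transformed shapes.
mu = X[0] - X[0].mean(axis=0)
for _ in range(10):
    fitted = []
    for Xi in X:
        Xc = Xi - Xi.mean(axis=0)             # optimal translation: centring
        U, _, Vt = np.linalg.svd(Xc.T @ mu)   # orthogonal Procrustes rotation
        fitted.append(Xc @ (U @ Vt))
    mu = np.mean(fitted, axis=0)
```

After a few iterations, mu recovers the true mean up to an arbitrary rotation of the common frame.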
Proof of Result 4.1.
Let the m columns of \((I_{m} \otimes 1_{k})\) be \(1_{j}\) for j = 1, …, m and let \(\gamma _{ij}\) be the jth element of the translation vector for shape \(X_{i}\); then the log-likelihood, L, for the multivariate normal model can be written:
Now, \(1_{j}^{T}\varSigma ^{-1} = \frac{\sigma _{j}^{-1}} {\sqrt{k}} 1_{j}^{T}1_{j}1_{j}^{T}\) as all the eigenvectors of \(\varSigma\) are orthogonal to \(1_{j}\) except the one proportional to \(1_{j}\). Therefore,
Given \(X_{i}\) and \(\mu\) are all centred, \(1_{j}^{T}\text{vec}(\beta _{i}X_{i}\varGamma _{i}-\mu ) = 0\), and the maximising translation is clearly \(\gamma _{ij} = 0\) for all i = 1, …, n and j = 1, …, m. \(\square \)
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this chapter
Brignell, C.J., Dryden, I.L., Browne, W.J. (2016). Covariance Weighted Procrustes Analysis. In: Turaga, P., Srivastava, A. (eds) Riemannian Computing in Computer Vision. Springer, Cham. https://doi.org/10.1007/978-3-319-22957-7_9
Print ISBN: 978-3-319-22956-0
Online ISBN: 978-3-319-22957-7