Abstract
Procrustes analysis (PA) is a popular technique for aligning 2-D shapes and building statistical models of them. Given a set of 2-D shapes, PA is applied to remove rigid transformations; a non-rigid 2-D model is then computed by modeling the residual (e.g., with PCA). Although PA has been widely used, it has several limitations for modeling 2-D shapes: occluded landmarks and missing data can lead to local minima, and there is no guarantee that the 2-D shapes provide a uniform sampling of the 3-D space of rotations for the object. To address these issues, this paper proposes subspace PA (SPA). Given several instances of a 3-D object, SPA computes the mean and a 2-D subspace that can model rigid and non-rigid deformations of the 3-D object. We propose a discrete (DSPA) and a continuous (CSPA) formulation of SPA, assuming that 3-D samples of an object are provided. DSPA extends traditional PA and produces unbiased 2-D models by uniformly sampling different views of the 3-D object. CSPA provides a continuous approach to uniformly sampling the space of 3-D rotations and is more efficient in space and time. We illustrate the benefits of SPA in two applications. First, SPA is used to learn 2-D face and body models from 3-D datasets. Experiments on the FaceWarehouse and CMU motion capture (MoCap) datasets show the benefits of our 2-D models over state-of-the-art PA approaches and conventional 3-D models. Second, SPA learns an unbiased 2-D model from the CMU MoCap dataset, which is then used to estimate human pose on the Leeds Sports dataset.
Notes
Please refer to Igual et al. (2014) for additional details on uniform/non-uniform sampling of the rotation space and its effect on building biased/unbiased 2-D models.
Bold capital letters denote a matrix \(\mathbf{X}\), bold lower-case letters a column vector \(\mathbf{x}\). \(\mathbf{x}_i\) represents the \(i\mathrm{th}\) column of the matrix \(\mathbf{X}\). \(x_{ij}\) denotes the scalar in the \(i\mathrm{th}\) row and \(j\mathrm{th}\) column of the matrix \(\mathbf{X}\). All non-bold letters represent scalars. \(\mathbf{I}_n \in \mathbb {R}^{n \times n}\) is an identity matrix. \(\Vert \mathbf{x}\Vert _2=\sqrt{\sum _i |x_i|^2}\) and \(\Vert \mathbf{X}\Vert _F=\sqrt{\sum _{ij} x_{ij}^2}\) denote the 2-norm of a vector and the Frobenius norm of a matrix, respectively. \(\mathbf{X}\otimes \mathbf{Y}\) is the Kronecker product of matrices and \(\mathbf{X}^{(p)}\) is the vec-transpose operator, detailed in Appendix 1.
See Appendix 1 for an explanation of the vec-transpose operator.
The code was downloaded from the author’s website (http://isit.u-clermont1.fr/~ab).
The code was downloaded from the author’s website (http://www.research.rutgers.edu/~feiyang/web2/face_morphing.htm).
The code was downloaded from the author’s website (http://www.ics.uci.edu/~dramanan/).
See Appendix 1 for the vec-transpose operator.
References
Andriluka, M., Roth, S., & Schiele, B. (2009). Pictorial structures revisited: People detection and articulated pose estimation. In IEEE computer vision and pattern recognition (CVPR), pp. 1014–1021.
Bartoli, A., Pizarro, D., & Loog, M. (2013). Stratified generalized procrustes analysis. International Journal of Computer Vision, 101(2), 227–253.
Brand, M. (2001). Morphable 3d models from video. In IEEE computer vision and pattern recognition (CVPR), Vol. 2, pp. II–456.
Cao, C., Weng, Y., Zhou, S., Tong, Y., & Zhou, K. (2013). Facewarehouse: A 3d facial expression database for visual computing. IEEE Transactions on Visualization and Computer Graphics, 1(1), 99.
Carnegie Mellon motion capture database. http://mocap.cs.cmu.edu
Cootes, T. F., & Taylor, C. J. (2004). Statistical models of appearance for computer vision.
Cootes, T. F., Edwards, G. J., Taylor, C. J., et al. (2001). Active appearance models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 681–685.
De la Torre, F. (2012). A least-squares framework for component analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(6), 1041–1055.
De la Torre, F., & Black, M. J. (2003). Robust parameterized component analysis: Theory and applications to 2d facial appearance models. Computer Vision and Image Understanding, 91(1), 53–71.
Dryden, I. L., & Mardia, K. V. (1998). Statistical shape analysis (Vol. 4). New York: Wiley.
Frey, B. J., & Jojic, N. (2003). Transformation-invariant clustering using the em algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(1), 1–17.
Goodall, C. (1991). Procrustes methods in the statistical analysis of shape. Journal of the Royal Statistical Society. Series B (Methodological), pp. 285–339.
Gower, J. C., & Dijksterhuis, G. B. (2004). Procrustes problems (Vol. 3). Oxford: Oxford University Press.
Hartley, R., & Zisserman, A. (2003). Multiple view geometry in computer vision. Cambridge: Cambridge University Press.
Igual, L., Perez-Sala, X., Escalera, S., Angulo, C., & De la Torre, F. (2014). Continuous generalized procrustes analysis. Pattern Recognition, 47(2), 659–671.
Jiang, H., Drew, M. S., & Li, Z. N. (2007). Matching by linear programming and successive convexification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6), 959–975.
Johnson, S., & Everingham, M. (2010). Clustered pose and nonlinear appearance models for human pose estimation. In Proceedings of the British machine vision conference. doi:10.5244/C.24.12.
Kokkinos, I., & Yuille, A. (2007). Unsupervised learning of object deformation models. In IEEE international conference on computer vision (ICCV), pp. 1–8.
Korte, B., Lovász, L., & Schrader, R. (1991). Greedoids. Algorithms and Combinatorics. Berlin: Springer.
Learned-Miller, E. G. (2006). Data driven image models through continuous joint alignment. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(2), 236–250.
Li, H., Huang, X., & He, L. (2013). Object matching using a locally affine invariant and linear programming techniques. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(2), 411–424.
Marimont, D. H., & Wandell, B. A. (1992). Linear models of surface and illuminant spectra. JOSA A, 9(11), 1905–1913.
Marques, M., Stosić, M., & Costeira, J. (2009). Subspace matching: Unique solution to point matching with geometric constraints. In IEEE international conference on computer vision (ICCV), pp. 1288–1294.
Matthews, I., Xiao, J., & Baker, S. (2007). 2d vs. 3d deformable face models: Representational power, construction, and real-time fitting. International Journal of Computer Vision, 75(1), 93–113.
Minka, T. P. (2000). Old and new matrix algebra useful for statistics. http://research.microsoft.com/~minka/papers/matrix/.
Naimark, M. A. (1964). Linear representatives of the Lorentz group (translated from Russian). New York: Macmillan.
Park, D., & Ramanan, D. (2011). N-best maximal decoders for part models. In IEEE international conference on computer vision (ICCV), pp. 2627–2634.
Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 2(11), 559–572.
Perez-Sala, X., De la Torre, F., Igual, L., Escalera, S., & Angulo, C. (2014). Subspace procrustes analysis. In ECCV Workshop on ChaLearn Looking at People.
Pishchulin, L., Andriluka, M., Gehler, P., & Schiele, B. (2013a). Poselet conditioned pictorial structures. In IEEE conference on computer vision and pattern recognition (CVPR), pp. 588–595.
Pishchulin, L., Andriluka, M., Gehler, P., & Schiele, B. (2013b). Strong appearance and expressive spatial models for human pose estimation. In IEEE international conference on computer vision (ICCV), pp. 3487–3494.
Pizarro, D., & Bartoli, A. (2011). Global optimization for optimal generalized procrustes analysis. In IEEE conference on computer vision and pattern recognition (CVPR), pp. 2409–2415.
Roig, G., Boix, X., & De la Torre, F. (2009). Optimal feature selection for subspace image matching. In IEEE international conference on computer vision workshops, pp. 200–205.
Torresani, L., Hertzmann, A., & Bregler, C. (2008). Nonrigid structure-from-motion: Estimating shape and motion with hierarchical priors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(5), 878–892.
Xiao, J., Chai, J., & Kanade, T. (2006). A closed-form solution to non-rigid shape and motion recovery. International Journal of Computer Vision, 67(2), 233–246.
Yang, Y., & Ramanan, D. (2013). Articulated human detection with flexible mixtures of parts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(12), 2878–2890.
Yang, F., Shechtman, E., Wang, J., Bourdev, L., & Metaxas, D. (2012). Face morphing using 3d-aware appearance optimization. In Graphics Interface (pp. 93–99). Canadian Information Processing Society.
Yezzi, A. J., & Soatto, S. (2003). Deformotion: Deforming motion, shape average and the joint registration and approximation of structures in images. International Journal of Computer Vision, 53(2), 153–167.
Zhou, F., Brandt, J., & Lin, Z. (2013). Exemplar-based graph matching for robust facial landmark localization. In IEEE international conference on computer vision (ICCV), pp. 1025–1032.
Acknowledgments
This work is partly supported by the Spanish Ministry of Science and Innovation (Projects TIN2012-38416-C03-01, TIN2015-66951-C2-1-R, TIN2013-43478-P), Project 2014 SGR 1219, SUR, Departament d’Economia i Coneixement, and Comissionat per a Universitats i Recerca del Departament d’Innovació, Universitats i Empresa de la Generalitat de Catalunya.
Additional information
Communicated by K. Ikeuchi.
Appendices
Appendix 1: Vec-transpose
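Vec-transpose \(\mathbf{A}^{(p)}\) is a linear operator that generalizes the vectorization and transposition operators (Marimont and Wandell 1992; Minka 2000). It reshapes a matrix \(\mathbf{A}\in \mathbb {R}^{m \times n}\), with m divisible by p, by vectorizing the ith block of p rows, \(\mathbf{A}_i \in \mathbb {R}^{p \times n}\), and placing it as the ith column of the reshaped matrix, such that \(\mathbf{A}^{(p)} \in \mathbb {R}^{pn \times \frac{m}{p}}\):
\[\mathbf{A}^{(p)} = \begin{bmatrix} {{\mathrm{vec}}}(\mathbf{A}_1)&\cdots&{{\mathrm{vec}}}(\mathbf{A}_{m/p}) \end{bmatrix}.\]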
Note that \((\mathbf{A}^{(p)})^{(p)} = \mathbf{A}\) and \(\mathbf{A}^{(m)} = {{\mathrm{vec}}}(\mathbf{A})\). A useful rule for pulling a matrix out of nested Kronecker products is \(((\mathbf{B}\mathbf{A})^{(p)} \mathbf{C})^{(p)} = (\mathbf{C}^T \otimes \mathbf{I}_p)\mathbf{B}\mathbf{A}= (\mathbf{B}^{(p)} \mathbf{C})^{(p)} \mathbf{A}\), which, taking \(\mathbf{A}\) as the identity and \(p = 2\), leads to \((\mathbf{C}^T \otimes \mathbf{I}_2)\mathbf{B}= (\mathbf{B}^{(2)} \mathbf{C})^{(2)}\).
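The operator and the identities above can be checked numerically. The following is a minimal NumPy sketch, not the authors' code: the function name `vec_transpose` and the matrix sizes are illustrative assumptions, and \({{\mathrm{vec}}}(\cdot)\) is taken to stack columns (column-major order).

```python
import numpy as np

def vec_transpose(A, p):
    """Vec-transpose A^(p): vectorize each block of p rows (column-major)
    and place it as a column of the result, shape (p*n, m/p)."""
    A = np.asarray(A)
    m, n = A.shape
    if m % p != 0:
        raise ValueError("number of rows must be divisible by p")
    blocks = A.reshape(m // p, p, n)            # blocks[i] = i-th block of p rows
    # Transposing each block before flattening yields its column-major vec.
    cols = blocks.transpose(0, 2, 1).reshape(m // p, p * n)
    return cols.T

# Illustrative sizes (assumptions): B has p*k rows, C has k rows.
rng = np.random.default_rng(0)
p, k, n, q = 2, 3, 4, 5
B = rng.normal(size=(p * k, n))
C = rng.normal(size=(k, q))

# Rule from the text with A = I: (C^T kron I_p) B = (B^(p) C)^(p)
lhs = np.kron(C.T, np.eye(p)) @ B
rhs = vec_transpose(vec_transpose(B, p) @ C, p)
print(np.allclose(lhs, rhs))
```

Implementing the operator with a reshape and an axis swap avoids an explicit loop over blocks; applying `vec_transpose` twice with the same `p` should return the input matrix.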
Appendix 2: CSPA Formulation
In this Appendix, we detail the steps from Eqs. (21) to (24), as well as the definition of the covariance matrix, introduced in Sect. 3.
Given the value of \(\mathbf{M}^*\) and the optimal expression of \(\mathbf{A}(\varvec{\omega })^*_i\) from Eq. (10), we substitute them into Eq. (21), resulting in:
where \(\mathbf{H}= \mathbf{M}^{*T}(\mathbf{M}^*\mathbf{M}^{*T})^{-1}\mathbf{M}^*\) and \(\mathbf{D}_i \in \mathbb {R}^{3 \times \ell }\). Then,
leads us to Eqs. (23) and (24), where \(\bar{\mathbf{D}}_i = \mathbf{D}_i (\mathbf{I}_\ell - \mathbf{H})\) and \(\bar{\mathbf{d}}_i = {{\mathrm{vec}}}(\bar{\mathbf{D}}_i)\). From Eq. (24), solving \(\frac{{\partial E_{{{\mathrm{CSPA}}}}}}{{\partial \mathbf{c}(\varvec{\omega })_i}} = 0\) we find:
The substitution of \(\mathbf{c}(\varvec{\omega })^*_i\) in Eq. (24) results in:
where:
We can find the global optimum of Eq. (35) by solving the eigenvalue problem \(\varvec{\varSigma }\mathbf{B}= \mathbf{B}\varvec{\varLambda }\), where \(\varvec{\varSigma }\) is the covariance matrix and \(\varvec{\varLambda }\) contains the eigenvalues corresponding to the columns of \(\mathbf{B}\). However, the definite integral in \(\varvec{\varSigma }\) is data dependent. To be able to compute the integral off-line, we need to rearrange the elements in \(\varvec{\varSigma }\). Using the vectorization and vec-transpose operators (see Appendix 1):
which finally leads to:
where the definite integral \(\mathbf{Y}= \int _{\varvec{\varOmega }} \mathbf{P}(\varvec{\omega }) \otimes (\mathbf{I}_\ell \otimes \mathbf{P}(\varvec{\omega })) d\varvec{\omega }\in \mathbb {R}^{4\ell \times 9\ell }\) can be computed off-line.
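Because \(\mathbf{Y}\) does not depend on the data, it can be tabulated once for a given \(\ell\). The sketch below is a Monte Carlo approximation under stated assumptions, not the computation used in the paper: it assumes \(\mathbf{P}(\varvec{\omega })\) consists of the first two rows of a 3-D rotation matrix (orthographic projection), that \(\varvec{\varOmega }\) is the whole rotation group sampled uniformly, and it returns the average over samples, i.e. the integral divided by the volume of \(\varvec{\varOmega }\). The names `random_rotations` and `estimate_Y` are hypothetical.

```python
import numpy as np

def random_rotations(num, rng):
    """Draw `num` rotation matrices uniformly from SO(3) via random unit quaternions."""
    q = rng.normal(size=(num, 4))
    q /= np.linalg.norm(q, axis=1, keepdims=True)
    w, x, y, z = q.T
    R = np.empty((num, 3, 3))
    R[:, 0, 0] = 1 - 2 * (y**2 + z**2)
    R[:, 0, 1] = 2 * (x * y - z * w)
    R[:, 0, 2] = 2 * (x * z + y * w)
    R[:, 1, 0] = 2 * (x * y + z * w)
    R[:, 1, 1] = 1 - 2 * (x**2 + z**2)
    R[:, 1, 2] = 2 * (y * z - x * w)
    R[:, 2, 0] = 2 * (x * z - y * w)
    R[:, 2, 1] = 2 * (y * z + x * w)
    R[:, 2, 2] = 1 - 2 * (x**2 + y**2)
    return R

def estimate_Y(ell, num_samples=2000, seed=0):
    """Monte Carlo average of P kron (I_ell kron P) over uniformly sampled rotations.

    Assumption: P is the first two rows of the rotation matrix (2 x 3).
    The result has shape (4*ell, 9*ell), matching the dimensions of Y in the text.
    """
    rng = np.random.default_rng(seed)
    I = np.eye(ell)
    Y = np.zeros((4 * ell, 9 * ell))
    for R in random_rotations(num_samples, rng):
        P = R[:2, :]
        Y += np.kron(P, np.kron(I, P))
    return Y / num_samples

Y = estimate_Y(ell=2)
print(Y.shape)
```

Since the estimate depends only on \(\ell\) and the sampling of rotations, it can be stored and reused for any dataset with the same number of landmarks.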
Cite this article
Perez-Sala, X., De la Torre, F., Igual, L. et al. Subspace Procrustes Analysis. Int J Comput Vis 121, 327–343 (2017). https://doi.org/10.1007/s11263-016-0938-x
DOI: https://doi.org/10.1007/s11263-016-0938-x