
Extrinsic Methods for Coding and Dictionary Learning on Grassmann Manifolds

Published in: International Journal of Computer Vision

Abstract

Sparsity-based representations have recently led to notable results in various visual recognition tasks. In a separate line of research, Riemannian manifolds have been shown useful for dealing with features and models that do not lie in Euclidean spaces. With the aim of building a bridge between the two realms, we address the problem of sparse coding and dictionary learning in Grassmann manifolds, i.e., the space of linear subspaces. To this end, we propose to embed Grassmann manifolds into the space of symmetric matrices by an isometric mapping. This in turn enables us to extend two sparse coding schemes to Grassmann manifolds. Furthermore, we propose an algorithm for learning a Grassmann dictionary, atom by atom. Lastly, to handle non-linearity in data, we extend the proposed Grassmann sparse coding and dictionary learning algorithms through embedding into higher dimensional Hilbert spaces. Experiments on several classification tasks (gender recognition, gesture classification, scene analysis, face recognition, action recognition and dynamic texture classification) show that the proposed approaches achieve considerable improvements in discrimination accuracy, in comparison to state-of-the-art methods such as the kernelized Affine Hull Method and graph-embedding Grassmann discriminant analysis.


Figs. 1–10 (figure images omitted).

Notes

  1. On an abstract Riemannian manifold \({\mathcal {M}}\), the gradient of a smooth real function f at a point \(x \in {\mathcal {M}}\), denoted by \(\mathrm {grad} f(x)\), is the element of \(T_x({\mathcal {M}})\) satisfying \(\langle \mathrm {grad}f(x), \zeta \rangle _x = Df_x[\zeta ]\) for all \(\zeta \in T_x({\mathcal {M}})\). Here, \(Df_x[\zeta ]\) denotes the directional derivative of f at x in the direction of \(\zeta \). The interested reader is referred to Absil et al. (2008) for more details on how the gradient of a function on Grassmann manifolds can be computed.

  2. This is acknowledged by Ho et al. (2013).

  3. Matlab codes are available at https://sites.google.com/site/mehrtashharandi/.
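The defining identity \(\langle \mathrm {grad}f(x), \zeta \rangle _x = Df_x[\zeta ]\) from footnote 1 can be checked numerically. The sketch below (in Python with NumPy; the unit sphere is used as a stand-in Riemannian manifold because its tangent-space projection is simple, and all variable names are ours, not from the paper's code) projects the Euclidean gradient onto the tangent space and compares the resulting inner product with a finite-difference directional derivative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
A = (A + A.T) / 2                      # symmetric, so f below is smooth and simple
f = lambda x: x @ A @ x                # function restricted to the unit sphere S^{n-1}

x = rng.standard_normal(n)
x /= np.linalg.norm(x)                 # a point on the sphere
z = rng.standard_normal(n)
z -= (z @ x) * x                       # a tangent vector at x (z is orthogonal to x)

# Riemannian gradient = tangent-space projection of the Euclidean gradient 2*A*x.
grad = 2 * A @ x - 2 * (x @ A @ x) * x

# Directional derivative along a curve c(t) on the sphere with c(0) = x, c'(0) = z.
c = lambda t: (x + t * z) / np.linalg.norm(x + t * z)
h = 1e-6
Df = (f(c(h)) - f(c(-h))) / (2 * h)

assert abs(grad @ z - Df) < 1e-4       # <grad f(x), z>_x = Df_x[z]
```

On the sphere the ambient (Euclidean) inner product restricted to the tangent space serves as the Riemannian metric, which is why a plain dot product suffices on the last line.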

References

  • Absil, P.-A., Mahony, R., & Sepulchre, R. (2004). Riemannian geometry of Grassmann manifolds with a view on algorithmic computation. Acta Applicandae Mathematica, 80(2), 199–220.

  • Absil, P.-A., Mahony, R., & Sepulchre, R. (2008). Optimization algorithms on matrix manifolds. Princeton: Princeton University Press.

  • Aharon, M., Elad, M., & Bruckstein, A. (2006). K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 54(11), 4311–4322.

  • Arsigny, V., Fillard, P., Pennec, X., & Ayache, N. (2006). Log-Euclidean metrics for fast and simple calculus on diffusion tensors. Magnetic Resonance in Medicine, 56(2), 411–421.

  • Basri, R., & Jacobs, D. W. (2003). Lambertian reflectance and linear subspaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(2), 218–233.

  • Begelfor, E., & Werman, M. (2006). Affine invariance revisited. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 2087–2094).

  • Candès, E. J., Romberg, J., & Tao, T. (2006). Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on Information Theory, 52(2), 489–509.

  • Cetingul, H. E., & Vidal, R. (2009). Intrinsic mean shift for clustering on Stiefel and Grassmann manifolds. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 1896–1902).

  • Cetingul, H. E., & Vidal, R. (2011). Sparse Riemannian manifold clustering for HARDI segmentation. In IEEE International Symposium on Biomedical Imaging: From Nano to Macro (pp. 1750–1753).

  • Cetingul, H. E., Wright, M. J., Thompson, P. M., & Vidal, R. (2014). Segmentation of high angular resolution diffusion MRI using sparse Riemannian manifold clustering. IEEE Transactions on Medical Imaging, 33(2), 301–317.

  • Cevikalp, H., & Triggs, B. (2010). Face recognition based on image sets. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 2567–2573).

  • Chan, A. B., & Vasconcelos, N. (2005). Probabilistic kernels for the classification of auto-regressive visual processes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 846–851).

  • Chen, S., Sanderson, C., Harandi, M., & Lovell, B. C. (2013). Improved image set classification via joint sparse approximated nearest subspaces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 452–459).

  • Chikuse, Y. (2003). Statistics on special manifolds (Vol. 174). New York: Springer.

  • Cock, K. D., & Moor, B. D. (2002). Subspace angles between ARMA models. Systems and Control Letters, 46, 265–270.

  • Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 886–893).

  • Donoho, D. L. (2006). Compressed sensing. IEEE Transactions on Information Theory, 52(4), 1289–1306.

  • Doretto, G., Chiuso, A., Wu, Y. N., & Soatto, S. (2003). Dynamic textures. International Journal of Computer Vision, 51, 91–109.

  • Elad, M. (2010). Sparse and redundant representations—From theory to applications in signal and image processing. New York: Springer.

  • Elhamifar, E., & Vidal, R. (2013). Sparse subspace clustering: Algorithm, theory, and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(11), 2765–2781.

  • Gallivan, K. A., Srivastava, A., Liu, X., & Van Dooren, P. (2003). Efficient algorithms for inferences on Grassmann manifolds. In IEEE Workshop on Statistical Signal Processing (pp. 315–318).

  • Ghanem, B., & Ahuja, N. (2010). Maximum margin distance learning for dynamic texture recognition. In Proceedings of the European Conference on Computer Vision (ECCV) (Vol. 6312, pp. 223–236).

  • Goh, A., & Vidal, R. (2008). Clustering and dimensionality reduction on Riemannian manifolds. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 1–7).

  • Golub, G. H., & Van Loan, C. F. (1996). Matrix computations (3rd ed.). Baltimore: Johns Hopkins University Press.

  • Gong, B., Shi, Y., Sha, F., & Grauman, K. (2012). Geodesic flow kernel for unsupervised domain adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 2066–2073).

  • Gopalan, R., Li, R., & Chellappa, R. (2014). Unsupervised adaptation across domain shifts by generating intermediate data representations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(11), 2288–2302.

  • Guo, K., Ishwar, P., & Konrad, J. (2013). Action recognition from video using feature covariance matrices. IEEE Transactions on Image Processing, 22(6), 2479–2494.

  • Hamm, J., & Lee, D. D. (2008). Grassmann discriminant analysis: A unifying view on subspace-based learning. In Proceedings of the International Conference on Machine Learning (ICML) (pp. 376–383).

  • Harandi, M., Sanderson, C., Shen, C., & Lovell, B. C. (2013). Dictionary learning and sparse coding on Grassmann manifolds: An extrinsic solution. In Proceedings of the International Conference on Computer Vision (ICCV).

  • Harandi, M. T., Hartley, R., Lovell, B. C., & Sanderson, C. (2015). Sparse coding on symmetric positive definite manifolds using Bregman divergences. IEEE Transactions on Neural Networks and Learning Systems, PP(99), 1–1.

  • Harandi, M. T., Sanderson, C., Shirazi, S., & Lovell, B. C. (2011). Graph embedding discriminant analysis on Grassmannian manifolds for improved image set matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 2705–2712).

  • Hartley, R., Trumpf, J., Dai, Y., & Li, H. (2013). Rotation averaging. International Journal of Computer Vision, 103(3), 267–305.

  • Helmke, U., Hüper, K., & Trumpf, J. (2007). Newton's method on Grassmann manifolds. Preprint: arXiv:0709.2205.

  • Ho, J., Xie, Y., & Vemuri, B. (2013). On a nonlinear generalization of sparse coding and dictionary learning. In Proceedings of the International Conference on Machine Learning (ICML) (pp. 1480–1488).

  • Karcher, H. (1977). Riemannian center of mass and mollifier smoothing. Communications on Pure and Applied Mathematics, 30(5), 509–541.

  • Kim, M., Kumar, S., Pavlovic, V., & Rowley, H. (2008). Face tracking and recognition with visual constraints in real-world videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 1–8).

  • Kim, T.-K., & Cipolla, R. (2009). Canonical correlation analysis of video volume tensors for action categorization and detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(8), 1415–1428.

  • Kim, T.-K., Kittler, J., & Cipolla, R. (2007). Discriminative learning and recognition of image set classes using canonical correlations. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6), 1005–1018.

  • Kokiopoulou, E., Chen, J., & Saad, Y. (2011). Trace optimization and eigenproblems in dimension reduction methods. Numerical Linear Algebra with Applications, 18(3), 565–602.

  • Lee, J. M. (2012). Introduction to smooth manifolds (Vol. 218). New York: Springer.

  • Li, B., Ayazoglu, M., Mao, T., Camps, O. I., & Sznaier, M. (2011). Activity recognition using dynamic subspace angles. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 3193–3200).

  • Lui, Y. M. (2012). Human gesture recognition on product manifolds. Journal of Machine Learning Research, 13, 3297–3321.

  • Mairal, J., Bach, F., & Ponce, J. (2012). Task-driven dictionary learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(4), 791–804.

  • Mairal, J., Bach, F., Ponce, J., & Sapiro, G. (2010). Online learning for matrix factorization and sparse coding. Journal of Machine Learning Research, 11, 19–60.

  • Mairal, J., Bach, F., Ponce, J., Sapiro, G., & Zisserman, A. (2008). Discriminative learned dictionaries for local image analysis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 1–8).

  • Mairal, J., Elad, M., & Sapiro, G. (2008). Sparse representation for color image restoration. IEEE Transactions on Image Processing, 17(1), 53–69.

  • Manton, J. H. (2004). A globally convergent numerical algorithm for computing the centre of mass on compact Lie groups. In International Conference on Control, Automation, Robotics and Vision (Vol. 3, pp. 2211–2216).

  • Ojala, T., Pietikäinen, M., & Mäenpää, T. (2002). Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24, 971–987.

  • Olshausen, B. A., & Field, D. J. (1996). Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381(6583), 607–609.

  • Ramamoorthi, R. (2002). Analytic PCA construction for theoretical analysis of lighting variability in images of a Lambertian object. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(10), 1322–1333.

  • Rao, S. R., Tron, R., Vidal, R., & Ma, Y. (2008). Motion segmentation via robust subspace separation in the presence of outlying, incomplete, or corrupted trajectories. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 1–8).

  • Ravichandran, A., Favaro, P., & Vidal, R. (2011). A unified approach to segmentation and categorization of dynamic textures. In Proceedings of the Asian Conference on Computer Vision (ACCV) (pp. 425–438). Springer.

  • Roweis, S. T., & Saul, L. K. (2000). Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500), 2323–2326.

  • Sanderson, C., Harandi, M. T., Wong, Y., & Lovell, B. C. (2012). Combined learning of salient local descriptors and distance metrics for image set face verification. In Proceedings of the International Conference on Advanced Video and Signal-Based Surveillance (pp. 294–299).

  • Sankaranarayanan, A., Turaga, P., Baraniuk, R., & Chellappa, R. (2010). Compressive acquisition of dynamic scenes. In Proceedings of the European Conference on Computer Vision (ECCV) (Vol. 6311, pp. 129–142).

  • Shawe-Taylor, J., & Cristianini, N. (2004). Kernel methods for pattern analysis. Cambridge: Cambridge University Press.

  • Shirazi, S., Sanderson, C., McCool, C., & Harandi, M. T. (2015). Bags of affine subspaces for robust object tracking. Preprint: arXiv:1408.2313.

  • Srivastava, A., & Klassen, E. (2004). Bayesian and geometric subspace tracking. Advances in Applied Probability, 36(1), 43–56.

  • Subbarao, R., & Meer, P. (2009). Nonlinear mean shift over Riemannian manifolds. International Journal of Computer Vision, 84(1), 1–20.

  • Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.

  • Turaga, P., Veeraraghavan, A., Srivastava, A., & Chellappa, R. (2011). Statistical computations on Grassmann and Stiefel manifolds for image and video-based recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(11), 2273–2286.

  • Turk, M., & Pentland, A. (1991). Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3(1), 71–86.

  • Vemulapalli, R., Pillai, J. K., & Chellappa, R. (2013). Kernel learning for extrinsic classification of manifold features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 1782–1789).

  • Viola, P., & Jones, M. J. (2004). Robust real-time face detection. International Journal of Computer Vision, 57(2), 137–154.

  • Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., & Gong, Y. (2010). Locality-constrained linear coding for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 3360–3367).

  • Wang, Y., & Mori, G. (2009). Human action recognition by semilatent topic models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(10), 1762–1774.

  • Wikipedia. (2015). Min-max theorem. In Wikipedia, the free encyclopedia. Online; accessed 27 May 2015.

  • Wright, J., Ma, Y., Mairal, J., Sapiro, G., Huang, T. S., & Yan, S. (2010). Sparse representation for computer vision and pattern recognition. Proceedings of the IEEE, 98(6), 1031–1044.

  • Wright, J., Yang, A. Y., Ganesh, A., Sastry, S. S., & Ma, Y. (2009). Robust face recognition via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(2), 210–227.

  • Xu, Y., Quan, Y., Ling, H., & Ji, H. (2011). Dynamic texture classification using dynamic fractal analysis. In Proceedings of the International Conference on Computer Vision (ICCV).

  • Yang, J., Yu, K., Gong, Y., & Huang, T. (2009). Linear spatial pyramid matching using sparse coding for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 1794–1801).

  • Yu, K., & Zhang, T. (2010). Improved local coordinate coding using local tangents. In Proceedings of the International Conference on Machine Learning (ICML) (pp. 1215–1222).

  • Yu, K., Zhang, T., & Gong, Y. (2009). Nonlinear learning using local coordinate coding. In Proceedings of Advances in Neural Information Processing Systems (NIPS) 9 (p. 1).

  • Yu, S., Tan, T., Huang, K., Jia, K., & Wu, X. (2009). A study on gait-based gender classification. IEEE Transactions on Image Processing, 18(8), 1905–1910.

  • Yuan, C., Hu, W., Li, X., Maybank, S., & Luo, G. (2010). Human action recognition under log-Euclidean Riemannian metric. In Proceedings of the Asian Conference on Computer Vision (ACCV), Lecture Notes in Computer Science (Vol. 5994, pp. 343–353). Berlin: Springer.

  • Zhao, G., & Pietikäinen, M. (2007). Dynamic texture recognition using local binary patterns with an application to facial expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6), 915–928.

  • Zheng, S., Zhang, J., Huang, K., He, R., & Tan, T. (2011). Robust view transformation model for gait recognition. In International Conference on Image Processing (ICIP) (pp. 2073–2076).


Acknowledgments

NICTA is funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy, as well as the Australian Research Council through the ICT Centre of Excellence program. This work is funded in part through an ARC Discovery Grant DP130104567. C. Shen’s participation was in part supported by ARC Future Fellowship F120100969.

Author information


Corresponding author

Correspondence to Mehrtash Harandi.

Additional information

Communicated by Julien Mairal, Francis Bach, Michael Elad.

Appendix


In this appendix, we give proofs for the following theorems.

Theorem 2

Let \({\varvec{X}}\) be a \(d \times d\) symmetric matrix with eigenvalue decomposition \({\varvec{X}} = {\varvec{U}} {\varvec{D}} {\varvec{U}}^T\), where \({\varvec{D}}\) contains the eigenvalues \(\lambda _i\) of \({\varvec{X}}\) in descending order. Let \({\varvec{U}}_p\) be the \(d \times p\) matrix consisting of the first p columns of \({\varvec{U}}\). Then \(\widehat{{\varvec{U}}}_p = {\varvec{U}}_p {\varvec{U}}_p^T\) is the closest matrix in \(\mathcal {PG}({p},{d})\) to \({\varvec{X}}\) (under the Frobenius norm).

Proof

Observe that \( \Vert \widehat{{\varvec{V}}} - {\varvec{X}}\Vert _F^2 = \Vert \widehat{{\varvec{V}}} \Vert _F^2 + \Vert {\varvec{X}} \Vert _F^2 - 2 \langle \widehat{{\varvec{V}}}, {\varvec{X}} \rangle . \) Since \(\Vert \widehat{{\varvec{V}}}\Vert _F\) (for \(\widehat{{\varvec{V}}} \in \mathcal {PG}({p},{d})\)) and \(\Vert {\varvec{X}}\Vert _F\) are fixed, minimizing \(\Vert \widehat{{\varvec{V}}} - {\varvec{X}}\Vert _F\) over \(\widehat{{\varvec{V}}} \in \mathcal {PG}({p},{d})\) is the same as maximizing \(\langle \widehat{{\varvec{V}}}, {\varvec{X}} \rangle \). If \(\widehat{{\varvec{V}}} = {\varvec{V}} {\varvec{V}}^T\), we may write \( \langle \widehat{{\varvec{V}}}, {\varvec{X}} \rangle = \mathop {{\mathrm{Tr}}}\nolimits ( {\varvec{V}} {\varvec{V}}^T {\varvec{X}}) = \mathop {{\mathrm{Tr}}}\nolimits ({\varvec{V}}^T {\varvec{X}} {\varvec{V}}), \) so it suffices to maximize \(\mathop {{\mathrm{Tr}}}\nolimits ({\varvec{V}}^T {\varvec{X}} {\varvec{V}})\) over \({\varvec{V}} \in \mathcal {G}({p},{d})\).

If \({\varvec{X}} = {\varvec{U}} \mathrm{diag}(\lambda _1, \ldots , \lambda _d) {\varvec{U}}^T\), then \({\varvec{U}}_p^T {\varvec{X}} {\varvec{U}}_p = \mathrm{diag}(\lambda _1, \ldots , \lambda _p)\) and \(\mathop {{\mathrm{Tr}}}\nolimits ({\varvec{U}}_p^T {\varvec{X}} {\varvec{U}}_p) = \sum _{i=1}^p \lambda _i\). On the other hand, let \({\varvec{W}} \in \mathcal {G}({p},{d})\). Then \({\varvec{W}}^T {\varvec{X}} {\varvec{W}}\) is symmetric of dimension \(p\times p\). Let \(\mu _1 \ge \mu _2 \ge \ldots \ge \mu _p\) be its eigenvalues and \({\varvec{a}}_i, \, i=1, \ldots , p\) the corresponding unit eigenvectors. Let \({\varvec{w}}_i = {\varvec{W}} {\varvec{a}}_i\). Then the \({\varvec{w}}_i\) are orthogonal unit vectors, and \({\varvec{w}}_i^T {\varvec{X}} {\varvec{w}}_i = \mu _i\).

For \(k = 1\) to p, let \(A_k\) be the subspace of \({\mathbb {R}}^d\) spanned by \({\varvec{w}}_1, \ldots , {\varvec{w}}_k\) and \(B_k\) be the subspace spanned by the eigenvectors \({\varvec{u}}_k, \ldots , {\varvec{u}}_d\) of \({\varvec{X}}\). Since \(\dim A_k + \dim B_k = k + (d - k + 1) = d + 1 > d\), the subspaces \(A_k\) and \(B_k\) must have non-trivial intersection. Let \({\varvec{v}}\) be a non-zero vector in this intersection, and write \({\varvec{v}} = \sum _{i=1}^k \alpha _i {\varvec{w}}_i = \sum _{i=k}^d \beta _i {\varvec{u}}_i\). Then

$$\begin{aligned} \mu _k \le \frac{\sum _{i=1}^k \alpha _i^2 \mu _i}{\sum _{i=1}^k \alpha _i^2 } = \frac{{\varvec{v}}^T {\varvec{X}} {\varvec{v}}}{{\varvec{v}}^T{\varvec{v}}} = \frac{\sum _{i=k}^d \beta _i^2 \lambda _i}{\sum _{i=k}^d \beta _i^2 } \le \lambda _k ~. \end{aligned}$$
(44)

Therefore \(\mu _k \le \lambda _k\) and \( \mathop {{\mathrm{Tr}}}\nolimits ({\varvec{W}}^T {\varvec{X}} {\varvec{W}}) = \sum _{i=1}^p \mu _i \le \sum _{i=1}^p \lambda _i = \mathop {{\mathrm{Tr}}}\nolimits ({\varvec{U}}_p^T {\varvec{X}} {\varvec{U}}_p) ~. \) \(\square \)

We acknowledge that this theorem is an adaptation of classical results in trace optimization (Kokiopoulou et al. 2011) to the problem of interest in this paper, and that the proof is inspired by the proof of the Courant–Fischer theorem given on Wikipedia (Wikipedia 2015).
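As a sanity check, Theorem 2 can be verified numerically. The sketch below (in Python with NumPy; all names are ours, not from the paper's Matlab code) projects a random symmetric matrix onto \(\mathcal {PG}({p},{d})\) via its top-\(p\) eigenvectors and confirms that no randomly drawn projector \({\varvec{V}} {\varvec{V}}^T\) comes closer in Frobenius norm.

```python
import numpy as np

def proj_pg(X, p):
    """Closest point in PG(p, d) to a symmetric X: keep the top-p eigenvectors."""
    w, U = np.linalg.eigh(X)           # eigenvalues in ascending order
    Up = U[:, -p:]                     # eigenvectors of the p largest eigenvalues
    return Up @ Up.T

rng = np.random.default_rng(0)
d, p = 6, 2
A = rng.standard_normal((d, d))
X = (A + A.T) / 2                      # a symmetric test matrix

P_star = proj_pg(X, p)
best = np.linalg.norm(P_star - X, 'fro')

# No random projector V V^T (V with orthonormal columns) should beat P_star.
for _ in range(1000):
    V, _ = np.linalg.qr(rng.standard_normal((d, p)))
    assert np.linalg.norm(V @ V.T - X, 'fro') >= best - 1e-9
```

Note that `np.linalg.eigh` orders eigenvalues ascending, whereas the theorem statement assumes descending order, hence the slice from the end.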

The chordal mean For two points (matrices) \(\widehat{{\varvec{X}}}\) and \(\widehat{{\varvec{Y}}}\) in \(\mathcal {PG}({p},{d})\), the distance \(\Vert \widehat{{\varvec{X}}} - \widehat{{\varvec{Y}}}\Vert _F\) is called the chordal distance between the two points. Given several points \(\widehat{{\varvec{X}}}_i\), the \(\ell _2\) chordal mean of \(\{\widehat{{\varvec{X}}}_i\}_{i=1}^m\) is the element \(\widehat{{\varvec{Y}}} \in \mathcal {PG}({p},{d})\) that minimizes \(\sum _{i=1}^m \Vert \widehat{{\varvec{Y}}} - \widehat{{\varvec{X}}}_i\Vert _F^2\). There is a closed-form solution for the chordal mean of a set of points on a Grassmann manifold.

Theorem 3

The chordal mean of a set of points \(\widehat{{\varvec{X}}}_i \in \mathcal {PG}({p},{d})\) is equal to \( \mathrm{Proj} (\sum _{i=1}^m \widehat{{\varvec{X}}}_i). \)

Proof

The proof is analogous to the formula for the chordal mean of rotation matrices, given in Hartley et al. (2013). By the same argument as in Theorem 2, minimizing \(\sum _{i=1}^m \Vert \widehat{{\varvec{X}}}_i - \widehat{{\varvec{Y}}}\Vert _F^2\) is equivalent to maximizing \(\sum _{i=1}^m \langle \widehat{{\varvec{X}}}_i, \widehat{{\varvec{Y}}} \rangle = \langle \sum _{i=1}^m \widehat{{\varvec{X}}}_i, \widehat{{\varvec{Y}}} \rangle \). Thus, the required \(\widehat{{\varvec{Y}}}\) is the closest point in \(\mathcal {PG}({p},{d})\) to \(\sum _{i=1}^m \widehat{{\varvec{X}}}_i\), as stated. \(\square \)
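Theorem 3 admits the same kind of numerical check. The sketch below (Python with NumPy; illustrative names of our own choosing) forms the chordal mean of a few random points on \(\mathcal {PG}({p},{d})\) as the eigenvector projection of their sum, then confirms that no random candidate achieves a lower sum of squared chordal distances.

```python
import numpy as np

def proj_pg(X, p):
    """Closest point in PG(p, d) to a symmetric X (Theorem 2): top-p eigenvectors."""
    w, U = np.linalg.eigh(X)
    Up = U[:, -p:]
    return Up @ Up.T

rng = np.random.default_rng(1)
d, p, m = 5, 2, 4

# Sample m points on PG(p, d) as projectors V V^T with orthonormal V.
Vs = [np.linalg.qr(rng.standard_normal((d, p)))[0] for _ in range(m)]
Xs = [V @ V.T for V in Vs]

Y_star = proj_pg(sum(Xs), p)           # chordal mean (Theorem 3)
cost = lambda Y: sum(np.linalg.norm(Y - Xi, 'fro') ** 2 for Xi in Xs)

# No random candidate on PG(p, d) should achieve a lower cost.
for _ in range(1000):
    V, _ = np.linalg.qr(rng.standard_normal((d, p)))
    assert cost(V @ V.T) >= cost(Y_star) - 1e-9
```

The sum \(\sum _i \widehat{{\varvec{X}}}_i\) is symmetric but generally not a projector, which is why the final eigenvector projection step is needed.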


About this article


Cite this article

Harandi, M., Hartley, R., Shen, C. et al. Extrinsic Methods for Coding and Dictionary Learning on Grassmann Manifolds. Int J Comput Vis 114, 113–136 (2015). https://doi.org/10.1007/s11263-015-0833-x

