Abstract
In this paper, a novel sparse neighborhood preserving non-negative tensor factorization (SNPNTF) algorithm is proposed for facial expression recognition. It is derived from non-negative tensor factorization (NTF) and works in the rank-one tensor space. A sparse constraint is incorporated into the objective function: the optimization takes a step in the direction of the negative gradient and then projects onto the sparsity-constrained space. To exploit the spatial neighborhood structure and the class-based discriminant information, a neighborhood preserving constraint is adopted, based on manifold learning and graph preserving theory. This constraint combines a Laplacian graph, which encodes the spatial information of the face samples, and a penalty graph, which encodes the pre-defined class information. As a result, the parts-based representations obtained by SNPNTF vary smoothly along the geodesics of the data manifold and are more discriminative for recognition. SNPNTF is a quadratic convex function in the tensor space, so it can converge to the optimal solution; the gradient descent method is used for its optimization to ensure the convergence property. Experiments are conducted on the JAFFE database, the Cohn–Kanade database and the AR database. The results demonstrate that SNPNTF provides effective facial representations and achieves better recognition performance than non-negative matrix factorization, NTF and several variant algorithms, and that its convergence is well guaranteed.
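As context for the rank-one tensor space mentioned above, the following minimal sketch (illustrative only, with hypothetical sizes, not the paper's implementation) shows how a non-negative data tensor of face images can be approximated by a sum of rank-one terms \(u_r \circ v_r\) weighted by non-negative coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: m1 x m2 face images, N samples, R rank-one factors.
m1, m2, N, R = 8, 8, 20, 5

# Non-negative factors: U (m1 x R), V (m2 x R), coefficients Z (N x R).
U, V, Z = rng.random((m1, R)), rng.random((m2, R)), rng.random((N, R))

# Each basis image is a rank-one term u_r v_r^T; sample n is approximated
# by the non-negative combination sum_r z_{nr} * (u_r v_r^T).
basis = np.einsum('ir,jr->rij', U, V)       # R x m1 x m2 rank-one bases
recon = np.einsum('nr,rij->nij', Z, basis)  # N reconstructed images

assert recon.shape == (N, m1, m2) and np.all(recon >= 0)
```

Because every factor is non-negative, the reconstruction is non-negative as well, which is what yields the parts-based representations discussed in the abstract.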
Acknowledgments
This work was supported partly by the National Natural Science Foundation of China (61370127, 61472030), the Fundamental Research Funds for the Central Universities (2014JBZ004) and Beijing Higher Education Young Elite Teacher Project (YETP0544).
Ethics declarations
Conflict of interest
We declare that we have no conflicts of interest. We have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, this manuscript.
Appendix
The calculations for Eq. (25)
First, we discuss the calculation of \(\nabla f_{{{\mathbf{V,Z}}}} ({\mathbf{U}}^{(t)} )\) and \(\nabla f_{{{\mathbf{U,Z}}}} ({\mathbf{V}}^{(t)} )\). The objective function of SNPNTF is written as:
The differential of f is
To calculate \(\nabla f_{{{\mathbf{V,Z}}}} ({\mathbf{U}}^{(t)} )\), the differential of f along \(\varvec{u}_{s} (\forall s,1 \le s \le R)\) is
In Eq. (38), the sparseness constraint acts through \(\varepsilon\): the value of the coefficient \(\varepsilon\) controls the degree of sparsity, which confirms the analysis in Sect. 3.2 mathematically.
Similarly, the partial derivative with respect to \(u_{s}^{p} \, (1 \le p \le m_{1})\) is
where the pth element of \(\varvec{e}^{p} \in {\mathbb{R}}^{m_{1}}\) is 1 and all the others are 0, that is, \((\varvec{e}^{p})_{p} = 1\) and \((\varvec{e}^{p})_{k \ne p} = 0\). According to Definition 2.1, for any tensors \({\mathbf{A}}_{1}, {\mathbf{A}}_{2} \in {\mathbb{R}}^{a_{1} \times a_{2} \times \cdots \times a_{n}}\) and \({\mathbf{B}}_{1}, {\mathbf{B}}_{2} \in {\mathbb{R}}^{b_{1} \times b_{2} \times \cdots \times b_{n}}\), it holds that \(\left\langle {{\mathbf{A}}_{1} \otimes {\mathbf{B}}_{1}, {\mathbf{A}}_{2} \otimes {\mathbf{B}}_{2}} \right\rangle = \left\langle {{\mathbf{A}}_{1}, {\mathbf{A}}_{2}} \right\rangle \left\langle {{\mathbf{B}}_{1}, {\mathbf{B}}_{2}} \right\rangle\). Then, Eq. (39) can be written as:
According to Eq. (25), the update rule for \({\text{u}}_{\text{s}}^{\text{p}}\) is
To ensure the non-negativity of \(u_{s}^{p}\), the update step \(\mu (u_{s}^{p})\) is set as:
If the denominator is close to 0, Eq. (42) would lead to unstable results. Therefore, an extra positive additive value, set to 0.01, is added to the denominator. In the remainder of this paper, this additive value is used in all denominators.
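A minimal sketch of this stabilization (hypothetical function names, not the paper's exact update equation): splitting the gradient into non-negative positive and negative parts and choosing the step size from the positive part keeps the factor non-negative, while the additive value guards the denominator.

```python
import numpy as np

EPS = 0.01  # the additive value used in the denominators, as described above

def stabilized_update(u, grad_pos, grad_neg):
    """One non-negativity-preserving update of a factor vector u.

    The gradient is split as grad = grad_pos - grad_neg with both parts
    entry-wise non-negative; choosing the step size mu = u / (grad_pos + EPS)
    turns the gradient step u - mu * grad into a multiplicative-style form
    that keeps u non-negative and avoids division by a near-zero denominator.
    """
    mu = u / (grad_pos + EPS)
    return u - mu * (grad_pos - grad_neg)

u = np.array([0.5, 0.0, 1.2])
new_u = stabilized_update(u,
                          grad_pos=np.array([1.0, 0.0, 2.0]),
                          grad_neg=np.array([0.4, 0.3, 1.5]))
assert np.all(new_u >= 0)
```

Algebraically, the update equals \(u \cdot (\mathrm{grad\_neg} + \mathrm{EPS}) / (\mathrm{grad\_pos} + \mathrm{EPS})\), so non-negativity is preserved for any non-negative inputs.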
Now the update equation for \({\text{u}}_{\text{s}}^{\text{p}}\) is
where \({\mathbf{U}}_{p;} \in {\mathbb{R}}^{1 \times R}\) represents the pth row of \({\mathbf{U}} = [\varvec{u}_{1}, \varvec{u}_{2}, \ldots, \varvec{u}_{R}]\); \(\odot\) is the matrix Hadamard product (i.e., \((X \odot Y)_{ij} = X_{ij} Y_{ij}\)); and \({\mathbf{A}}_{p;;} \in {\mathbb{R}}^{m_{2} \times N}\) represents the matrix obtained by fixing the first mode of \({\mathbf{A}}\) and traversing the other two modes. It is defined as:
According to the analysis above, the update equation for the qth element \(v_{s}^{q}\) of \(\varvec{v}_{s}\) (\(1 \le s \le R\), \(1 \le q \le m_{2}\)) can similarly be written as:
where \({\mathbf{V}}_{q;} \in {\mathbb{R}}^{1 \times R}\) represents the qth row of \({\mathbf{V}} = [\varvec{v}_{1}, \varvec{v}_{2}, \ldots, \varvec{v}_{R}] \in {\mathbb{R}}^{m_{2} \times R}\); \(\odot\) is the matrix Hadamard product; and \({\mathbf{A}}_{;q;} \in {\mathbb{R}}^{m_{1} \times N}\) represents the matrix obtained by fixing the second mode of \({\mathbf{A}}\) and traversing the other two modes. It is defined as:
Now, \(\varvec{u}_{s}\) and \(\varvec{v}_{s}\) are calculated.
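The slice notation \({\mathbf{A}}_{p;;}\) and \({\mathbf{A}}_{;q;}\) used in the two update equations maps directly onto array indexing; a small sketch with hypothetical sizes:

```python
import numpy as np

# Hypothetical third-order data tensor of size m1 x m2 x N.
m1, m2, N = 4, 5, 6
A = np.arange(m1 * m2 * N, dtype=float).reshape(m1, m2, N)

p, q = 1, 2
A_p = A[p, :, :]   # A_{p;;}: fix the first mode,  shape (m2, N)
A_q = A[:, q, :]   # A_{;q;}: fix the second mode, shape (m1, N)

assert A_p.shape == (m2, N)
assert A_q.shape == (m1, N)
```

Each slice fixes one mode and traverses the remaining two, which is exactly how the matrices entering the Hadamard products above are formed.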
Then, we discuss the calculation of \(\nabla f_{{{\mathbf{U,V}}}} ({\mathbf{Z}}^{(t)} )\). The differential of f along \({\mathbf{z}}_{\text{s}} (\forall {\text{s}},1 \le {\text{s}} \le {\text{R}})\) is
For \(\lambda {\mathbf{z}}_{s}^{T} (D - S){\mathbf{z}}_{s} / 2\), the partial derivative with respect to \(z_{s}^{i}\) is
where the ith element of \(\varvec{e}^{i} \in {\mathbb{R}}^{N}\) is 1 and all the others are 0, that is, \((\varvec{e}^{i})_{i} = 1\) and \((\varvec{e}^{i})_{k \ne i} = 0\). Then, the partial derivative with respect to \(z_{s}^{i}\) is:
According to Eq. (25), the update rule for \({\text{z}}_{\text{s}}^{\text{i}}\) is
To ensure the non-negativity, the update step \(\mu (z_{s}^{i})\) is set as:
And the final update equation of \({\text{z}}_{\text{s}}^{\text{i}}\) is
where \({\mathbf{Z}}_{i;} \in {\mathbb{R}}^{1 \times R}\) represents the ith row of \({\mathbf{Z}} = [{\mathbf{z}}_{1}, {\mathbf{z}}_{2}, \ldots, {\mathbf{z}}_{R}]\); \(S_{i;} \in {\mathbb{R}}^{1 \times N}\) and \(S_{i;}^{p} \in {\mathbb{R}}^{1 \times N}\) represent the ith rows of \(S\) and \(S^{p}\), respectively; \(\odot\) is the Hadamard product; and \({\mathbf{A}}_{;;i} \in {\mathbb{R}}^{m_{1} \times m_{2}}\) is defined as:
Now \(\varvec{u}_{s}\), \(\varvec{v}_{s}\) and \(\varvec{z}_{s}\) in the objective function are all solved. Illustrations of \({\mathbf{A}}_{p;;}\), \({\mathbf{A}}_{;q;}\) and \({\mathbf{A}}_{;;i}\) are given in Fig. 11.
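The neighborhood-preserving term \(\lambda {\mathbf{z}}_{s}^{T} (D - S) {\mathbf{z}}_{s} / 2\) differentiated above has gradient \(\lambda (D - S) {\mathbf{z}}_{s}\) when \(S\) is symmetric; a small numerical check, using a hypothetical similarity matrix, confirms this:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical symmetric similarity matrix S over N samples.
N = 6
W = rng.random((N, N))
S = (W + W.T) / 2
D = np.diag(S.sum(axis=1))   # degree matrix
L = D - S                    # graph Laplacian

lam = 0.5
z = rng.random(N)

f = lambda v: 0.5 * lam * v @ L @ v   # the neighborhood-preserving term
grad = lam * L @ z                    # its gradient (valid since L is symmetric)

# Finite-difference check of one coordinate.
h, i = 1e-6, 2
zp = z.copy()
zp[i] += h
assert np.isclose((f(zp) - f(z)) / h, grad[i], atol=1e-4)
```

The same identity underlies both the Laplacian graph term (with \(S\)) and the penalty graph term (with \(S^{p}\)) in the \({\mathbf{Z}}\) update.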
Cite this article
An, G., Liu, S. & Ruan, Q. A sparse neighborhood preserving non-negative tensor factorization algorithm for facial expression recognition. Pattern Anal Applic 20, 453–471 (2017). https://doi.org/10.1007/s10044-015-0507-x