Facial expression recognition using \({l}_{p}\)-norm MKL multiclass-SVM

  • Original Paper
  • Published in: Machine Vision and Applications

Abstract

Automatic recognition of facial expressions is an interesting and challenging research topic in the field of pattern recognition due to applications such as human–machine interface design and developmental psychology. Designing classifiers for facial expression recognition with high reliability is a vital step in this research. This paper presents a novel framework for person-independent expression recognition by combining multiple types of facial features via multiple kernel learning (MKL) in multiclass support vector machines (SVM). Existing MKL-based approaches jointly learn the same kernel weights with an \(l_{1}\)-norm constraint for all binary classifiers, whereas our framework learns one kernel weight vector per binary classifier in the multiclass-SVM with \(l_{p}\)-norm constraints \((p \ge 1)\), which considers both sparse and non-sparse kernel combinations within MKL. We studied the effect of the \(l_{p}\)-norm MKL algorithm for learning the kernel weights and empirically evaluated the recognition results for the six basic facial expressions and neutral faces with respect to the value of “\(p\)”. In our experiments, we combined two popular facial feature representations, the histogram of oriented gradients (HOG) and the local binary pattern (LBP) histogram, with two kernel functions, the heavy-tailed radial basis function and the polynomial function. Our experimental results on the CK\(+\), MMI and GEMEP-FERA face databases as well as our theoretical justification show that this framework outperforms the state-of-the-art methods and the SimpleMKL-based multiclass-SVM for facial expression recognition.
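
As a purely illustrative companion to the abstract, the sketch below shows one way the two feature types (HOG and LBP histograms) and the two base kernel families (a heavy-tailed RBF in the spirit of [17] and a polynomial kernel) could be assembled into candidate kernel matrices for an MKL solver. The library calls, parameter values, and helper names are assumptions for illustration and are not the paper's exact settings.

```python
# Hedged sketch (not the paper's code): builds HOG and LBP-histogram features
# and two families of base kernels; all parameter values are illustrative.
import numpy as np
from skimage.feature import hog, local_binary_pattern

def extract_features(face_img):
    """HOG descriptor and uniform-LBP histogram for one grayscale face crop."""
    hog_vec = hog(face_img, orientations=8, pixels_per_cell=(16, 16),
                  cells_per_block=(1, 1))
    lbp = local_binary_pattern(face_img, P=8, R=1, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    return hog_vec, lbp_hist

def heavy_tailed_rbf(X, Z, a=0.5, b=1.0, rho=1.0):
    """Generalized ('heavy-tailed') RBF in the spirit of [17]:
    k(x, z) = exp(-rho * sum_i |x_i^a - z_i^a|^b)."""
    D = np.abs(X[:, None, :] ** a - Z[None, :, :] ** a) ** b
    return np.exp(-rho * D.sum(-1))

def polynomial(X, Z, degree=2):
    """Inhomogeneous polynomial base kernel."""
    return (X @ Z.T + 1.0) ** degree

# Toy usage: base kernel matrices for a small batch of (already cropped) faces.
faces = (np.random.rand(4, 64, 64) * 255).astype(np.uint8)
H, L = zip(*(extract_features(f) for f in faces))
H, L = np.vstack(H), np.vstack(L)
base_kernels = [heavy_tailed_rbf(H, H), polynomial(H, H),
                heavy_tailed_rbf(L, L), polynomial(L, L)]
```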

References

  1. Mehrabian, A., Wiener, M.: Decoding of inconsistent communications. J. Personal. Soc. Psychol. 6(1), 109–114 (1967)

  2. Knapp, M.L., Hall, J.A.: Nonverbal Communication in Human Interaction, 7th edn. Cengage Learning, Wadsworth (2010)

  3. Ekman, P., Friesen, W.: Facial Action Coding System: A Technique for the Measurement of Facial Movement. Consulting Psychologists Press, Palo Alto (1978)

  4. Cornelius, R.R.: Theoretical approaches to emotion. In: SpeechEmotion-2000, pp. 3–10 (2000)

  5. Fasel, B., Luettin, J.: Automatic facial expression analysis: a survey. Pattern Recognit. 36(1), 259–275 (2003)

  6. Zeng, Z., Pantic, M., Roisman, G., Huang, T.: A survey of affect recognition methods: audio, visual, and spontaneous expressions. IEEE Trans. Pattern Anal. Mach. Intell. 31(1), 39–58 (2009)

  7. Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)

  8. Kotsia, I., Pitas, I.: Facial expression recognition in image sequences using geometric deformation features and support vector machines. IEEE Trans. Image Process. 16(1), 172–187 (2007)

  9. Shan, C., Gong, S., McOwan, P.: Facial expression recognition based on local binary patterns: a comprehensive study. Image Vis. Comput. 27(6), 803–816 (2009)

  10. Hsu, C., Lin, C.: A comparison of methods for multiclass support vector machines. IEEE Trans. Neural Netw. 13(2), 415–425 (2002)

  11. Senechal, T., Rapp, V., Salam, H., Seguier, R., Bailly, K., Prevost, L.: Facial action recognition combining heterogeneous features via multikernel learning. IEEE Trans. Syst. Man Cybern. Part B Cybern. 42(4), 993–1005 (2012)

  12. Zhang, X., Mahoor, M.H., Voyles, R.M.: Facial expression recognition using HessianMKL based multiclass-SVM. In: IEEE International Conference on Automatic Face and Gesture Recognition and Workshops (FG’13) (2013)

  13. Lanckriet, G., Cristianini, N., Bartlett, P., Ghaoui, L., Jordan, M.: Learning the kernel matrix with semidefinite programming. J. Mach. Learn. Res. 5, 27–72 (2004)

  14. Hartley, R.I., Zisserman, A.: Multiple View Geometry in Computer Vision, 2nd edn. Cambridge University Press, Cambridge (2004)

  15. Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), vol. 1, pp. 886–893 (2005)

  16. Ojala, T., Pietikäinen, M., Harwood, D.: A comparative study of texture measures with classification based on featured distributions. Pattern Recognit. 29(1), 51–59 (1996)

  17. Chapelle, O., Haffner, P., Vapnik, V.: Support vector machines for histogram-based image classification. IEEE Trans. Neural Netw. 10(5), 1055–1064 (1999)

  18. Scholkopf, B., Smola, A.J.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge (2001)

  19. Rakotomamonjy, A., Bach, F., Canu, S., Grandvalet, Y., et al.: SimpleMKL. J. Mach. Learn. Res. 9, 2491–2521 (2008)

  20. Chapelle, O., Rakotomamonjy, A.: Second order optimization of kernel parameters. In: Proc. of the NIPS Workshop on Kernel Learning: Automatic Selection of Optimal Kernels (2008)

  21. Valstar, M., Jiang, B., Mehu, M., Pantic, M., Scherer, K.: The first facial expression recognition and analysis challenge. In: IEEE International Conference on Automatic Face and Gesture Recognition and Workshops (FG’11), pp. 921–926 (2011)

  22. Zhu, Y., De la Torre, F., Cohn, J., Zhang, Y.: Dynamic cascades with bidirectional bootstrapping for spontaneous facial action unit detection. In: 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops (ACII’09), pp. 1–8 (2009)

  23. Chang, Y., Hu, C., Feris, R., Turk, M.: Manifold based analysis of facial expression. Image Vis. Comput. 24(6), 605–614 (2006)

  24. Cootes, T., Taylor, C., Cooper, D., Graham, J., et al.: Active shape models—their training and application. Comput. Vis. Image Underst. 61(1), 38–59 (1995)

  25. Pantic, M., Rothkrantz, L.: Facial action recognition for facial expression analysis from static face images. IEEE Trans. Syst. Man Cybern. Part B Cybern. 34(3), 1449–1461 (2004)

  26. Pantic, M., Patras, I.: Dynamics of facial expression: recognition of facial actions and their temporal segments from face profile image sequences. IEEE Trans. Syst. Man Cybern. Part B Cybern. 36(2), 433–449 (2006)

  27. Cootes, T., Edwards, G., Taylor, C.: Active appearance models. IEEE Trans. Pattern Anal. Mach. Intell. 23(6), 681–685 (2001)

  28. Sung, J., Kim, D.: Pose-robust facial expression recognition using view-based 2D + 3D AAM. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 38(4), 852–866 (2008)

  29. Cheon, Y., Kim, D.: Natural facial expression recognition using differential-AAM and manifold learning. Pattern Recognit. 42(7), 1340–1350 (2009)

  30. Lyons, M., Budynek, J., Akamatsu, S.: Automatic classification of single facial images. IEEE Trans. Pattern Anal. Mach. Intell. 21(12), 1357–1362 (1999)

  31. Wu, T., Bartlett, M., Movellan, J.: Facial expression recognition using Gabor motion energy filters. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 42–47 (2010)

  32. Ahonen, T., Hadid, A., Pietikäinen, M.: Face recognition with local binary patterns. Comput. Vis.-ECCV 2004, 469–481 (2004)

  33. Liao, S., Fan, W., Chung, A., Yeung, D.: Facial expression recognition using advanced local binary patterns, Tsallis entropies and global appearance features. In: IEEE International Conference on Image Processing, pp. 665–668 (2006)

  34. Almaev, T.R., Valstar, M.F.: Local Gabor binary patterns from three orthogonal planes for automatic facial expression recognition. In: Humaine Association Conference on Affective Computing and Intelligent Interaction (ACII), pp. 356–361 (2013)

  35. Wang, X., Han, T., Yan, S.: A HOG-LBP human detector with partial occlusion handling. In: IEEE 12th International Conference on Computer Vision, pp. 32–39 (2009)

  36. Li, Z., Imai, J., Kaneko, M.: Facial-component-based bag of words and PHOG descriptor for facial expression recognition. In: IEEE International Conference on Systems, Man and Cybernetics (SMC’09), pp. 1353–1358 (2009)

  37. Dahmane, M., Meunier, J.: Emotion recognition using dynamic grid-based HOG features. In: IEEE International Conference on Automatic Face and Gesture Recognition and Workshops (FG’11), pp. 884–888 (2011)

  38. Bartlett, M., Littlewort, G., Frank, M., Lainscsek, C., Fasel, I., Movellan, J.: Recognizing facial expression: machine learning and application to spontaneous behavior. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), vol. 2, pp. 568–573 (2005)

  39. Sebe, N., Lew, M., Sun, Y., Cohen, I., Gevers, T., Huang, T.: Authentic facial expression analysis. Image Vis. Comput. 25(12), 1856–1863 (2007)

  40. Wan, S., Aggarwal, J.: Spontaneous facial expression recognition: a robust metric learning approach. Pattern Recognit. 47(5), 1859–1868 (2014)

  41. Yacoob, Y., Davis, L.: Recognizing human facial expressions from long image sequences using optical flow. IEEE Trans. Pattern Anal. Mach. Intell. 18(6), 636–642 (1996)

  42. Essa, I., Pentland, A.: Coding, analysis, interpretation, and recognition of facial expressions. IEEE Trans. Pattern Anal. Mach. Intell. 19(7), 757–763 (1997)

  43. Cohen, I., Sebe, N., Garg, A., Chen, L., Huang, T.: Facial expression recognition from video sequences: temporal and static modeling. Comput. Vis. Image Underst. 91(1), 160–187 (2003)

  44. Yeasin, M., Bullot, B., Sharma, R.: Recognition of facial expressions and measurement of levels of interest from video. IEEE Trans. Multimed. 8(3), 500–508 (2006)

  45. Zhang, Y., Ji, Q.: Active and dynamic information fusion for facial expression understanding from image sequences. IEEE Trans. Pattern Anal. Mach. Intell. 27(5), 699–714 (2005)

  46. Shan, C., Gong, S., McOwan, P.: Dynamic facial expression recognition using a Bayesian temporal manifold model. In: Proc. BMVC, vol. 1, pp. 297–306 (2006)

  47. Fang, H., Mac Parthaláin, N., Aubrey, A.J., Tam, G.K., Borgo, R., Rosin, P.L., Grant, P.W., Marshall, D., Chen, M.: Facial expression recognition in dynamic sequences: an integrated approach. Pattern Recognit. 47(3), 1271–1281 (2014)

  48. Gönen, M., Alpaydın, E.: Multiple kernel learning algorithms. J. Mach. Learn. Res. 12, 2211–2268 (2011)

  49. Fu, S., Kuai, X., Yang, G.: Multiple kernel active learning for facial expression analysis. Adv. Neural Netw.-ISNN 2011, 381–387 (2011)

  50. Sénéchal, T., Rapp, V., Salam, H., Seguier, R., Bailly, K., Prevost, L.: Facial action recognition combining heterogeneous features via multikernel learning. IEEE Trans. Syst. Man Cybern. Part B Cybern. 42(4), 993–1005 (2012)

  51. Zhang, W., Shan, S., Gao, W., Chen, X., Zhang, H.: Local Gabor binary pattern histogram sequence (LGBPHS): a novel non-statistical model for face representation and recognition. In: Tenth IEEE International Conference on Computer Vision (ICCV’05), vol. 1, pp. 786–791 (2005)

  52. Sonnenburg, S., Rätsch, G., Schäfer, C., Schölkopf, B.: Large scale multiple kernel learning. J. Mach. Learn. Res. 7, 1531–1565 (2006)

  53. Cortes, C., Mohri, M., Rostamizadeh, A.: \(l_{2}\) regularization for learning kernels. In: Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, pp. 109–116 (2009)

  54. Sun, T., Jiao, L., Liu, F., Wang, S., Feng, J.: Selective multiple kernel learning for classification with ensemble strategy. Pattern Recognit. 46(11), 3081–3090 (2013)

  55. Kloft, M., Brefeld, U., Sonnenburg, S., Zien, A.: \(l_{p}\)-norm multiple kernel learning. J. Mach. Learn. Res. 12, 953–997 (2011)

  56. Kloft, M.: \(l_{p}\)-norm multiple kernel learning. Ph.D. dissertation, Berlin Institute of Technology (2011)

  57. Yan, F., Mikolajczyk, K., Kittler, J., Tahir, M.: A comparison of \(l_{1}\)-norm and \(l_{2}\)-norm multiple kernel SVMs in image and video classification. In: Seventh International Workshop on Content-Based Multimedia Indexing (CBMI’09), pp. 7–12 (2009)

  58. Luenberger, D., Ye, Y.: Linear and Nonlinear Programming. Springer, New York (2008)

  59. Nocedal, J., Wright, S.J.: Numerical Optimization. Springer, New York (2000)

  60. Bach, F.R.: Consistency of the group lasso and multiple kernel learning. J. Mach. Learn. Res. 9, 1179–1225 (2008)

  61. Canu, S., Grandvalet, Y., Guigue, V., Rakotomamonjy, A.: SVM and kernel methods Matlab toolbox. Perception Systèmes et Information, INSA de Rouen, Rouen, France (2005)

  62. Lucey, P., Cohn, J.F., Kanade, T., Saragih, J., Ambadar, Z., Matthews, I.: The extended Cohn–Kanade dataset (CK+): a complete dataset for action unit and emotion-specified expression. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 94–101 (2010)

  63. Jiang, B., Valstar, M., Martinez, B., Pantic, M.: A dynamic appearance descriptor approach to facial actions temporal modeling. IEEE Trans. Cybern. 44(2), 161–174 (2014)

  64. Zhang, X., Mahoor, M.H., Nielsen, R.D.: On multi-task learning for facial action unit detection. In: IVCNZ, pp. 202–207 (2013)

  65. Zhang, X., Mahoor, M., Mavadati, S., Cohn, J.: An \(l_{p}\)-norm MTMKL framework for simultaneous detection of multiple facial action units. In: IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1104–1111 (2014)

  66. Turk, M., Pentland, A.: Eigenfaces for recognition. J. Cogn. Neurosci. 3(1), 71–86 (1991)

  67. Gehler, P., Nowozin, S.: On feature combination for multiclass object classification. In: IEEE 12th International Conference on Computer Vision, pp. 221–228 (2009)

  68. Roy, K., Kamel, M.: Facial expression recognition using game theory. In: Artificial Neural Networks in Pattern Recognition, pp. 139–150 (2012)

  69. Jain, S., Hu, C., Aggarwal, J.: Facial expression recognition with temporal modeling of shapes. In: IEEE International Conference on Computer Vision Workshops (ICCV Workshops), pp. 1642–1649 (2011)

  70. Ramirez Rivera, A., Rojas Castillo, J., Chae, O.: Local directional number pattern for face analysis: face and expression recognition. IEEE Trans. Image Process. 22(5), 1740–1752 (2013)

  71. Gu, W., Xiang, C., Venkatesh, Y., Huang, D., Lin, H.: Facial expression recognition using radial encoding of local Gabor features and classifier synthesis. Pattern Recognit. 45(1), 80–91 (2012)

  72. Pantic, M., Valstar, M., Rademaker, R., Maat, L.: Web-based database for facial expression analysis. In: IEEE International Conference on Multimedia and Expo (ICME’05), pp. 5–8 (2005)

  73. Valstar, M., Pantic, M.: Induced disgust, happiness and surprise: an addition to the MMI facial expression database. In: The Workshop Programme, pp. 65–70 (2010)

  74. Xiong, X., De la Torre, F.: Supervised descent method and its applications to face alignment. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 532–539 (2013)

  75. Guo, Y., Zhao, G., Pietikäinen, M.: Dynamic facial expression recognition using longitudinal facial expression atlases. Comput. Vis.-ECCV 2012, 631–644 (2012)

  76. Sánchez, A., Ruiz, J.V., Moreno, A.B., Montemayor, A.S., Hernández, J., Pantrigo, J.J.: Differential optical flow applied to automatic facial expression recognition. Neurocomputing 74(8), 1272–1282 (2011)

  77. Littlewort, G., Bartlett, M.S., Fasel, I., Susskind, J., Movellan, J.: Dynamics of facial expression extracted automatically from video. Image Vis. Comput. 24(6), 615–625 (2006)

  78. Valstar, M.F., Jiang, B., Mehu, M., Pantic, M., Scherer, K.: The first facial expression recognition and analysis challenge. In: IEEE International Conference on Automatic Face and Gesture Recognition and Workshops (FG’11), pp. 921–926 (2011)

  79. Valstar, M.F., Mehu, M., Jiang, B., Pantic, M., Scherer, K.: Meta-analysis of the first facial expression recognition challenge. IEEE Trans. Syst. Man Cybern. Part B Cybern. 42(4), 966–979 (2012)

  80. Tariq, U., Lin, K.-H., Li, Z., Zhou, X., Wang, Z., Le, V., Huang, T.S., Lv, X., Han, T.X.: Emotion recognition from an ensemble of features. In: IEEE International Conference on Automatic Face and Gesture Recognition and Workshops (FG’11), pp. 872–877 (2011)

  81. Yang, S., Bhanu, B.: Facial expression recognition using emotion avatar image. In: IEEE International Conference on Automatic Face and Gesture Recognition and Workshops (FG’11), pp. 866–871 (2011)

  82. Micchelli, C.A., Pontil, M.: Learning the kernel function via regularization. J. Mach. Learn. Res. 6, 1099–1125 (2005)

Acknowledgments

This research is partially supported by Grants BCS-1052781 and IIS-1111568 from the National Science Foundation.

Author information

Corresponding author

Correspondence to Mohammad H. Mahoor.

Appendices

1.1 Appendix A. Description of MKL-based SVM in multiple kernel spaces

To explain the functionality of MKL-based SVM, we use the \(p=1\) case as an example. In this case, the optimized kernel combination weights satisfy the constraint \(\sum _{m=1}^{M}{d_{m}} = 1,\ d_{m} \ge 0\). Given a test example \(x_{0} \in {\mathbb {R}}^{D}\), we rewrite Eq. 5 as follows.

$$\begin{aligned} y_{0}&= {\text {sgn}}\left[ \sum _{i=1}^{N}\sum _{m=1}^{M}{\alpha }_{i}y_{i}d_{m}k_{m}(x_{i},x_{0})+w_{0}\right] \\&= {\text {sgn}}\left[ \sum _{m=1}^{M}d_{m}\underbrace{\left( \sum _{i=1}^{N}{\alpha }_{i}y_{i}k_{m}(x_{i},x_{0})+w_{0}\right) }_{\text {single kernel with single feature}}\right] \end{aligned}$$

In the above equation, the expression under the brace is the discriminant function used for classifying new samples in a canonical binary SVM; note that moving \(w_{0}\) inside the sum over \(m\) is valid because \(\sum _{m=1}^{M}{d_{m}}=1\). In other words, with MKL-based SVM the label of a sample is determined by a weighted sum of the discriminant values obtained in each RKHS, which enhances the discriminative power of the classifier.
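
To make this weighted-summation view concrete, the following minimal sketch (an illustration under stated assumptions, not the authors' implementation) evaluates the \(p=1\) decision rule above: each base kernel contributes an ordinary single-kernel SVM discriminant value, and the predicted label is the sign of their \(d_{m}\)-weighted sum. The kernels, toy data, and parameter names are placeholders.

```python
# Hedged sketch of the p = 1 MKL decision rule from Appendix A; the kernels
# and toy data below are illustrative placeholders.
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """A plain RBF base kernel standing in for any k_m(., .)."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def poly_kernel(X, Z, degree=2):
    """A polynomial base kernel."""
    return (X @ Z.T + 1.0) ** degree

def mkl_decision(x0, X_sv, y_sv, alpha, w0, kernels, d):
    """y0 = sgn( sum_m d_m * ( sum_i alpha_i y_i k_m(x_i, x0) + w0 ) )."""
    x0 = x0.reshape(1, -1)
    per_kernel = [(alpha * y_sv) @ k(X_sv, x0).ravel() + w0 for k in kernels]
    return np.sign(np.dot(d, per_kernel))

# Toy usage with made-up support vectors, dual coefficients, and weights.
rng = np.random.default_rng(0)
X_sv = rng.normal(size=(5, 10))          # support vectors
y_sv = np.array([1, -1, 1, -1, 1.0])     # their labels
alpha = rng.uniform(0, 1, size=5)        # dual coefficients
d = np.array([0.7, 0.3])                 # kernel weights, summing to 1 (p = 1)
y0 = mkl_decision(rng.normal(size=10), X_sv, y_sv, alpha, 0.1,
                  [rbf_kernel, poly_kernel], d)
```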

1.2 Appendix B. Proof of the superiority of MKL-based SVM over canonical binary SVM with single kernel and single type of features

Without loss of generality, our proof is pursued for the case \(1<p<2\). Based on Lemma 26 in [82], we transform the objective function of Eq. 3 into:

$$\begin{aligned} \mathop {\min }_{d,\Vert d\Vert _{r} \le 1} \mathop {\min }_{w,w_{0},\xi } \quad J(d,w,w_{0},\xi ) = \frac{1}{2}\sum _{m=1}^{M}{\frac{\Vert w_{m}\Vert ^{2}_{2}}{d_{m}}}+C\sum _{i=1}^{N}{\xi _{i}} \end{aligned}$$

where \(r=p/(2-p)\).

As described in Sect. 3.3, this convex optimization problem is solved by the two-step method, in which each loop consists of two nested iterations. In the outer iteration, the kernel combination weights are updated with the SVM parameters held fixed, whereas in the inner iteration the canonical SVM problem is solved with the updated kernel combination weights held fixed. Let \(N_{f}\) be the number of features extracted from each sample and \(N_{k}\) the number of kernel functions used in the \(l_{p}\)-norm MKL-based SVM. We denote the kernel combination vector updated in the \(t\)th loop of the two-step method as follows.

$$\begin{aligned} d^{(t)}&= \left[ \underbrace{d_{1}^{(t)}, \ldots , d_{N_{k}}^{(t)}}_{\text {the 1st feature}}, \ldots , \underbrace{d_{(i-1)N_{k}+1}^{(t)}, \ldots , d_{i N_{k}}^{(t)}}_{\text {the } i\text {th feature}}, \ldots , \underbrace{d_{(N_{f}-1)N_{k}+1}^{(t)}, \ldots , d_{N_{f}N_{k}}^{(t)}}_{\text {the } N_{f}\text {th feature}}\right] ^{T} \\ d^{(t)}&\in {\mathbb {R}}_{+}^{N_{f}N_{k}},\quad {\Vert d^{(t)}\Vert }_{r}=1 \end{aligned}$$

In addition, the SVM discriminant hyperplane obtained in the inner iteration of the \(t\)th loop is denoted by \(w^{(t)}\) and \(w_{0}^{(t)}\).
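
Before turning to the comparison with the canonical SVM, the following sketch illustrates one possible realization of this two-step method for a single binary classifier. It assumes the closed-form weight update of [55] for a constraint \(\Vert d\Vert _{q} \le 1\) in the outer iteration (in the transformed problem above, \(q\) corresponds to \(r=p/(2-p)\)) and an off-the-shelf SVM solver for the inner iteration; it is an illustration of the alternation, not necessarily the update rule used in the paper.

```python
# Hedged sketch of the two-step alternation described above (binary case).
# Inner iteration: canonical SVM on the d-weighted kernel sum.
# Outer iteration: closed-form update d_m ∝ ||w_m||_2^(2/(q+1)) for the
# constraint ||d||_q <= 1, renormalized to unit l_q norm (cf. [55]).
import numpy as np
from sklearn.svm import SVC

def two_step_mkl(kernels, y, q=1.5, C=1.0, n_loops=20):
    M = len(kernels)
    d = np.full(M, M ** (-1.0 / q))                 # feasible start, ||d||_q = 1
    for _ in range(n_loops):
        K = sum(dm * Km for dm, Km in zip(d, kernels))
        svm = SVC(C=C, kernel="precomputed").fit(K, y)
        a = np.zeros(len(y))
        a[svm.support_] = svm.dual_coef_.ravel()    # a_i = alpha_i * y_i
        # ||w_m||_2^2 = d_m^2 * a^T K_m a in the m-th feature space
        sq_norms = np.array([dm ** 2 * a @ Km @ a
                             for dm, Km in zip(d, kernels)])
        d = sq_norms ** (1.0 / (q + 1))
        d /= np.linalg.norm(d, ord=q)
        # The objective J is non-increasing over the loops, as formalized in
        # the chain of inequalities proved below.
    return d, svm
```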

For the canonical binary SVM, suppose that the \(i\)th feature is used with the \(j\)th kernel function. The canonical SVM then becomes a special case of the MKL-based SVM framework, and its corresponding kernel combination vector can be written as follows.

$$\begin{aligned} \hat{d} = \left[ \underbrace{0,0, \ldots ,0, \ldots , 0}_{{d_{1} \sim d_{(i-1)N_{k}+j-1}}}, 1, \underbrace{0,0, \ldots ,0, \ldots , 0}_{{d_{(i-1)N_{k}+j+1} \sim d_{N_{f}N_{k}}}}\right] ^{T} \end{aligned}$$

Further, the discriminant hyperplane learned by the canonical SVM is denoted by \({\hat{w}}^{\star }\) and \({\hat{w}}_{0}^{\star }\).

Assuming that, in the first loop of the two-step method, \(d^{(1)}\) is initialized as \(\hat{d}\) in the outer iteration, the inner iteration of the first loop yields \(w^{(1)}={\hat{w}}^{\star }\) and \(w_{0}^{(1)} = {\hat{w}}_{0}^{\star }\). The proof then proceeds as follows:

$$\begin{aligned} {\hat{J}}^{\star }&= J({\hat{d}}, {\hat{w}}^{\star },{\hat{w}}_{0}^{\star }) = J(d^{(1)},w^{(1)},w_{0}^{(1)})\\&\ge J(d^{(2)},w^{(1)},w_{0}^{(1)}) \ge J(d^{(2)},w^{(2)},w_{0}^{(2)}) \\&\ge \cdots \ge J(d^{\star },w^{\star },w_{0}^{\star }) = J^{\star } \end{aligned}$$

where \(J^{\star }\) is the minimum of the objective function attained by the \(l_{p}\)-norm MKL-based SVM at its optimum \(d^{\star },w^{\star },w_{0}^{\star }\), and \({\hat{J}}^{\star }\), attained at \({\hat{w}}^{\star },{\hat{w}}_{0}^{\star }\), is the corresponding minimum for the canonical binary SVM. Each inequality in the chain holds because the outer iteration minimizes \(J\) over \(d\) with the SVM parameters fixed and the inner iteration minimizes \(J\) over \((w,w_{0})\) with \(d\) fixed; hence \({\hat{J}}^{\star } \ge J^{\star }\).

Based on the above justification, we can naturally extend the conclusion to a more general case. That is:

Suppose there exist a non-empty set of base kernel functions \(S\) and a non-empty set of features \(F\). Then, for every non-empty \(S^{\prime } \subseteq S\) and \(F^{\prime } \subseteq F\), we have \(J^{\star }_{S \times F} \le J^{\star }_{S^{\prime } \times F^{\prime }}\), since \(d^{\star }_{S^{\prime } \times F^{\prime }}\) can be seen as a special case of \(d_{S \times F}\) with zero weights on the kernel–feature pairs outside \(S^{\prime } \times F^{\prime }\). The subscripts \(S \times F\) and \(S^{\prime } \times F^{\prime }\) denote the kernels and features in use.

To be more specific, we conclude that MKL-based SVM with multiple kernels and multiple features performs at least as well as MKL-based SVM with multiple kernels and a single feature, or with a single kernel and multiple features.

1.3 Appendix C. Proof of the superiority of our proposed MKL-based multiclass-SVM over the SimpleMKL-based multiclass-SVM

To be consistent with the SimpleMKL-based multiclass-SVM, we set \(p=1\) in our framework. Then, the only difference between the two methods is the way the kernel combination vectors are updated for multiclass classification tasks, as given in Eqs. 6 and 7. The superiority of our proposed MKL framework for multiclass-SVM lies in the fact that its minimized objective function never exceeds the one obtained with the SimpleMKL-based multiclass-SVM. That is, the hyperplanes derived by our method perform at least as well on the training data.
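
The difference can be made concrete with a short sketch: in a one-vs-one decomposition, the proposed framework solves a separate binary MKL problem for every class pair \(u\) and therefore keeps its own weight vector \(d_{u}\), whereas the SimpleMKL-based multiclass-SVM shares a single \(d\) across all pairs. The helper below, which takes any binary \(l_{p}\)-norm MKL solver (e.g., the Appendix B sketch) as an argument, is an illustration with assumed names and signatures, not the authors' code.

```python
# Hedged sketch: per-pair kernel weights in a one-vs-one multiclass MKL-SVM.
# `binary_mkl` is any binary l_p-norm MKL solver (e.g., the Appendix B sketch);
# names and signatures here are illustrative assumptions.
from itertools import combinations
import numpy as np

def train_one_vs_one_mkl(kernels, y, classes, binary_mkl, p=1.0, C=1.0):
    models = {}
    for u in combinations(classes, 2):                # each binary task u
        idx = np.flatnonzero(np.isin(y, u))           # samples of the two classes
        y_u = np.where(y[idx] == u[0], 1, -1)
        K_u = [K[np.ix_(idx, idx)] for K in kernels]  # restricted Gram matrices
        models[u] = binary_mkl(K_u, y_u, p, C)        # its own weight vector d_u
    return models                                     # one (d_u, classifier) per pair
```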

Suppose that \({\hat{L}}^{\star }\) is the optimal value of the objective function in Eq. 7, and \(d_{u}^{\star }\) is the learned optimum for each binary classifier in our framework. \(L^{\star }\) and \(d^{\star }\) are the corresponding notations for the SimpleMKL-based multiclass-SVM in Eq. 6. Our proof is as follows

$$\begin{aligned} {\hat{L}}^{\star } = \sum _{u \in \varPhi }L_{u}(d_{u}^{\star }) \le \sum _{u \in \varPhi }L_{u}({d}^{\star }) = L^{\star } \end{aligned}$$

since \(d_{u}^{\star }\) minimizes \(L_{u}\) over a feasible set that also contains \(d^{\star }\), and therefore \(L_{u}(d_{u}^{\star }) \le L_{u}({d}^{\star })\) for all \(u \in \varPhi \).

Cite this article

Zhang, X., Mahoor, M.H. & Mavadati, S.M. Facial expression recognition using \({l}_{p}\)-norm MKL multiclass-SVM. Machine Vision and Applications 26, 467–483 (2015). https://doi.org/10.1007/s00138-015-0677-y
