Abstract
Facial expression recognition from video is a challenging task in computer vision, image processing, and pattern recognition. This paper proposes a novel approach to recognizing facial expressions from depth video. Local Directional Deviation-based Binary Pattern (LD2BP) features are first extracted from the depth images and then further processed by Generalized Discriminant Analysis (GDA) to improve their discriminative power. Finally, the time-sequential LD2BP-GDA features are used with Hidden Markov Models (HMMs) for expression training and recognition. The proposed approach outperforms conventional facial expression recognition approaches.
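The final stage of the pipeline, recognition with one HMM per expression, can be illustrated with a minimal sketch. It assumes that each frame's feature vector has already been mapped to a discrete symbol (for example by vector quantization), which the abstract does not specify. The expression labels, the two-state left-to-right topology, and all probabilities are hypothetical toy values, not parameters from the paper. The sketch scores a symbol sequence against each model with the scaled forward algorithm and picks the most likely expression.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (scaled forward algorithm).

    obs   : list of observation symbol indices, one per video frame
    start : start[i]    = P(first state is i)
    trans : trans[i][j] = P(next state is j | current state is i)
    emit  : emit[i][k]  = P(symbol k | state i)
    """
    n = len(start)
    # Forward variables for the first frame.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Rescale to avoid underflow; the log of the scales sums to log P.
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob


def classify(obs, models):
    """Return the label of the model with the highest log-likelihood."""
    return max(models, key=lambda label: log_likelihood(obs, *models[label]))


# Hypothetical toy models: (start, trans, emit) per expression label.
LEFT_TO_RIGHT = [[0.7, 0.3], [0.0, 1.0]]
MODELS = {
    "happy": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]),
    "sad": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.8, 0.1], [0.8, 0.1, 0.1]]),
}

if __name__ == "__main__":
    print(classify([0, 0, 2, 2], MODELS))
    print(classify([1, 1, 0, 0], MODELS))
```

In a real system the model parameters would be estimated from training sequences (for instance with Baum-Welch re-estimation) rather than written by hand, and continuous features would need either a codebook or continuous emission densities.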
Acknowledgments
This work was supported by the Faculty Research Fund, Sungkyunkwan University, 2013.
Cite this article
Uddin, M.Z. A depth video-based facial expression recognition system utilizing generalized local directional deviation-based binary pattern feature discriminant analysis. Multimed Tools Appl 75, 6871–6886 (2016). https://doi.org/10.1007/s11042-015-2614-5
DOI: https://doi.org/10.1007/s11042-015-2614-5