A depth video-based facial expression recognition system utilizing generalized local directional deviation-based binary pattern feature discriminant analysis

Uddin, Md. Zia

doi:10.1007/s11042-015-2614-5

Published: 22 April 2015

Multimedia Tools and Applications, Volume 75, pages 6871–6886 (2016)

Abstract

Facial expression recognition from video data is a challenging task in computer vision, image processing, and pattern recognition. This paper proposes a novel approach to recognizing facial expressions from depth video data. Local Directional Deviation-based Binary Pattern (LD²BP) features are first extracted from the depth images and then processed with Generalized Discriminant Analysis (GDA) to make them more discriminative. Finally, the time-sequential LD²BP-GDA features are used with Hidden Markov Models (HMMs) for expression training and recognition. The proposed approach outperforms conventional facial expression recognition approaches.
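The recognition step with HMMs can be illustrated with a minimal sketch. It assumes, purely for illustration, that each frame's feature vector has already been quantized to a discrete symbol and that one HMM has been trained per expression; the abstract does not state these details. The function names, the two-state toy models, and all probabilities below are hypothetical and are not taken from the paper. A sequence is assigned to the expression whose HMM gives it the highest likelihood, computed with the standard scaled forward algorithm.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Return log P(obs | HMM) using the scaled forward algorithm.

    obs   : list of discrete symbol indices, one per frame (assumed input)
    start : start[i] is the probability of starting in state i
    trans : trans[i][j] is the probability of moving from state i to j
    emit  : emit[i][k] is the probability that state i emits symbol k
    """
    n_states = len(start)
    # Forward variables for the first frame.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs[t]]
                for j in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        # Rescale to avoid underflow and accumulate the log of the scale.
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob


def recognize(obs, models):
    """Pick the expression whose HMM assigns obs the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical two-state, left-to-right toy models over two symbols.
# These numbers are made up for the example, not learned from data.
MODELS = {
    "happy": ([1.0, 0.0], [[0.7, 0.3], [0.0, 1.0]], [[0.9, 0.1], [0.2, 0.8]]),
    "sad": ([1.0, 0.0], [[0.7, 0.3], [0.0, 1.0]], [[0.1, 0.9], [0.8, 0.2]]),
}

if __name__ == "__main__":
    print(recognize([0, 0, 1, 1], MODELS))
    print(recognize([1, 1, 0, 0], MODELS))
```

In a real system the per-expression model parameters would be estimated from training sequences (for example with Baum-Welch), and the symbols would come from the LD²BP-GDA features rather than being hand-written.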

Acknowledgments

This paper was supported by Faculty Research Fund, Sungkyunkwan University, 2013.

Author information

Authors and Affiliations

Department of Computer Education, Sungkyunkwan University, Seoul, Republic of Korea
Md. Zia Uddin

Corresponding author

Correspondence to Md. Zia Uddin.

About this article

Cite this article

Uddin, M.Z. A depth video-based facial expression recognition system utilizing generalized local directional deviation-based binary pattern feature discriminant analysis. Multimed Tools Appl 75, 6871–6886 (2016). https://doi.org/10.1007/s11042-015-2614-5

Received: 10 August 2014
Revised: 01 February 2015
Accepted: 06 April 2015
Published: 22 April 2015
Issue Date: June 2016
DOI: https://doi.org/10.1007/s11042-015-2614-5
