International Journal of Computer Vision, Volume 83, Issue 2, pp 178–194

Pose-Invariant Facial Expression Recognition Using Variable-Intensity Templates

  • Shiro Kumano
  • Kazuhiro Otsuka
  • Junji Yamato
  • Eisaku Maeda
  • Yoichi Sato
Abstract

In this paper, we propose a method for pose-invariant facial expression recognition from monocular video sequences. Unlike existing methods, it describes different facial expressions with a simple model, called the variable-intensity template, which makes it possible to prepare a model for each person with very little time and effort. A variable-intensity template describes how the intensities of multiple points, defined in the vicinity of facial parts, vary with different facial expressions. By using this model in the framework of a particle filter, our method estimates facial poses and expressions simultaneously. In experiments, the method achieves a recognition rate of over 90% for horizontal, vertical, and in-plane facial orientations in the ranges of ±40 degrees, ±20 degrees, and ±40 degrees from the frontal view, respectively.
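
The idea of estimating pose and expression jointly with a particle filter can be illustrated with a toy sketch. The code below is not the authors' implementation: the 1D "image", the integer shift standing in for head pose, the Gaussian likelihood, the random-walk motion model, and all names (`render`, `likelihood`, `particle_filter_step`, `run_demo`) are assumptions made purely for illustration. Each particle carries a pose hypothesis and an expression label, and is weighted by how well the intensities sampled at its hypothesized point locations match the per-expression template.

```python
import numpy as np


def render(shift, expression, templates, base_points, length):
    """Toy 1D 'image': write one expression's template intensities at the shifted points."""
    image = np.zeros(length)
    image[base_points + shift] = templates[expression]
    return image


def likelihood(image, shift, expression, templates, base_points, sigma=0.1):
    """Gaussian likelihood (an assumption) of the intensities sampled under one hypothesis."""
    idx = np.clip(base_points + shift, 0, len(image) - 1)
    residual = image[idx] - templates[expression]
    return np.exp(-0.5 * np.sum(residual ** 2) / sigma ** 2)


def particle_filter_step(shifts, expressions, image, templates, base_points, rng,
                         shift_noise=1, switch_prob=0.1):
    """One predict / weight / resample cycle over joint (pose, expression) particles."""
    n = len(shifts)
    # Predict: random-walk pose, occasional random change of expression label.
    shifts = shifts + rng.integers(-shift_noise, shift_noise + 1, size=n)
    switch = rng.random(n) < switch_prob
    proposals = rng.integers(0, len(templates), size=n)
    expressions = np.where(switch, proposals, expressions)
    # Weight: compare observed intensities with each particle's template.
    weights = np.array([
        likelihood(image, s, e, templates, base_points)
        for s, e in zip(shifts, expressions)
    ])
    weights = weights + 1e-300  # guard against an all-zero weight vector
    weights /= weights.sum()
    # Resample: draw particles in proportion to their weights.
    keep = rng.choice(n, size=n, p=weights)
    return shifts[keep], expressions[keep]


def run_demo(seed=0, n_particles=500, n_steps=20):
    """Recover a hidden (shift, expression) pair from a static toy observation."""
    rng = np.random.default_rng(seed)
    templates = np.array([[0.2, 0.8, 0.5],   # hypothetical expression 0
                          [0.9, 0.1, 0.5]])  # hypothetical expression 1
    base_points = np.array([10, 20, 30])
    image = render(3, 1, templates, base_points, length=50)

    shifts = rng.integers(-5, 6, size=n_particles)
    expressions = rng.integers(0, len(templates), size=n_particles)
    for _ in range(n_steps):
        shifts, expressions = particle_filter_step(
            shifts, expressions, image, templates, base_points, rng)

    pose = int(np.round(shifts.mean()))
    expr = int(np.bincount(expressions, minlength=len(templates)).argmax())
    return pose, expr
```

Calling `run_demo()` returns the estimated shift and the most frequent expression label among the resampled particles. The point of the sketch is only that a single particle set can carry both unknowns, so a pose hypothesis and an expression hypothesis are scored together against the same observation.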

Keywords

  • Facial expression recognition
  • Variable-intensity templates
  • Intensity distribution models
  • Particle filter

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Shiro Kumano (1)
  • Kazuhiro Otsuka (2)
  • Junji Yamato (2)
  • Eisaku Maeda (2)
  • Yoichi Sato (1)

  1. Institute of Industrial Science, The University of Tokyo, Tokyo, Japan
  2. NTT Communication Science Laboratories, NTT, Kanagawa, Japan