Journal on Multimodal User Interfaces, Volume 1, Issue 1, pp 21–30

Synthesis of expressive facial animations: A multimodal caricatural mirror

  • Olivier Martin
  • Irene Kotsia
  • Ioannis Pitas
  • Arman Savran
  • Jordi Adell
  • Ana Huerta
  • Raphael Sebbe


This paper describes a natural and intuitive way to create expressive facial animations, using a novel approach based on the so-called ‘multimodal caricatural mirror’ (MCM). Taking an audio-visual video sequence of the user’s face as input, the MCM generates a facial animation in which the prosody and the facial expressions of emotion can be either reproduced or amplified. The user can thus simulate an emotion and see, almost instantly, the animation it produces, as with a regular mirror. In addition, the MCM can amplify the emotions in selected parts of the input video sequence while leaving other parts unchanged. It therefore constitutes a novel approach to the design of highly expressive facial animation, since the affective content of the animation can be modified by post-processing operations.
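The processing loop the abstract describes, analysing expression intensity and prosody per frame, then either reproducing or amplifying them on selected segments only, can be illustrated with a minimal sketch. This is a hypothetical toy, not the authors' implementation: the `Frame` fields, the gain values, and the 120 Hz neutral-pitch baseline are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    """One analysed frame: per-emotion expression intensities in [0, 1]
    and a pitch value (Hz) for the accompanying speech."""
    expression: dict  # e.g. {"joy": 0.4, "anger": 0.1}
    pitch_hz: float

def amplify(frame: Frame, gain: float, pitch_gain: float) -> Frame:
    """Scale detected emotion intensities (clamped to [0, 1]) and widen
    pitch excursions around an assumed neutral baseline."""
    baseline = 120.0  # assumed neutral F0 baseline (Hz), illustrative only
    boosted = {emo: min(1.0, v * gain) for emo, v in frame.expression.items()}
    new_pitch = baseline + (frame.pitch_hz - baseline) * pitch_gain
    return Frame(boosted, new_pitch)

def caricature(frames, gain=1.5, pitch_gain=1.3, selected=None):
    """Amplify only the selected frame indices, leaving the rest
    unchanged, mirroring the MCM's selective post-processing."""
    selected = set(range(len(frames))) if selected is None else set(selected)
    return [amplify(f, gain, pitch_gain) if i in selected else f
            for i, f in enumerate(frames)]
```

Selective amplification is what distinguishes the caricatural mirror from a plain mirror: unselected frames pass through untouched, so the affective content can be edited per segment after capture.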


Keywords: Face Analysis · Prosody Analysis · Facial Animation · Spoken Language Processing · Multimodal Interfaces




Copyright information

© OpenInterface Association 2007

Authors and Affiliations

  • Olivier Martin (1)
  • Irene Kotsia (2)
  • Ioannis Pitas (2)
  • Arman Savran (3)
  • Jordi Adell (4)
  • Ana Huerta (5)
  • Raphael Sebbe (6)

  1. TELE Lab, Université catholique de Louvain, Belgium
  2. AIIA Lab, Aristotle University of Thessaloniki, Greece
  3. Electrical and Electronics Engineering Dept., Bogazici University, Turkey
  4. TALP Research Center, Universitat Politècnica de Catalunya, Spain
  5. Speech Technology Group, Technical University of Madrid, Spain
  6. TCTS Lab, Faculté Polytechnique de Mons, Belgium
