The Visual Computer, Volume 21, Issue 6, pp 355–372

Mirror MoCap: Automatic and efficient capture of dense 3D facial motion parameters from video

  • I-Chen Lin
  • Ming Ouhyoung
original article


In this paper, we present an automatic and efficient approach to capturing dense facial motion parameters, which extends our previous work on 3D reconstruction from mirror-reflected multiview video. To narrow the search space and rapidly generate lists of 3D candidate positions, we apply mirrored-epipolar bands. For automatic tracking, we exploit the spatial proximity of facial surfaces and temporal coherence to find the best trajectories and to correct missing and falsely tracked markers. More than 300 markers on a subject’s face are tracked from video at a processing speed of 9.2 frames per second (fps) on a regular PC. The estimated 3D facial motion trajectories have been applied to our facial animation system and can also be used for facial motion analysis.
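
The idea of narrowing the search with an epipolar band can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the construction of the mirrored-epipolar bands, so the sketch uses the generic two-view epipolar constraint and assumes the mirror-reflected view is treated as a second (virtual) camera. The fundamental matrix `F`, the band half-width `half_width`, and all function names are hypothetical.

```python
import math

def epipolar_line(F, x):
    """Return line coefficients (a, b, c) of l = F * [x, y, 1]^T.

    F is an assumed 3x3 fundamental matrix (nested lists) relating the
    direct view to the mirror-reflected view; x is a 2D image point.
    """
    px, py = x
    return tuple(row[0] * px + row[1] * py + row[2] for row in F)

def point_line_distance(line, p):
    """Perpendicular distance from 2D point p to the line a*x + b*y + c = 0."""
    a, b, c = line
    norm = math.hypot(a, b)
    if norm == 0.0:
        raise ValueError("degenerate epipolar line")
    return abs(a * p[0] + b * p[1] + c) / norm

def candidates_in_band(F, x, points, half_width):
    """Keep only the points of the other view lying within half_width
    pixels of the epipolar line induced by x (the 'band')."""
    line = epipolar_line(F, x)
    return [p for p in points if point_line_distance(line, p) <= half_width]

# Illustrative setup only: this F corresponds to a purely horizontal
# camera offset, so the epipolar line of (x, y) is the row y' = y.
F_EXAMPLE = [[0.0, 0.0, 0.0],
             [0.0, 0.0, -1.0],
             [0.0, 1.0, 0.0]]

marker = (10.0, 5.0)
detections = [(3.0, 5.5), (7.0, 9.0), (20.0, 4.2), (1.0, 6.5)]
band = candidates_in_band(F_EXAMPLE, marker, detections, half_width=1.0)
print(band)
```

With the example matrix, only the detections whose rows lie within one pixel of the marker's row survive, which shrinks the list of correspondences that a later tracking stage would have to consider.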


Keywords: Efficient Capture; Temporal Coherence; Facial Motion; Candidate Position; Facial Animation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag 2005

Authors and Affiliations

  1. Dept. of Computer and Information Science, National Chiao Tung University, Hsinchu, Taiwan
  2. Dept. of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
