Face model fitting on video sequences based on incremental virtual active appearance model

Published in: Multidimensional Systems and Signal Processing

Abstract

It has been shown that a person-specific active appearance model (AAM), built to model the appearance variation of a single person across pose, illumination, and expression, performs substantially better than a generic AAM built to model the appearance variation of many faces. However, it is not practical to build a person-specific AAM before tracking an unseen subject. A virtual person-specific AAM is proposed to tackle this problem. The AAM is constructed from a set of virtual personal images with different poses and expressions, synthesized from the annotated first frame via regression. To preserve personal facial details in the virtual images, a Poisson fusion strategy is designed and applied to the virtual facial images generated via bilinear kernel ridge regression. Furthermore, the AAM subspace is sequentially updated during tracking using the sequential Karhunen-Loeve algorithm, which helps the AAM adapt to variation in the facial context. Experiments show that the proposed virtual person-specific AAM is robust to facial context changes during tracking and outperforms other state-of-the-art AAMs in facial feature tracking accuracy and computation cost.
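The sequential subspace update mentioned above follows the sequential Karhunen-Loeve idea of folding a block of new observations into an existing truncated SVD without recomputing it from scratch. As a rough illustration only (the function name and interface are invented here, and the paper's tracker additionally maintains the appearance mean and may use a forgetting factor), the core basis update can be sketched with NumPy:

```python
import numpy as np

def skl_update(U, S, B, k):
    """One sequential Karhunen-Loeve step: fold a block of new
    observations B (one column per frame) into the subspace spanned
    by U with singular values S, keeping the k strongest directions."""
    proj = U.T @ B                      # coordinates of B in the current basis
    resid = B - U @ proj                # component of B outside the subspace
    Q, R = np.linalg.qr(resid)          # orthonormal basis for the residual
    # Small core matrix combining the old singular values with the new block;
    # its singular values equal those of the full augmented data matrix.
    top = np.hstack([np.diag(S), proj])
    bot = np.hstack([np.zeros((Q.shape[1], S.size)), R])
    K = np.vstack([top, bot])
    Uk, Sk, _ = np.linalg.svd(K, full_matrices=False)
    U_new = np.hstack([U, Q]) @ Uk      # rotate the enlarged basis
    return U_new[:, :k], Sk[:k]
```

In an AAM tracker, each fitted frame's shape-normalised appearance vector would be appended as a column of `B`, and truncating to `k` components keeps the model compact as the sequence grows.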


Notes

  1. In the field of AAM-based facial feature tracking, it is often assumed, when demonstrating the efficiency of tracking algorithms, that the model is initialized by fitting it to manually annotated points in the first frame (Cootes and Taylor 2006; Matthews 2004). In practice, the model can be initialized via a generic AAM, either by placing the mean shape within the face detection window or by warping the AAM base mesh to detected facial feature points such as the eyes, mouth center, and nose tip. Although the generic AAM does not work well over a whole video sequence, it is usually capable of fitting the first frame, which is typically frontal and neutral (Ionita et al. 2011; Saragih et al. 2011).

  2. http://www-prima.inrialpes.fr/FGnet/data/01-TalkingFace/talking_face.html.

  3. The images have been re-annotated following the Multi-PIE 68-point mark-up.
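The detection-window initialisation described in note 1 amounts to scaling and translating the AAM mean shape into the detector's bounding box. A minimal sketch, assuming an origin-centred mean shape and an (x, y, w, h) box (both interface details are assumptions, not the paper's code):

```python
import numpy as np

def init_shape_from_detection(mean_shape, box):
    """Place an AAM mean shape inside a face-detection box.

    mean_shape : (n, 2) array of landmark coordinates centred at the origin.
    box        : (x, y, w, h) rectangle from any face detector.
    Returns the similarity-transformed shape used to start AAM fitting.
    """
    x, y, w, h = box
    span = mean_shape.max(axis=0) - mean_shape.min(axis=0)
    scale = min(w / span[0], h / span[1])          # fit the shape inside the box
    center = np.array([x + w / 2.0, y + h / 2.0])  # box centre in image coordinates
    return mean_shape * scale + center
```

The resulting shape is only a coarse starting point; the AAM search then refines the landmark positions from there.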

References

  • Abdi, H., & Williams, L. (2010). Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics, 2(4), 433–459.


  • Asthana, A., Sanderson, C., Gedeon, T., & Goecke, R. (2009). Learning-based face synthesis for pose-robust recognition from single image. In: BMVC 2009: Proceedings of the British Machine Vision Conference. London, UK.

  • Asthana, A., Saragih, J., Wagner, M., & Goecke, R. (2009). Evaluating AAM fitting methods for facial expression recognition. In: Proceedings of the International Conference on Affective Computing and Intelligent Interaction, pp. 1–8. Amsterdam.

  • Aykut, M., & Ekinci, M. (2013). AAM-based palm segmentation in unrestricted backgrounds and various postures for palmprint recognition. Pattern Recognition Letters, 9(34), 955–962.


  • Baker, S., & Matthews, I. (2004). Lucas-Kanade 20 years on: A unifying framework. International Journal of Computer Vision, 56, 221–255.

  • Bathe, K., & Wilson, E. (1971). Numerical methods in finite element analysis. Englewood Cliffs: Prentice-Hall.

  • Chen, Y., Yu, F., & Ai, C. (2013). Sequential active appearance model based on online instance learning. IEEE Signal Processing Letters, 20(6), 567–570.

  • Chen, S., Tian, Y., Liu, Q., & Metaxas, D. N. (2013). Recognizing expressions from face and body gesture by temporal normalized motion and appearance features. Image and Vision Computing, 31(2), 175–185.

  • Cootes, T. F., Edwards, G. J., & Taylor, C. J. (1998). Active appearance models. European Conference on Computer Vision, 2, 484–498.


  • Cootes, T. F., & Taylor, C. J. (2006). An algorithm for tuning an active appearance model to new data. In: Proceedings of BMVC, pp. 919–928.

  • Du, Z., & Wang, Y. (2009). Dynamic facial expression synthesis by active appearance model. Journal of Computer-Aided Design and Computer Graphics, 21(11), 1590–1594.


  • Goodall, C. (1991). Procrustes methods in the statistical analysis of shape. Journal of the Royal Statistical Society B, 53(2), 285–339.


  • Gross, R., Matthews, I., & Baker, S. (2005). Generic vs. person specific active appearance models. Image and Vision Computing, 23(11), 1080–1093.


  • Huang, D., & De la Torre, F. (2010). Bilinear kernel reduced rank regression for facial expression synthesis. In: ECCV 2010: Proceedings of European Conference on Computer Vision. Crete, Greece.

  • Ionita, M.C., Tresadern, P.A., & Cootes, T.F. (2011). Real time feature point tracking with automatic model selection. In: IEEE International Conference on Computer Vision Workshops, pp. 453–460.

  • Levy, A., & Lindenbaum, M. (2000). Sequential Karhunen-Loeve basis extraction and its application to images. IEEE Transactions on Image Processing, 9(8), 1371–1374.

  • Li, K., Xu, F., Wang, J., Dai, Q., & Liu, Y. (2012). A data-driven approach for facial expression synthesis in video. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 57–64.

  • Liu, X. (2007). Generic face alignment using boosted appearance model. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8.

  • Liu, X. (2009). Discriminative face alignment. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(11), 1941–1948.


  • Liu, X. (2010). Video-based face model fitting using adaptive active appearance model. Image and Vision Computing, 28, 1162–1172.


  • Matthews, I. (2004). The template update problem. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26, 810–815.


  • Matthews, I., & Baker, S. (2004). Active appearance models revisited. International Journal of Computer Vision, 60, 135–164.


  • Murtaza, M., Sharif, M., Raza, M., & Shah, J. H. (2013). Analysis of face recognition under varying facial expression: A survey. The International Arab Journal of Information Technology, 10(4), 378–388.


  • Ross, D. A., Lim, J., Lin, R. S., & Yang, M. H. (2008). Incremental learning for robust visual tracking. International Journal of Computer Vision, 77, 125–141.


  • Saragih, J. M., Lucey, S., & Cohn, J. F. (2011). Deformable model fitting by regularized landmark mean-shift. International Journal of Computer Vision, 91, 200–215.


  • Sung, J., & Kim, D. (2009). Adaptive active appearance model with incremental learning. Pattern Recognition Letters, 30(4), 359–367.


  • Van der Maaten, L., & Hendriks, E. (2010). Capturing appearance variation in active appearance models. In: Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp. 34–41. Los Alamitos: IEEE Computer Society Press.

  • Yu, H., Garrod, O. G. B., & Schyns, P. G. (2012). Perception-driven facial expression synthesis. Computers and Graphics, 36(3), 152–162.


  • Zhao, X., Shan, S., Chai, X., & Chen, X. (2013). Locality-constrained active appearance model. In: Computer Vision-ACCV 2012, Lecture Notes in Computer Science, vol. 7724, pp. 636–647.

  • Zhong, B., Yao, H., Chen, S., Ji, R., Chin, T. J., & Wang, H. (2014). Visual tracking via weakly supervised learning from multiple imperfect oracles. Pattern Recognition, 47(3), 1395–1410.



Author information


Corresponding author

Correspondence to Ying Chen.

Additional information

This work was supported by the National Natural Science Foundation of China (61104213) and the Natural Science Foundation of Jiangsu Province (BK2011146).


About this article


Cite this article

Chen, Y., Hua, C. & Guo, X. Face model fitting on video sequences based on incremental virtual active appearance model. Multidim Syst Sign Process 28, 1–21 (2017). https://doi.org/10.1007/s11045-015-0326-7
