Audio-Visual Identity Verification and Robustness to Imposture

  • Walid Karam
  • Chafic Mokbel
  • Hanna Greige
  • Gérard Chollet
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5558)


The robustness of talking-face identity verification (IV) systems is best evaluated by monitoring their behavior under impostor attacks. We propose a scenario where the impostor uses a still face picture and a sample of speech of the genuine client to transform his/her speech and visual appearance into that of the target client. We propose MixTrans, an original text-independent technique for voice transformation in the cepstral domain, which allows a transformed audio signal to be estimated and reconstructed in the temporal domain. We also propose a face transformation technique that allows a frontal face image of a client to be animated, using principal warps to deform defined MPEG-4 facial feature points based on determined facial animation parameters. The robustness of the talking-face IV system is evaluated under these attacks. Results on the BANCA talking-face database clearly show that such attacks represent a serious challenge and a security threat to IV systems.
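The face transformation above relies on principal warps, that is, thin-plate spline interpolation between landmark sets. The following is a minimal sketch of that interpolation, not the authors' implementation: it fits a 2-D thin-plate spline that carries a set of source feature points onto displaced target points, then warps an arbitrary point. The function names (`tps_kernel`, `solve`, `fit_tps`, `warp_point`) and the landmark coordinates are hypothetical, and the step that derives displacements from MPEG-4 facial animation parameters is not modelled here.

```python
import math

def tps_kernel(r2):
    """Radial basis U(r) = r^2 log(r^2), taking U(0) = 0; r2 is the squared distance."""
    return r2 * math.log(r2) if r2 > 0.0 else 0.0

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting.

    a is an n x n list of rows; b is an n x k list of right-hand-side rows.
    """
    n = len(a)
    m = [row[:] + rhs[:] for row, rhs in zip(a, b)]  # augmented matrix (copies)
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-12:
            raise ValueError("singular system: landmarks must be distinct and not collinear")
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [[v / m[i][i] for v in m[i][n:]] for i in range(n)]

def fit_tps(src, dst):
    """Fit a thin-plate spline mapping each src landmark onto the matching dst landmark."""
    n = len(src)
    size = n + 3
    system = [[0.0] * size for _ in range(size)]
    for i, (xi, yi) in enumerate(src):
        for j, (xj, yj) in enumerate(src):
            system[i][j] = tps_kernel((xi - xj) ** 2 + (yi - yj) ** 2)
        # Affine part: columns/rows for the terms 1, x, y.
        for k, v in enumerate((1.0, float(xi), float(yi))):
            system[i][n + k] = v
            system[n + k][i] = v
    rhs = [[float(px), float(py)] for px, py in dst] + [[0.0, 0.0] for _ in range(3)]
    params = solve(system, rhs)
    return list(src), params[:n], params[n:]  # landmarks, warp weights, affine terms

def warp_point(model, point):
    """Apply a fitted spline to one (x, y) point."""
    src, weights, affine = model
    x, y = point
    out = [affine[0][d] + affine[1][d] * x + affine[2][d] * y for d in range(2)]
    for (xi, yi), w in zip(src, weights):
        u = tps_kernel((x - xi) ** 2 + (y - yi) ** 2)
        out[0] += w[0] * u
        out[1] += w[1] * u
    return tuple(out)

# Hypothetical landmarks: four fixed corners and one central point pushed downward.
source_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
target_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.6)]
model = fit_tps(source_points, target_points)
warped_centre = warp_point(model, (0.5, 0.5))
```

Animating a still face in this way would mean refitting the spline for each frame's landmark displacements and resampling the image through the resulting mapping; the sketch covers only the point mapping.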


Identity verification, audio-visual forgery, talking-face imposture, voice conversion, face animation, biometric verification robustness


Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Walid Karam (1, 2)
  • Chafic Mokbel (1)
  • Hanna Greige (1)
  • Gérard Chollet (2)
  1. University of Balamand, Al-Kurah, Lebanon
  2. CNRS-LTCI, TELECOM ParisTech, Paris, France
