Audio-video Features and Fusion

Audio- and Video-based Biometric Person Authentication

Volume 1206 of the series Lecture Notes in Computer Science pp 319-326


Acoustic-labial speaker verification

  • Pierre Jourlin (IDIAP, LIA)
  • Juergen Luettin (IDIAP)
  • Dominique Genoud (IDIAP)
  • Hubert Wassner (IDIAP)


This paper describes a multimodal approach to speaker verification. The system consists of two classifiers, one using visual features and the other using acoustic features. A lip tracker extracts visual information from the speaking face, providing shape and intensity features. We describe an approach for normalizing the different modalities and mapping them onto a common confidence interval. We also describe a novel method for integrating the scores of multiple classifiers. Verification experiments are reported for the individual modalities and for the combined classifier. The integrated system outperformed each sub-system and reduced the false acceptance rate of the acoustic sub-system from 2.3% to 0.5%.
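
The abstract does not spell out how scores are normalized or combined, so the following is only a minimal sketch of the general idea of mapping two classifier scores onto a common confidence interval and fusing them. The logistic mapping, the weighted sum, and every name and parameter (`to_confidence`, `fuse`, `verify`, `acoustic_weight`, `scale`) are illustrative assumptions, not the method of the paper.

```python
import math


def to_confidence(score, threshold, scale=1.0):
    """Map a raw classifier score onto the interval (0, 1).

    Assumption: a logistic curve centred on the classifier's own decision
    threshold, so a score equal to the threshold maps to exactly 0.5.
    """
    return 1.0 / (1.0 + math.exp(-(score - threshold) / scale))


def fuse(acoustic_conf, labial_conf, acoustic_weight=0.5):
    """Combine two confidences with a weighted sum (an assumed fusion rule)."""
    return acoustic_weight * acoustic_conf + (1.0 - acoustic_weight) * labial_conf


def verify(acoustic_score, labial_score,
           acoustic_threshold, labial_threshold,
           acoustic_weight=0.5, decision_threshold=0.5):
    """Accept the identity claim if the fused confidence reaches the threshold."""
    acoustic_conf = to_confidence(acoustic_score, acoustic_threshold)
    labial_conf = to_confidence(labial_score, labial_threshold)
    fused = fuse(acoustic_conf, labial_conf, acoustic_weight)
    return fused >= decision_threshold


def false_acceptance_rate(impostor_decisions):
    """Fraction of impostor trials that were wrongly accepted."""
    return sum(impostor_decisions) / len(impostor_decisions)
```

Because each score is centred on its own threshold before fusion, the two modalities become comparable even when their raw scores live on different scales. In this sketch the weight given to each modality would have to be tuned on held-out data.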