Fusion of Speech, Faces and Text for Person Identification in TV Broadcast

  • Hervé Bredin
  • Johann Poignant
  • Makarand Tapaswi
  • Guillaume Fortier
  • Viet Bac Le
  • Thibault Napoleon
  • Hua Gao
  • Claude Barras
  • Sophie Rosset
  • Laurent Besacier
  • Jakob Verbeek
  • Georges Quénot
  • Frédéric Jurie
  • Hazim Kemal Ekenel
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7585)

Abstract

The Repere challenge is a project aimed at evaluating systems for supervised and unsupervised multimodal recognition of people in TV broadcasts. In this paper, we describe, evaluate and discuss the QCompere consortium submissions to the dry run of the 2012 Repere evaluation campaign. Speaker identification (and face recognition) can be greatly improved when combined with name detection through video optical character recognition. Moreover, we show that unsupervised multimodal person recognition systems can achieve performance nearly as good as supervised monomodal ones (with several hundred identity models).

Keywords

Face Recognition, Speaker Recognition, Face Track, Speaker Model, Person Recognition
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Hervé Bredin (1)
  • Johann Poignant (2)
  • Makarand Tapaswi (3)
  • Guillaume Fortier (4)
  • Viet Bac Le (5)
  • Thibault Napoleon (6)
  • Hua Gao (3)
  • Claude Barras (1)
  • Sophie Rosset (1)
  • Laurent Besacier (2)
  • Jakob Verbeek (4)
  • Georges Quénot (2)
  • Frédéric Jurie (6)
  • Hazim Kemal Ekenel (3)
  1. CNRS-LIMSI UPR 3251, Univ Paris-Sud, Orsay, France
  2. UJF-Grenoble 1 / UPMF-Grenoble 2 / Grenoble INP / CNRS-LIG UMR 5217, Grenoble, France
  3. Karlsruher Institut für Technologie, Karlsruhe, Germany
  4. INRIA Rhône-Alpes, Montbonnot, France
  5. Vocapia Research, Parc Orsay Université, Orsay, France
  6. Université de Caen / GREYC UMR 6072, Caen Cedex, France