MIT Lincoln Laboratory Multimodal Person Identification System in the CLEAR 2007 Evaluation

  • Kevin Brady
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4625)


This paper describes the MIT Lincoln Laboratory system used in the person identification task of the CLEAR 2007 Evaluation. The task is divided into audio, visual, and multimodal subtasks. The audio identification system combines a Gaussian Mixture Model (GMM) subsystem and a Support Vector Machine (SVM) subsystem, while the visual (face) identification system uses an appearance-based kernel approach. The audio channels, which originate from a microphone array, undergo beamforming and noise preprocessing.
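As a rough illustration of how a GMM subsystem can perform closed-set identification, the sketch below scores a handful of feature frames against one diagonal-covariance GMM per enrolled speaker and picks the model with the highest average log-likelihood. It is not the system described in the paper: the function names (`gmm_log_likelihood`, `identify`), the two-dimensional toy features, and the model parameters are all assumptions made for the example.

```python
import math


def gmm_log_likelihood(frame, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        # log(weight) + log N(frame | mu, diag(var))
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def identify(frames, models):
    """Return the best-scoring identity and the per-identity average scores."""
    scores = {
        name: sum(gmm_log_likelihood(f, *params) for f in frames) / len(frames)
        for name, params in models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical enrolled models: (weights, means, variances) per speaker.
MODELS = {
    "speaker_a": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "speaker_b": ([0.5, 0.5], [[4.0, 4.0], [5.0, 5.0]], [[1.0, 1.0], [1.0, 1.0]]),
}

# Hypothetical test frames lying close to speaker_a's components.
TEST_FRAMES = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]

best, scores = identify(TEST_FRAMES, MODELS)
print(best, scores)
```

In a multimodal setting, per-identity scores of this kind from the audio and face subsystems would typically be normalized and combined before the final decision; the fusion rule is left out here because the abstract does not specify it.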


Gaussian Mixture Model (GMM); Support Vector Machine (SVM); Person Identification; Kernel methods





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Kevin Brady (1)
  1. MIT Lincoln Laboratory, Lexington, Massachusetts, USA
