Evaluations of Automatic Speaker Classification Systems

  • Alvin F. Martin
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4343)

Abstract

The annual NIST Speaker Recognition Evaluations (SREs) from 1996 to 2006 have been internationally recognized as the leading source of performance evaluation of research systems in the speaker classification field. We discuss how these evaluations have developed and been conducted, and the performance measures used. We consider the key factors that have been studied for their effect on performance, including training and test durations, channel variability, and speaker variability. We examine the extent to which progress has been observed in state-of-the-art performance. We also consider how the technology has changed over the past decade, other evaluations that have been conducted or planned, and where the field may be headed in the future.

Keywords

speaker recognition, speaker detection, speaker classification, speaker identification, speaker evaluation, NIST evaluations, NIST SRE, DET curves

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Alvin F. Martin, National Institute of Standards and Technology, 100 Bureau Drive Stop 8940, Gaithersburg, MD 20899-8940
