
On Performance Improvement of a Speaker Verification System Using Vector Quantization, Cohorts and Hybrid Cohort-World Models

International Journal of Speech Technology

Abstract

This paper presents distance normalization techniques for improving the performance of a speaker verification system. These techniques provide a dynamic threshold that compensates for trial-to-trial variations and replaces the fixed threshold used in the classical speaker verification approach. Two methods are described: cohort model normalization and a new hybrid cohort-world model normalization. The methods are compared in terms of storage requirements and computational effort. Two algorithms are proposed: one uses existing user models, and the other creates new models. The algorithms were evaluated on the YOHO database and a proprietary database. The results show that, at a constant false acceptance error, false rejection errors are significantly reduced as the cohort size increases. The algorithms also require fewer computational resources than other algorithms, making them more suitable for commercial applications.
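The abstract does not give the normalization formulas, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that each speaker model is a vector quantization codebook, that the utterance score is the average distance from each feature frame to its nearest codeword, and that normalization subtracts a reference distortion from the claimed speaker's distortion. The function names, the `weight` parameter, and the linear mixing of cohort and world distortions in the hybrid variant are all hypothetical choices made for illustration.

```python
import math


def vq_distortion(frames, codebook):
    """Average Euclidean distance from each frame to its nearest codeword."""
    total = 0.0
    for frame in frames:
        total += min(math.dist(frame, codeword) for codeword in codebook)
    return total / len(frames)


def mean_cohort_distortion(frames, cohort_codebooks):
    """Mean distortion of the utterance over all cohort speaker codebooks."""
    distortions = [vq_distortion(frames, cb) for cb in cohort_codebooks]
    return sum(distortions) / len(distortions)


def cohort_normalized_score(frames, claimed_codebook, cohort_codebooks):
    """Claimed-speaker distortion minus the mean cohort distortion (assumed form)."""
    d_claimed = vq_distortion(frames, claimed_codebook)
    return d_claimed - mean_cohort_distortion(frames, cohort_codebooks)


def hybrid_normalized_score(frames, claimed_codebook, cohort_codebooks,
                            world_codebook, weight=0.5):
    """Hypothetical hybrid: the reference mixes cohort and world distortions."""
    d_claimed = vq_distortion(frames, claimed_codebook)
    d_cohort = mean_cohort_distortion(frames, cohort_codebooks)
    d_world = vq_distortion(frames, world_codebook)
    reference = weight * d_cohort + (1.0 - weight) * d_world
    return d_claimed - reference


def accept(score, threshold=0.0):
    """Accept the identity claim when the normalized distortion is low enough."""
    return score < threshold


# Toy one-dimensional data, invented purely for illustration.
frames = [(0.0,), (2.0,)]
claimed = [(0.0,), (2.0,)]
cohort = [[(1.0,)], [(3.0,)]]
world = [(4.0,)]

print(cohort_normalized_score(frames, claimed, cohort))
print(hybrid_normalized_score(frames, claimed, cohort, world))
print(accept(cohort_normalized_score(frames, claimed, cohort)))
```

Because the reference distortion is computed from the same utterance as the claimed speaker's distortion, it tracks the conditions of each trial. In this sketch that is what makes the decision threshold effectively dynamic rather than fixed.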




Cite this article

Burileanu, C., Moraru, D., Bojan, L. et al. On Performance Improvement of a Speaker Verification System Using Vector Quantization, Cohorts and Hybrid Cohort-World Models. International Journal of Speech Technology 5, 247–257 (2002). https://doi.org/10.1023/A:1020244924468

  • DOI: https://doi.org/10.1023/A:1020244924468
