Abstract
In a highly unpredictable environment, a speaker verification system needs a good background model to carry out the verification task reliably. In this paper, a 1024-component UBM is created by pooling a noisy-speech UBM and a clean-speech UBM. This pooled UBM is used for speaker adaptation as well as for speaker testing. Experimental results show a minor improvement with the pooled UBM over the baseline UBM. In addition, a score-level solution is proposed: cohort model selection with HT-normalization, which reduces undesirable variation arising from acoustically mismatched devices and environments. For cohort selection, a simple distance metric based on similarity modeling of each client speaker is used. The normalization parameters, computed over a group of speakers (a cohort) that share some common characteristics, are used in the final score calculation. Experiments on a noisy corpus show reasonable improvements in performance when the normalization parameters are taken from a cohort rather than from a general group. Experiments show recognition rates of 90.58% and 87.64% for the matched handset type in office and roadside environments, respectively.
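The two ideas in the abstract, pooling two UBMs and normalizing a score with statistics from a selected cohort, can be illustrated with a minimal sketch. It is not the paper's implementation. The abstract does not specify the pooling rule, the distance metric, or the exact normalization formula, so the sketch assumes the following: pooling concatenates the components of the two models and rescales their weights, cohort selection uses Euclidean distance between hypothetical fixed-length speaker vectors, and normalization is a plain mean and standard-deviation scaling of the test score. Handset matching, which HT-normalization would require, is not modeled. All function names and values are hypothetical.

```python
import math
import statistics


def pool_weights(weights_a, weights_b, share_a=0.5):
    """Mixture weights of a pooled GMM built by concatenating two GMMs.

    Assumption: components are kept unchanged and each source model's
    weights are rescaled so that the pooled weights still sum to 1.
    """
    scaled_a = [share_a * w for w in weights_a]
    scaled_b = [(1.0 - share_a) * w for w in weights_b]
    return scaled_a + scaled_b


def select_cohort(client_vec, candidate_vecs, size):
    """Return the names of the `size` candidates closest to the client.

    Assumption: every speaker model is summarized by a fixed-length
    vector and closeness is plain Euclidean distance.
    """
    ranked = sorted(
        candidate_vecs,
        key=lambda name: math.dist(client_vec, candidate_vecs[name]),
    )
    return ranked[:size]


def normalize_score(raw_score, cohort_scores):
    """Shift and scale a raw score by the cohort's mean and std."""
    mu = statistics.mean(cohort_scores)
    sigma = statistics.pstdev(cohort_scores)
    if sigma == 0.0:
        # Degenerate cohort: only remove the mean.
        return raw_score - mu
    return (raw_score - mu) / sigma


# Toy example with made-up numbers.
pooled = pool_weights([0.25, 0.75], [0.5, 0.5])

candidates = {"spk_a": [0.0, 0.0], "spk_b": [1.0, 1.0], "spk_c": [5.0, 5.0]}
cohort = select_cohort([0.9, 1.1], candidates, size=2)

# Scores of the same test utterance against each candidate model.
test_scores = {"spk_a": 1.0, "spk_b": 3.0, "spk_c": -4.0}
normalized = normalize_score(4.0, [test_scores[name] for name in cohort])
```

In this toy setup the distant speaker `spk_c` is excluded from the cohort, so its very low score does not inflate the spread used for normalization. That is the intuition behind taking normalization parameters from a cohort rather than from a general group.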
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Das, P. (2018). A Score-Level Solution to Speaker Verification Using UBM Pooling and Adaptive Cohort Selection. In: Kalam, A., Das, S., Sharma, K. (eds) Advances in Electronics, Communication and Computing. Lecture Notes in Electrical Engineering, vol 443. Springer, Singapore. https://doi.org/10.1007/978-981-10-4765-7_49
DOI: https://doi.org/10.1007/978-981-10-4765-7_49
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-4764-0
Online ISBN: 978-981-10-4765-7
eBook Packages: Engineering, Engineering (R0)