Abstract
In this paper we present a fusion methodology for combining the prompted text-dependent and text-independent operation modalities of speaker verification. Fusion is performed at the score level: scores produced by single-mode GMM-UBM speaker verification engines are combined using several machine learning classification algorithms. To improve performance, we cluster the score-based data before the classification stage. The experimental results indicate that fusing the two operation modes improves speaker verification performance in terms of both sensitivity and specificity, by approximately 2% and 1.5% respectively.
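The pipeline described in the abstract (per-mode scores, clustering of the score vectors, then a classifier that makes the accept/reject decision) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the scores are synthetic Gaussian draws rather than GMM-UBM outputs, the clustering is plain k-means with two clusters, the classifier is logistic regression trained by gradient descent, and all function names are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (assumed clustering step); returns centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centroids)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's centroid as is
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids

def assign(X, centroids):
    """Index of the nearest centroid for every row of X."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

def add_cluster_features(scores, centroids):
    """Append a one-hot cluster indicator to the two-mode score vector."""
    one_hot = np.eye(len(centroids))[assign(scores, centroids)]
    return np.hstack([scores, one_hot])

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Logistic regression by batch gradient descent (assumed fusion classifier)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # add bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict(w, X):
    """1 = accept (target speaker), 0 = reject (impostor)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (Xb @ w > 0).astype(int)

def sensitivity_specificity(y_true, y_pred):
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return tp / (tp + fn), tn / (tn + fp)

def make_trials(n, rng):
    """Synthetic trials: column 0 = text-dependent score, column 1 = text-independent score."""
    y = rng.integers(0, 2, size=n)
    means = np.where(y == 1, 1.5, -1.5)
    scores = means[:, None] + rng.normal(size=(n, 2))
    return scores, y

rng = np.random.default_rng(0)
train_scores, train_y = make_trials(1000, rng)
test_scores, test_y = make_trials(1000, rng)

centroids = kmeans(train_scores, k=2)
w = fit_logistic(add_cluster_features(train_scores, centroids), train_y)
pred = predict(w, add_cluster_features(test_scores, centroids))

sens, spec = sensitivity_specificity(test_y, pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Feeding the cluster membership to the classifier as an extra feature is only one way to use a clustering stage; training a separate classifier per cluster would be another. The abstract does not say which variant the authors used.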
Acknowledgement
This work was partially supported by the H2020 OCTAVE Project, entitled “Objective Control for TAlker VErification”, funded by the EC under Grant Agreement number 647850. The authors would like to thank Dr Md Sahidullah, Dr Nicholas Evans and Dr Tomi Kinnunen for their support in this work.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Mporas, I., Safavi, S., Sotudeh, R. (2016). Improving Robustness of Speaker Verification by Fusion of Prompted Text-Dependent and Text-Independent Operation Modalities. In: Ronzhin, A., Potapova, R., Németh, G. (eds) Speech and Computer. SPECOM 2016. Lecture Notes in Computer Science, vol 9811. Springer, Cham. https://doi.org/10.1007/978-3-319-43958-7_45
DOI: https://doi.org/10.1007/978-3-319-43958-7_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43957-0
Online ISBN: 978-3-319-43958-7
eBook Packages: Computer Science, Computer Science (R0)