Abstract
Speaker verification is based on variations in formant frequencies in the stationary fragments and transient processes of vowels, on the spectral features of fricative sounds, and on the durations of speech segments. For each word in a fixed list of Russian numerals from zero to nine, the best features are selected. The system generates a random password phrase at each verification. Compensation for dynamic noise and protection against playback of intercepted and recorded speech are provided by the repeated pronunciation of several words. The total error probabilities for male and female voices are 0.006 and 0.025%, respectively, for 30 million tests, 429 speakers, and a maximum password-phrase length of 10 words. The probabilities of false identification and false rejection are almost equal.
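The random password phrase described above can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not say how words are drawn or how many are repeated, so the function name make_password_phrase, the parameter n_repeats, and the transliterated spellings of the numerals are assumptions made only for illustration.

```python
import random

# Russian numerals zero..nine, transliterated (spelling is an assumption;
# the abstract only states that the vocabulary is the ten numerals).
NUMERALS = ["nol", "odin", "dva", "tri", "chetyre",
            "pyat", "shest", "sem", "vosem", "devyat"]


def make_password_phrase(length=10, n_repeats=3, rng=None):
    """Return a random phrase of `length` numerals.

    `n_repeats` positions hold words that already occur elsewhere in the
    phrase, so that several words are pronounced more than once.
    Both parameters are hypothetical; the abstract gives only the
    maximum phrase length (10 words).
    """
    if not 0 <= n_repeats < length:
        raise ValueError("n_repeats must satisfy 0 <= n_repeats < length")
    n_unique = length - n_repeats
    if n_unique > len(NUMERALS):
        raise ValueError("not enough distinct numerals for this length")
    # An OS-backed generator makes the phrase hard to predict.
    rng = rng or random.SystemRandom()
    base = rng.sample(NUMERALS, n_unique)           # distinct words
    repeats = [rng.choice(base) for _ in range(n_repeats)]
    phrase = base + repeats
    rng.shuffle(phrase)                             # hide which words repeat
    return phrase


if __name__ == "__main__":
    print(" ".join(make_password_phrase()))
```

Because a fresh phrase is drawn at every verification, a recording of an earlier session is unlikely to match the requested word order, which is the intuition behind the playback protection mentioned in the abstract.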
Additional information
Original Russian Text © V.N. Sorokin, A.I. Tsyplikhin, 2010, published in Informatsionnye Protsessy, 2010, Vol. 10, No. 2, pp. 87–104.
About this article
Cite this article
Sorokin, V.N., Tsyplikhin, A.I. Speaker verification using the spectral and time parameters of voice signal. J. Commun. Technol. Electron. 55, 1561–1574 (2010). https://doi.org/10.1134/S1064226910120302
DOI: https://doi.org/10.1134/S1064226910120302