Speaker verification using the spectral and time parameters of voice signal

  • Articles from the Russian Journal Informatsionnye Protsessy
Journal of Communications Technology and Electronics

Abstract

Speaker verification is based on variations in formant frequencies in the stationary fragments and transient processes of vowels, the spectral features of fricative sounds, and the durations of speech segments. The best features are chosen for each word in a fixed list of Russian numerals from zero to nine. The system randomly generates a new password phrase at each verification. Repeating several words in the phrase compensates for dynamic noise and counters interference in which intercepted and recorded speech is played back. The total error probabilities for male and female voices are 0.006% and 0.025%, respectively, for 30 million tests, 429 speakers, and a maximum password phrase length of 10 words. The probabilities of false identification and false rejection are almost equal.
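
The abstract does not say how the password phrase is drawn, so the following is only a minimal Python sketch of one way such a phrase could be generated: a random sequence of numerals of at most 10 words in which a few words occur twice. The function name `generate_password_phrase`, the parameter `n_repeated`, and the English glosses standing in for the Russian numerals are all assumptions made for illustration, not part of the published method.

```python
import secrets

# Vocabulary from the abstract: numerals zero to nine.
# English glosses are used here as stand-ins (assumption).
NUMERALS = ["zero", "one", "two", "three", "four",
            "five", "six", "seven", "eight", "nine"]

MAX_PHRASE_LENGTH = 10  # maximum phrase length stated in the abstract


def generate_password_phrase(length=10, n_repeated=2):
    """Return a random phrase of `length` numerals in which exactly
    `n_repeated` words occur twice (hypothetical scheme)."""
    if length > MAX_PHRASE_LENGTH or n_repeated < 0 or 2 * n_repeated > length:
        raise ValueError("invalid length / n_repeated combination")

    rng = secrets.SystemRandom()       # OS-backed randomness, hard to predict
    n_distinct = length - n_repeated   # number of different words in the phrase
    words = rng.sample(NUMERALS, n_distinct)
    repeated = rng.sample(words, n_repeated)  # words that will appear twice
    phrase = words + repeated
    rng.shuffle(phrase)                # randomize the positions of the repeats
    return phrase


if __name__ == "__main__":
    print(" ".join(generate_password_phrase()))
```

Because the phrase changes at every attempt, a recording of an earlier session is unlikely to match the requested word order. The repeated words would give a verifier two utterances of the same word within one session to compare.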

Additional information

Original Russian Text © V.N. Sorokin, A.I. Tsyplikhin, 2010, published in Informatsionnye Protsessy, 2010, Vol. 10, No. 2, pp. 87–104.

About this article

Cite this article

Sorokin, V.N., Tsyplikhin, A.I. Speaker verification using the spectral and time parameters of voice signal. J. Commun. Technol. Electron. 55, 1561–1574 (2010). https://doi.org/10.1134/S1064226910120302

  • DOI: https://doi.org/10.1134/S1064226910120302
