Semi-automatic Speaker Verification System Based on Analysis of Formant, Durational and Pitch Characteristics

  • Elena Bulgakova
  • Aleksey Sholohov
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9811)


Modern speaker verification systems fuse a number of complementary base classifiers to reach reliable verification decisions. This paper presents a semi-automatic speaker verification system based on the fusion of formant frequencies, phone durations and pitch characteristics. Experimental results demonstrate that combining these characteristics improves speaker verification performance. To make the pitch subsystem more accurate and less costly, we further selected the most informative pitch characteristics.
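The abstract does not state which fusion rule is used. As an illustration only, the sketch below assumes score-level fusion: each subsystem (formant, duration, pitch) produces one score per verification trial, the scores are z-normalized per subsystem, and a weighted sum gives the fused score. The function names, subsystem keys and weights are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def z_normalize(scores):
    """Shift and scale a list of scores to zero mean and unit variance."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All scores identical: no spread to normalize, return zeros.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def fuse_scores(subsystem_scores, weights):
    """Weighted sum of z-normalized scores, one fused score per trial.

    subsystem_scores: dict mapping a subsystem name to its list of trial scores.
    weights: dict mapping the same names to fusion weights (assumed values).
    """
    if set(subsystem_scores) != set(weights):
        raise ValueError("subsystem names and weight names must match")
    normalized = {name: z_normalize(s) for name, s in subsystem_scores.items()}
    lengths = {len(s) for s in normalized.values()}
    if len(lengths) != 1:
        raise ValueError("all subsystems must score the same number of trials")
    n_trials = lengths.pop()
    return [
        sum(weights[name] * normalized[name][i] for name in normalized)
        for i in range(n_trials)
    ]


# Hypothetical scores for three trials; higher means "more likely same speaker".
scores = {
    "formant": [1.2, -0.4, 0.9],
    "duration": [0.3, -0.1, 0.5],
    "pitch": [2.0, 0.5, 1.1],
}
weights = {"formant": 0.5, "duration": 0.2, "pitch": 0.3}  # assumed, not tuned
fused = fuse_scores(scores, weights)
```

A trial would then be accepted when its fused score exceeds a threshold chosen on development data. In practice the weights are usually learned, for example with logistic regression, rather than set by hand.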


Keywords: Formant frequencies, Phone durations, Pitch characteristics, Speaker verification, Feature selection



This work was financially supported by the Government of the Russian Federation, Grant 074-U01.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. ITMO University, St. Petersburg, Russia
  2. Speech Technology Center, St. Petersburg, Russia
