Vulnerability of Voice Verification Systems to Spoofing Attacks by TTS Voices Based on Automatically Labeled Telephone Speech

  • Vadim Shchemelinin
  • Mariia Topchina
  • Konstantin Simonchik
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8773)

Abstract

This paper explores the robustness of a text-dependent voice verification system against spoofing attacks that use synthesized speech based on automatically labeled telephone speech. Our experiments show that when the synthesized voice is built without manual labeling and from telephone speech rather than studio recordings, the False Acceptance error rate decreases significantly compared with attacks that use high-quality synthesized speech.

Keywords

spoofing; speech synthesis; unit selection; HMM; speaker recognition

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Vadim Shchemelinin (1, 2)
  • Mariia Topchina (2)
  • Konstantin Simonchik (2)
  1. National Research University of Information Technologies, Mechanics and Optics, St. Petersburg, Russia
  2. Speech Technology Center Limited, St. Petersburg, Russia
