Multimedia Tools and Applications, Volume 78, Issue 2, pp 1569–1582

Person authentication using speech as a biometric against play back attacks

  • A. Revathi
  • C. Jeyalakshmi (corresponding author)
  • K. Thenmozhi


This work presents the modules of a system that authenticates persons using speech as a biometric and resists recorded playback attacks. It covers the implementation of feature extraction, the modeling technique and the testing procedure. Playback attacks are simulated by playing the original speech utterances through the speakers of a laptop and re-recording them with its microphone using Audacity software. The work mainly concerns distinguishing original speech from recorded speech and authenticating speakers based on voice as a biometric. Features extracted from the original and the recorded speech are used to develop a model for each. Voice passwords are assigned to the speakers, and features are extracted from training speech created by fusing the password-specific original speech utterances. These features are applied to the training algorithm to generate password-specific speaker models. Testing involves extracting features from the test speech and applying them to the original-speech and recorded-speech models. If the test speech is classified as recorded, it is rejected and undergoes no further processing. If it is classified as original, its feature vectors are applied to the password-specific speaker models and, based on the classification criteria, a speaker is identified and authenticated. The system is found to be robust against playback attacks and performs well in authenticating the sixteen speakers considered in this work. Passwords are isolated words and digits chosen from the TIMIT speech database. The work is also extended to the AVSpoof database to authenticate 44 speakers against replay attacks, with performance analyzed in terms of rejection rate.
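The two-stage test described above (replay check first, speaker identification second) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes feature vectors have already been extracted (random vectors stand in for MFPLPC features), uses plain k-means as a stand-in for the vector quantization training, and assumes the classification criterion is minimum average quantization distortion, which the abstract does not specify. The function names `train_codebook`, `avg_distortion` and `authenticate` are hypothetical.

```python
import numpy as np


def train_codebook(features, n_codewords=8, n_iter=20, seed=0):
    """Train a VQ codebook with plain k-means (assumed stand-in for the paper's training)."""
    rng = np.random.default_rng(seed)
    # Initialise codewords from randomly chosen training vectors.
    idx = rng.choice(len(features), size=n_codewords, replace=False)
    codebook = features[idx].copy()
    for _ in range(n_iter):
        # Distance from every feature vector to every codeword.
        dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
        nearest = dists.argmin(axis=1)
        for k in range(n_codewords):
            members = features[nearest == k]
            if len(members):  # keep the old codeword if its cell is empty
                codebook[k] = members.mean(axis=0)
    return codebook


def avg_distortion(features, codebook):
    """Mean distance from each feature vector to its nearest codeword."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return float(dists.min(axis=1).mean())


def authenticate(test_features, original_cb, recorded_cb, speaker_cbs):
    """Return the identified speaker label, or None if the input looks like a replay."""
    # Stage 1: reject the utterance if it fits the recorded-speech model better.
    if avg_distortion(test_features, recorded_cb) < avg_distortion(test_features, original_cb):
        return None
    # Stage 2: pick the password-specific speaker model with the lowest distortion.
    return min(speaker_cbs, key=lambda name: avg_distortion(test_features, speaker_cbs[name]))


# Synthetic demonstration data (assumption: not real speech features).
rng = np.random.default_rng(1)
speaker_a = rng.normal(0.0, 0.5, (200, 4))
speaker_b = rng.normal(5.0, 0.5, (200, 4))
replayed = rng.normal(-20.0, 0.5, (200, 4))

original_cb = train_codebook(np.vstack([speaker_a, speaker_b]))
recorded_cb = train_codebook(replayed)
speaker_cbs = {"A": train_codebook(speaker_a), "B": train_codebook(speaker_b)}

print(authenticate(rng.normal(0.0, 0.5, (50, 4)), original_cb, recorded_cb, speaker_cbs))
```

Running the replay check before speaker identification means a recorded utterance never reaches the password-specific models, which mirrors the order of operations stated in the abstract.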


Keywords: Mel frequency perceptual linear predictive cepstrum (MFPLPC); Probability; Playback attacks; Robustness; Speaker authentication; Vector quantization (VQ); Replay attacks; Peak signal to noise ratio (PSNR); Rejection rate


Compliance with ethical standards

Competing Interest

The authors have declared that no competing interest exists.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of ECE/SEEE, SASTRA Deemed University, Thanjavur, India
  2. Department of ECE, K. Ramakrishnan College of Engineering, Trichy, India
