Person authentication using speech as a biometric against play back attacks

Multimedia Tools and Applications

Abstract

This work presents the modules of a system that authenticates persons using speech as a biometric and withstands recorded playback attacks. It covers the implementation of feature extraction, a modeling technique, and a testing procedure. Playback attacks are simulated by re-recording the original speech utterances through the speakers and microphone of a laptop using the Audacity software. The work focuses on distinguishing original from recorded speech and on authenticating speakers by voice. Features extracted from the original and the recorded speech are used to build a model for each. Voice passwords are assigned to the speakers, and features are extracted from training speech created by fusing the password-specific original utterances. These features are passed to the training algorithm to generate password-specific speaker models. In testing, features are extracted from the test speech and applied to the recorded-speech and original-speech models. If the test speech is classified as recorded, it is rejected and excluded from further processing. If it is classified as original, its feature vectors are applied to the password-specific speaker models and, based on the classification criterion, the speaker is identified and authenticated. The system is found to be robust against playback attacks and achieves good performance in authenticating the sixteen speakers considered in this work. The passwords are isolated words and digits chosen from the TIMIT speech database. The work is also extended to the AVSpoof database to authenticate 44 speakers against replay attacks, with performance analyzed in terms of rejection rate.
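The two-stage test procedure described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not name the features or the modeling technique, so the sketch uses synthetic feature vectors, a constant offset as a crude stand-in for the playback channel, and a small k-means codebook as the model. The names `train_codebook`, `distortion`, `authenticate`, `rejection_rate`, `speaker_A`, and `speaker_B` are hypothetical and are not taken from the paper.

```python
import numpy as np

def train_codebook(features, k=4, iters=20, seed=0):
    """Train a small k-means codebook (assumed stand-in for the paper's model)."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(features), size=k, replace=False)
    codebook = features[start].copy()
    for _ in range(iters):
        # Distance from every feature vector to every codeword.
        dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
        nearest = dists.argmin(axis=1)
        for j in range(k):
            members = features[nearest == j]
            if len(members) > 0:  # keep the old codeword if its cell is empty
                codebook[j] = members.mean(axis=0)
    return codebook

def distortion(features, codebook):
    """Mean distance from each feature vector to its nearest codeword."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return float(dists.min(axis=1).mean())

def authenticate(features, original_model, recorded_model, speaker_models):
    """Return the identified speaker, or None if the speech looks recorded."""
    # Stage 1: reject test speech that fits the recorded-speech model better.
    if distortion(features, recorded_model) < distortion(features, original_model):
        return None
    # Stage 2: pick the password-specific speaker model with the lowest distortion.
    return min(speaker_models, key=lambda name: distortion(features, speaker_models[name]))

def rejection_rate(decisions):
    """Fraction of trials that were rejected (decision is None)."""
    return sum(d is None for d in decisions) / len(decisions)

# Synthetic data (assumption): each speaker is a Gaussian cluster of feature vectors.
rng = np.random.default_rng(42)
DIM = 8
mean_b = np.zeros(DIM)
mean_b[0] = 4.0
speaker_means = {"speaker_A": np.zeros(DIM), "speaker_B": mean_b}
channel_shift = np.zeros(DIM)
channel_shift[-1] = 10.0  # crude stand-in for loudspeaker/microphone distortion

def utterance(mean, n=50):
    return mean + 0.3 * rng.standard_normal((n, DIM))

def replay(features):
    return features + channel_shift

# Training: one model per speaker, plus original-speech and recorded-speech models.
train = {name: utterance(mean, 100) for name, mean in speaker_means.items()}
speaker_models = {name: train_codebook(f) for name, f in train.items()}
original_model = train_codebook(np.vstack(list(train.values())))
recorded_model = train_codebook(np.vstack([replay(f) for f in train.values()]))

# Testing: one live trial, then several replayed trials per speaker.
live_decision = authenticate(utterance(speaker_means["speaker_B"]),
                             original_model, recorded_model, speaker_models)
replay_decisions = [
    authenticate(replay(utterance(mean)), original_model, recorded_model, speaker_models)
    for mean in speaker_means.values() for _ in range(5)
]
replay_rejection_rate = rejection_rate(replay_decisions)
```

In this toy setting the replayed trials are separated from live speech by a large margin, so the first stage filters them out before any speaker model is consulted. Real playback distortion is far subtler than a constant offset, which is why the choice of features and models matters in practice.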

Author information

Corresponding author

Correspondence to C. Jeyalakshmi.

Ethics declarations

Competing Interest

The authors have declared that no competing interest exists.

About this article

Cite this article

Revathi, A., Jeyalakshmi, C. & Thenmozhi, K. Person authentication using speech as a biometric against play back attacks. Multimed Tools Appl 78, 1569–1582 (2019). https://doi.org/10.1007/s11042-018-6258-0

  • DOI: https://doi.org/10.1007/s11042-018-6258-0
