Voice spoofing detection based on acoustic and glottal flow features using conventional machine learning techniques

Multimedia Tools and Applications

Abstract

Automatic Speaker Verification (ASV) systems are vulnerable to spoofing attacks. Most existing spoofing detection systems rely on two main components: feature extraction and the classification method. In this paper, we propose a new strategy for distinguishing genuine speech from spoofed speech. The idea is to analyze the human voice in order to identify the relevant acoustic and glottal features, which are then used to separate genuine speech from spoofed speech. We tested numerous speech acoustic and glottal flow features on all data sets of the ASVspoof 2015 and ASVspoof 2017 challenges. Several features are extracted and analyzed, and the most pertinent ones are chosen using a feature engineering methodology. The selected features are then used as input to conventional machine learning classifiers, mainly the Support Vector Machine (SVM) with multiple kernels and eXtreme Gradient Boosting (XGBoost). The highest accuracy, about 98.80%, is obtained with the XGBoost classifier. Experimental results show the validity and robustness of the proposed method.
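The pipeline outlined in the abstract (extract features, keep the most pertinent ones, then compare an SVM under several kernels with a boosted-tree classifier) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic vectors standing in for acoustic and glottal features, the mutual-information filter is one possible selection criterion (the abstract does not name one), scikit-learn's GradientBoostingClassifier is swapped in for XGBoost, and the function name `evaluate_classifiers` and all parameter values are hypothetical.

```python
from functools import partial

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def evaluate_classifiers(n_samples=400, n_features=30, k_best=10, seed=0):
    """Return held-out accuracy per classifier on synthetic feature vectors."""
    # Assumption: synthetic vectors stand in for per-utterance features;
    # label 1 plays the role of genuine speech, label 0 of spoofed speech.
    X, y = make_classification(
        n_samples=n_samples,
        n_features=n_features,
        n_informative=8,
        n_redundant=4,
        random_state=seed,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    def make_selector():
        # Assumption: a mutual-information filter keeps the k_best features.
        score_func = partial(mutual_info_classif, random_state=seed)
        return SelectKBest(score_func, k=k_best)

    # One SVM per kernel, plus a boosted-tree model as a stand-in for XGBoost.
    models = {f"svm_{k}": SVC(kernel=k) for k in ("linear", "rbf", "poly")}
    models["gradient_boosting"] = GradientBoostingClassifier(random_state=seed)

    scores = {}
    for name, clf in models.items():
        # Scaling and selection are fitted on the training split only.
        pipeline = make_pipeline(StandardScaler(), make_selector(), clf)
        pipeline.fit(X_train, y_train)
        scores[name] = pipeline.score(X_test, y_test)
    return scores


if __name__ == "__main__":
    for name, accuracy in evaluate_classifiers().items():
        print(f"{name}: {accuracy:.3f}")
```

Putting the selector inside the pipeline keeps the held-out split out of the feature ranking, which matters when accuracies are compared across classifiers. The numbers printed on synthetic data say nothing about the ASVspoof results reported above.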



Author information

Corresponding author

Correspondence to Raoudha Rahmeni.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Rahmeni, R., Aicha, A.B. & Ayed, Y.B. Voice spoofing detection based on acoustic and glottal flow features using conventional machine learning techniques. Multimed Tools Appl 81, 31443–31467 (2022). https://doi.org/10.1007/s11042-022-12606-8

  • DOI: https://doi.org/10.1007/s11042-022-12606-8
