Voice spoofing detection based on acoustic and glottal flow features using conventional machine learning techniques

Rahmeni, Raoudha; Aicha, Anis Ben; Ayed, Yassine Ben

doi:10.1007/s11042-022-12606-8

Published: 09 April 2022

Volume 81, pages 31443–31467, (2022)

Multimedia Tools and Applications

Abstract

Automatic Speaker Verification (ASV) systems are vulnerable to spoofing attacks. Most existing spoofing detection systems rely on two main components: feature extraction and classification. In this paper, we propose a new strategy for distinguishing genuine speech from spoofed speech. The idea is to analyze the human voice in order to identify the relevant acoustic and glottal flow features, which are then used to separate genuine speech from spoofed speech. We tested numerous acoustic and glottal flow features on all data sets of the ASVspoof 2015 and ASVspoof 2017 challenges. Several features are extracted and analyzed, and the most pertinent ones are selected using a feature engineering methodology. The selected features are then used as input to conventional machine learning classifiers, mainly Support Vector Machine (SVM) with multiple kernels and eXtreme Gradient Boosting (XGBoost). The highest accuracy, about 98.80%, is obtained with the XGBoost classifier. Experimental results show the validity and robustness of the proposed method.
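
The feature selection step described above can be illustrated with a minimal sketch. The abstract does not state which selection criterion the authors use, so the Fisher-type score below is an assumption chosen only because it is a simple filter-style ranking; the feature names and the toy values are hypothetical and are not taken from the ASVspoof corpora.

```python
from statistics import mean, pvariance


def fisher_score(genuine, spoofed):
    """Fisher-type separation score for a single feature: squared distance
    between the class means divided by the sum of the class variances."""
    spread = pvariance(genuine) + pvariance(spoofed)
    if spread == 0.0:
        # Degenerate case: no within-class variation at all.
        return float("inf") if mean(genuine) != mean(spoofed) else 0.0
    return (mean(genuine) - mean(spoofed)) ** 2 / spread


def rank_features(rows, labels, names):
    """Return (names sorted from most to least discriminative, score per name)."""
    scores = {}
    for j, name in enumerate(names):
        genuine = [row[j] for row, y in zip(rows, labels) if y == 1]
        spoofed = [row[j] for row, y in zip(rows, labels) if y == 0]
        scores[name] = fisher_score(genuine, spoofed)
    ranking = sorted(names, key=lambda n: scores[n], reverse=True)
    return ranking, scores


# Toy data invented for illustration; each column is a hypothetical feature.
names = ["f_acoustic", "f_glottal", "f_noise"]
rows = [
    [1.0, 0.9, 0.5],
    [1.2, 1.1, 0.4],
    [0.8, 1.0, 0.6],
    [1.1, 0.2, 0.5],
    [0.9, 0.1, 0.4],
    [1.0, 0.0, 0.6],
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = genuine speech, 0 = spoofed speech

ranking, scores = rank_features(rows, labels, names)
selected = ranking[:1]  # keep only the top-ranked feature
print(ranking)
print(selected)
```

In a full pipeline of the kind the abstract describes, the columns kept in `selected` would then be passed to a classifier such as an SVM or XGBoost; that stage is not shown here.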

Author information

Authors and Affiliations

National School of Engineers of Sfax (ENIS)/(MIRACL) Multimedia, InfoRmation systems and Advanced Computing Laboratory, University of Sfax, Sfax, Tunisia
Raoudha Rahmeni
Faculty of Sciences of Bizerte (FSB)/ (COSIM) Communication, Signals and IMages, University of Carthage, Carthage, Tunisia
Anis Ben Aicha
Higher Institute of Computer Sciences and Multimedia (ISIMS)/ (MIRACL) Multimedia, InfoRmation systems and Advanced Computing Laboratory, University of Sfax, Sfax, Tunisia
Yassine Ben Ayed

Corresponding author

Correspondence to Raoudha Rahmeni.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Rahmeni, R., Aicha, A.B. & Ayed, Y.B. Voice spoofing detection based on acoustic and glottal flow features using conventional machine learning techniques. Multimed Tools Appl 81, 31443–31467 (2022). https://doi.org/10.1007/s11042-022-12606-8

Received: 11 June 2021
Revised: 15 January 2022
Accepted: 09 February 2022
Published: 09 April 2022
Issue Date: September 2022
DOI: https://doi.org/10.1007/s11042-022-12606-8
