Abstract
Advances in deep learning and voice-generation techniques have brought synthetic speech so close to natural speech that listeners now have difficulty distinguishing real speech from computer-generated speech. This high level of naturalness makes it possible to synthesize a target speaker's voice and reproduce it with high accuracy. Audio forensics is concerned with improving speech intelligibility, enhancement, and analysis in order to establish that a recorded audio file is authentic and has not been manipulated. This work proposes an audio verification model for forensic investigation based on a one-dimensional convolutional neural network (1D CNN) with a new lightweight architecture. The proposed model builds a pattern of natural voice characteristics by analyzing the information available in an audio file and extracting salient features to build feature maps; the same model then tracks these feature maps over time to decide whether the audio is synthetic. The model operates in two main stages, a pre-processing stage followed by a verification predictor stage, each consisting of several steps that perform different functions. Experiments were conducted on the latest version of a benchmark dataset, the Fake-or-Real dataset from the APTLY lab, and on the ASVspoof 2019 dataset. The proposed model achieved an accuracy of 99.89% (± 1.78%) with a loss of 0.007 on the Fake-or-Real dataset, and an Equal Error Rate (EER) of 0.006% on the ASVspoof 2019 dataset.
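The Equal Error Rate reported above is the operating point at which the false acceptance rate (synthetic audio accepted as real) equals the false rejection rate (real audio rejected as synthetic). The following is a minimal sketch of how such a value can be computed from detector scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the score convention (higher means more likely real), and the toy scores are assumptions made for illustration only.

```python
import numpy as np


def equal_error_rate(scores, labels):
    """Estimate the EER from detector scores (illustrative helper, not from the paper).

    scores: higher values mean "more likely real" (assumed convention).
    labels: 1 for real (bona fide) audio, 0 for synthetic audio.
    Returns (eer, threshold) at the candidate threshold where FAR and FRR are closest.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    real = scores[labels == 1]
    fake = scores[labels == 0]

    # Use every distinct score as a candidate decision threshold.
    thresholds = np.unique(scores)

    # FAR: fraction of synthetic samples accepted (score >= threshold).
    far = np.array([(fake >= t).mean() for t in thresholds])
    # FRR: fraction of real samples rejected (score < threshold).
    frr = np.array([(real < t).mean() for t in thresholds])

    # Pick the threshold where the two error rates are closest and average them.
    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]


if __name__ == "__main__":
    # Toy scores, invented for this example: two real and two synthetic clips.
    eer, threshold = equal_error_rate([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
    print(f"EER = {eer:.3f} at threshold {threshold:.2f}")
```

Because this sketch only evaluates thresholds at observed scores, it gives a coarse estimate on small sets; evaluation toolkits typically interpolate between points on the error curves.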
Ethics declarations
Conflict of interest
No funds, grants, or other support was received.
About this article
Cite this article
AL-Shakarchy, N.D., Abdullah, Z.N., Alameen, Z.M. et al. Audio verification in forensic investigation using light deep neural network. Int. j. inf. tecnol. (2024). https://doi.org/10.1007/s41870-024-01812-2
DOI: https://doi.org/10.1007/s41870-024-01812-2