Audio verification in forensic investigation using light deep neural network

  • Original Research
International Journal of Information Technology

Abstract

Advances in deep learning and voice-generation techniques have brought synthetic voices so close to natural-sounding speech that people now have difficulty distinguishing real speech from computer-generated speech. This high level of naturalness makes it possible to synthesize a target person's voice and reproduce it with high accuracy. Audio forensics is concerned with improving speech intelligibility, enhancement, and analysis in order to prove that a recorded audio file is authentic and has not been manipulated. This work proposes an audio verification model for forensic investigation based on a one-dimensional convolutional neural network (1D CNN) with a new light architecture. The proposed model builds a pattern of natural voice characteristics by analyzing the information available in an audio file and extracting salient features to build feature maps; it then tracks these feature maps over time within the same model to decide whether the audio is synthetic or not. The model operates in two main stages, a pre-processing stage followed by a verification predictor stage, and each stage comprises several steps that perform different functions. The experiments were conducted on the latest version of the benchmark Fake-or-Real Dataset from the APTLY lab and on the ASVspoof 2019 dataset. The proposed model achieves an accuracy of 99.89% (± 1.78%) with a loss value of 0.007 on the Fake-or-Real Dataset, and an Equal Error Rate (EER) of 0.006% on the ASVspoof 2019 dataset.
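For readers unfamiliar with the Equal Error Rate reported above, the following is a minimal sketch of how an EER is commonly computed from detector scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the convention that a higher score means "more likely genuine", and the toy scores are all assumptions made for illustration. The EER is the operating point at which the false acceptance rate (spoofed audio accepted as genuine) equals the false rejection rate (genuine audio rejected).

```python
import numpy as np


def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumption: a higher score means the detector believes the audio is genuine.
    Returns (eer, threshold), with eer expressed as a fraction in [0, 1].
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)

    # Candidate thresholds: every distinct score seen in either class (sorted).
    thresholds = np.unique(np.concatenate([genuine, spoof]))

    # False rejection rate: genuine samples scoring below the threshold.
    frr = np.array([(genuine < t).mean() for t in thresholds])
    # False acceptance rate: spoofed samples scoring at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])

    # The EER lies where the two error curves cross; take the closest point.
    idx = int(np.argmin(np.abs(far - frr)))
    eer = (far[idx] + frr[idx]) / 2.0
    return eer, thresholds[idx]


if __name__ == "__main__":
    # Toy scores (hypothetical, not taken from the paper).
    genuine = [0.9, 0.8, 0.7, 0.6]
    spoof = [0.1, 0.2, 0.3, 0.65]
    eer, threshold = equal_error_rate(genuine, spoof)
    print(f"EER = {eer:.2%} at threshold {threshold}")
```

On real evaluation sets the threshold sweep is usually finer and the crossing point is interpolated, so this discrete version should be read as an approximation of the metric rather than a reference implementation.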

Author information

Corresponding author

Correspondence to Noor D. AL-Shakarchy.

Ethics declarations

Conflict of interest

No funds, grants, or other support was received.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

AL-Shakarchy, N.D., Abdullah, Z.N., Alameen, Z.M. et al. Audio verification in forensic investigation using light deep neural network. Int. j. inf. tecnol. (2024). https://doi.org/10.1007/s41870-024-01812-2

  • DOI: https://doi.org/10.1007/s41870-024-01812-2
