Detecting audio copy-move forgery with an artificial neural network

  • Original Paper
Signal, Image and Video Processing

Abstract

Because audio data are so easy to obtain, audio recordings are subject to both malicious and non-malicious tampering and manipulation that can compromise their integrity and reliability. Because audio recordings are used in many strategic areas, detecting such tampering is critical. Although the literature still lacks an accurate, integrated system for detecting copy-move forgery, the field shows great promise for research. Our proposed method therefore aims to support passive detection of audio copy-move forgery. For our study, forged audio data were derived from the TIMIT dataset, and 4378 audio recordings were used: 2189 original recordings and 2189 recordings created by copy-move forgery. After the voiced and unvoiced regions of each audio signal were determined with the Yet Another Algorithm for Pitch Tracking (YAAPT), features were extracted from the signals using Mel-frequency cepstral coefficients (MFCCs), delta MFCCs (ΔMFCCs), and delta-delta MFCCs (ΔΔMFCCs), individually and in combination, along with linear prediction coefficients (LPCs). Those features were then classified using artificial neural networks. Our experimental results show that the best detection rates were 75.34% with the MFCC method, 73.97% with the ΔMFCC method, 72.37% with the ΔΔMFCC method, 76.48% with the MFCC + ΔMFCC + ΔΔMFCC method, and 74.77% with the LPC method. With the MFCC + ΔMFCC + ΔΔMFCC method, in which the features are used together, the models gave far superior results even with relatively few epochs. The proposed method is also more robust than other methods in the literature because it does not rely on threshold values.
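The combined MFCC + ΔMFCC + ΔΔMFCC feature vector mentioned in the abstract can be illustrated with a minimal sketch. It assumes the widely used regression formula for delta coefficients, d_t = Σ n (c_{t+n} − c_{t−n}) / (2 Σ n²), with edge frames repeated as padding; the abstract does not state which delta formula, window width, or number of coefficients the authors used, so the function names `delta` and `stack_features` and the default `width=2` are hypothetical choices for illustration only. The MFCC frames themselves are taken as given input.

```python
from typing import List

Frame = List[float]  # one frame of cepstral coefficients


def delta(frames: List[Frame], width: int = 2) -> List[Frame]:
    """Regression-based delta over time (assumed formula, not confirmed by the paper).

    Frames beyond either end of the signal are replaced by the nearest edge frame.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    if not frames:
        return []
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out: List[Frame] = []
    for t in range(len(frames)):
        row: Frame = []
        for k in range(len(frames[0])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][k]   # clamp at the last frame
                behind = frames[max(t - n, 0)][k]     # clamp at the first frame
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_features(mfcc: List[Frame], width: int = 2) -> List[Frame]:
    """Concatenate static, delta, and delta-delta coefficients frame by frame."""
    d1 = delta(mfcc, width)   # first-order differences (ΔMFCC)
    d2 = delta(d1, width)     # second-order differences (ΔΔMFCC)
    return [c + a + b for c, a, b in zip(mfcc, d1, d2)]
```

As a quick check of the behaviour, a coefficient that rises by exactly 1 per frame yields a delta of 1.0 at interior frames, while a constant coefficient yields 0.0. Each stacked frame is three times the length of the original frame and could then be passed to a classifier such as an artificial neural network.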

Data availability

Data will be made available on reasonable request.

Acknowledgements

Our study was supported by the Scientific and Technological Research Council of Turkey (Grant No. 121E725).

Funding

We received research funding from the Scientific and Technological Research Council of Turkey (Grant No. 121E725).

Author information

Contributions

FA was involved in the conceptualization, methodology, software, validation, formal analysis, investigation, resources, and data curation. YB contributed to the software, validation, supervision, conceptualization, review, and editing.

Corresponding author

Correspondence to Fulya Akdeniz.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Akdeniz, F., Becerikli, Y. Detecting audio copy-move forgery with an artificial neural network. SIViP 18, 2117–2133 (2024). https://doi.org/10.1007/s11760-023-02856-w

  • DOI: https://doi.org/10.1007/s11760-023-02856-w