Abstract
Given how easily audio data can be obtained, audio recordings are subject to both malicious and benign tampering and manipulation that can compromise their integrity and reliability. Because audio recordings are used in many strategic areas, detecting such tampering and manipulation is critical. Although the literature demonstrates the lack of an accurate, integrated system for detecting copy-move forgery, the field shows great promise for research. Our proposed method therefore supports passive detection of audio copy-move forgery. For our study, forged audio data were generated from the TIMIT dataset, and 4378 audio recordings were used: 2189 original recordings and 2189 recordings created by copy-move forgery. After the voiced and unvoiced regions of each audio signal were determined with the Yet Another Algorithm for Pitch Tracking (YAAPT), features were extracted from the signals using Mel frequency cepstral coefficients (MFCCs), delta (Δ) MFCCs, and ΔΔMFCCs, both individually and in combination, as well as linear prediction coefficients (LPCs). Those features were then classified using artificial neural networks. Our experimental results show detection accuracies of 75.34% with the MFCC method, 73.97% with the ΔMFCC method, 72.37% with the ΔΔMFCC method, 76.48% with the combined MFCC + ΔMFCC + ΔΔMFCC method, and 74.77% with the LPC method. Using the MFCC + ΔMFCC + ΔΔMFCC method, in which the features are used together, the models give markedly better results even with relatively few epochs. The proposed method is also more robust than other methods in the literature because it does not rely on threshold values.
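To illustrate the feature construction the abstract describes, the following is a minimal sketch of the standard regression-based delta computation used to derive ΔMFCCs and ΔΔMFCCs from a static MFCC matrix and stack them into a combined feature vector. This is not the authors' implementation; the half-window size `N = 2` and the stand-in feature matrix are illustrative assumptions.

```python
import numpy as np

def delta(feat, N=2):
    """Regression-based delta coefficients over a (num_frames, num_coeffs)
    feature matrix such as MFCCs. N is the half-window size (an assumed,
    commonly used value)."""
    denom = 2 * sum(n * n for n in range(1, N + 1))
    # Repeat edge frames so delta is defined at the signal boundaries.
    padded = np.pad(feat, ((N, N), (0, 0)), mode="edge")
    out = np.zeros_like(feat, dtype=float)
    for t in range(feat.shape[0]):
        out[t] = sum(
            n * (padded[t + N + n] - padded[t + N - n]) for n in range(1, N + 1)
        ) / denom
    return out

# Stand-in MFCC matrix: 10 frames of 2 coefficients, linearly increasing.
mfcc = np.arange(20, dtype=float).reshape(10, 2)
d1 = delta(mfcc)        # ΔMFCC
d2 = delta(d1)          # ΔΔMFCC
# Combined MFCC + ΔMFCC + ΔΔMFCC feature vector per frame, as in the
# configuration that performed best in the abstract.
features = np.hstack([mfcc, d1, d2])
```

For the linearly increasing stand-in matrix, interior ΔMFCC values recover the per-frame slope (2.0 in the first column), which is the expected behavior of the regression formula.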
Data availability
The data will be made available upon reasonable request.
Acknowledgements
Our study was supported by the Scientific and Technological Research Council of Turkey (Grant No. 121E725).
Funding
We received research funding from the Scientific and Technological Research Council of Turkey (Grant No. 121E725).
Author information
Contributions
FA was involved in the conceptualization, method, software, validation, formal analysis, investigation, resources, and data curation. YB contributed to the software, validation, supervision, conceptualization, review, and editing.
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
About this article
Cite this article
Akdeniz, F., Becerikli, Y. Detecting audio copy-move forgery with an artificial neural network. SIViP 18, 2117–2133 (2024). https://doi.org/10.1007/s11760-023-02856-w