An encrypted speech authentication and tampering recovery method based on perceptual hashing

Zhang, Qiu-yu; Zhang, Deng-hai; Xu, Fu-jiu

doi:10.1007/s11042-021-10905-0

Published: 12 April 2021

Volume 80, pages 24925–24948 (2021)

Multimedia Tools and Applications

Abstract

Advances in cloud-based speech retrieval have brought considerable convenience to users. However, the encrypted speech returned by a retrieval system still raises security issues, notably integrity authentication and tampering recovery. This paper proposes an encrypted speech authentication and tampering recovery method based on perceptual hashing. First, the original speech is scrambled by the Duffing map to build an encrypted speech library in the cloud. The product of the uniform sub-band spectrum variance and the spectral entropy of the encrypted speech is extracted and used to construct a perceptual hashing sequence, which forms the hashing template in the cloud; this establishes a one-to-one correspondence between each encrypted speech clip and its perceptual hashing sequence. Second, during retrieval, the authentication digest of the encrypted speech is extracted from the query result. The authentication digest is then matched against the perceptual hashing sequence in the cloud hashing template using the Hamming distance. Finally, for encrypted speech that fails authentication, tampering detection and localization are performed, and the tampered samples are recovered by least squares curve fitting. Simulation results show that the proposed method can extract the authentication digest directly from the encrypted speech, that the digest has good discrimination and robustness, and that it accurately locates the tampered region under malicious substitution and mute attacks. In addition, the method recovers tampered speech signals with high quality without any extra information.
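The matching and recovery steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no parameters, so the function names (`hamming_ber`, `differing_frames`, `polyfit_ls`, `recover_span`), the one-hash-bit-per-frame layout, the polynomial degree, and the context width are all assumptions made for illustration. The sketch compares two binary hash sequences by normalized Hamming distance, lists the positions where they disagree, and refits a damaged sample span from its untouched neighbours by least squares polynomial fitting.

```python
from typing import List, Sequence


def hamming_ber(h1: Sequence[int], h2: Sequence[int]) -> float:
    """Normalized Hamming distance (bit error rate) between two binary hashes."""
    if len(h1) != len(h2):
        raise ValueError("hash sequences must have equal length")
    return sum(a != b for a, b in zip(h1, h2)) / len(h1)


def differing_frames(h_ref: Sequence[int], h_test: Sequence[int]) -> List[int]:
    """Indices whose hash bits disagree (assumes one hash bit per frame)."""
    return [i for i, (a, b) in enumerate(zip(h_ref, h_test)) if a != b]


def polyfit_ls(xs: Sequence[float], ys: Sequence[float], degree: int) -> List[float]:
    """Least squares polynomial coefficients, lowest order first."""
    n = degree + 1
    # Augmented normal equations [A^T A | A^T y] for the Vandermonde matrix A.
    m = [[sum(x ** (i + j) for x in xs) for j in range(n)]
         + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def recover_span(samples: Sequence[float], start: int, end: int,
                 context: int = 5, degree: int = 2) -> List[float]:
    """Refit samples[start:end] from `context` untouched neighbours per side."""
    idx = (list(range(max(0, start - context), start))
           + list(range(end, min(len(samples), end + context))))
    center = (start + end) / 2.0  # centre x values for better conditioning
    coeffs = polyfit_ls([i - center for i in idx],
                        [samples[i] for i in idx], degree)
    out = list(samples)
    for i in range(start, end):
        out[i] = sum(c * (i - center) ** k for k, c in enumerate(coeffs))
    return out


if __name__ == "__main__":
    # Toy signal: a smooth quadratic, with samples 8..11 muted (set to zero).
    original = [0.5 * x * x - 2.0 * x + 1.0 for x in range(20)]
    tampered = list(original)
    tampered[8:12] = [0.0] * 4
    recovered = recover_span(tampered, 8, 12)
    print(max(abs(a - b) for a, b in zip(original, recovered)))
```

Real speech is not a low-order polynomial, so a fit of this kind can only approximate short spans; the toy quadratic signal is chosen so that the refit is exact up to floating-point error.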

Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 61862041, 61363078). The authors would like to thank the anonymous reviewers for their helpful comments and suggestions.

Author information

Authors and Affiliations

School of Computer and Communication, Lanzhou University of Technology, Lanzhou, 730050, China
Qiu-yu Zhang, Deng-hai Zhang & Fu-jiu Xu

Corresponding author

Correspondence to Qiu-yu Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, Qy., Zhang, Dh. & Xu, Fj. An encrypted speech authentication and tampering recovery method based on perceptual hashing. Multimed Tools Appl 80, 24925–24948 (2021). https://doi.org/10.1007/s11042-021-10905-0

Received: 27 April 2020
Revised: 23 December 2020
Accepted: 01 April 2021
Published: 12 April 2021
Issue Date: July 2021
DOI: https://doi.org/10.1007/s11042-021-10905-0
