An encrypted speech authentication and tampering recovery method based on perceptual hashing

Multimedia Tools and Applications

Abstract

Advances in cloud-based speech retrieval have made speech data far more convenient to use. However, the encrypted speech returned by a retrieval query still raises unresolved security issues, notably integrity authentication and tampering recovery. This paper proposes an encrypted speech authentication and tampering recovery method based on perceptual hashing. First, the original speech is scrambled by Duffing mapping to build an encrypted speech library in the cloud. The product of the uniform sub-band spectrum variance and the spectral entropy of the encrypted speech is then extracted to construct a perceptual hashing sequence, which forms the hashing template stored in the cloud and establishes a one-to-one correspondence between each encrypted speech and its perceptual hashing sequence. Second, during retrieval, the authentication digest of the encrypted speech is extracted from the query result and matched against the perceptual hashing sequence in the cloud hashing template using the Hamming distance. Finally, for encrypted speech that fails authentication, tampering detection and localization are performed, and the tampered samples are recovered by least squares curve fitting. Simulation results show that the proposed method can extract the authentication digest directly from the encrypted speech, that the digest has good discrimination and robustness, and that it accurately locates the tampered area under malicious substitution and mute attacks. In addition, the method can recover tampered speech signals with high quality without any extra information.
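The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the frame length, the number of sub-bands, the binarization rule, or the polynomial degree, so the values used here (`frame_len=256`, `n_bands=8`, a median threshold, a cubic fit) and all function names are assumptions chosen for illustration. The Duffing-map scrambling step is omitted; the functions simply operate on whatever sample array they are given.

```python
import numpy as np

def frame_feature(frame, n_bands=8):
    """Product of sub-band spectrum variance and spectral entropy (assumed form)."""
    spectrum = np.abs(np.fft.rfft(frame))
    # Uniform sub-bands: split the magnitude spectrum into near-equal-width bands.
    band_energy = np.array([b.sum() for b in np.array_split(spectrum, n_bands)])
    variance = band_energy.var()
    # Spectral entropy of the normalized power spectrum.
    power = spectrum ** 2
    p = power / (power.sum() + 1e-12)
    entropy = -np.sum(p * np.log2(p + 1e-12))
    return variance * entropy

def perceptual_hash(signal, frame_len=256):
    """One hash bit per frame: 1 if the frame feature exceeds the median (assumed rule)."""
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    feats = np.array([
        frame_feature(signal[i * frame_len:(i + 1) * frame_len])
        for i in range(n_frames)
    ])
    return (feats > np.median(feats)).astype(np.uint8)

def normalized_hamming(h1, h2):
    """Fraction of differing bits between two equal-length binary hash sequences."""
    h1, h2 = np.asarray(h1), np.asarray(h2)
    return np.count_nonzero(h1 != h2) / len(h1)

def recover_segment(signal, start, stop, context=32, degree=3):
    """Refill signal[start:stop] from a least squares polynomial fitted to its neighbours."""
    signal = np.asarray(signal, dtype=float)
    left = np.arange(max(0, start - context), start)
    right = np.arange(stop, min(len(signal), stop + context))
    idx = np.concatenate([left, right])
    # Shift the indices so the fit stays well conditioned.
    coeffs = np.polyfit(idx - start, signal[idx], degree)
    repaired = signal.copy()
    repaired[start:stop] = np.polyval(coeffs, np.arange(start, stop) - start)
    return repaired
```

In such a scheme, a query result would be authenticated by comparing `normalized_hamming(digest, template)` against a decision threshold, and `recover_segment` would be applied only to the sample ranges flagged as tampered. The threshold and the localization step are not specified in the abstract and are therefore left out of the sketch.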

Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 61862041, 61363078). The authors would like to thank the anonymous reviewers for their helpful comments and suggestions.

Author information

Corresponding author

Correspondence to Qiu-yu Zhang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, Qy., Zhang, Dh. & Xu, Fj. An encrypted speech authentication and tampering recovery method based on perceptual hashing. Multimed Tools Appl 80, 24925–24948 (2021). https://doi.org/10.1007/s11042-021-10905-0

  • DOI: https://doi.org/10.1007/s11042-021-10905-0
