Artificial Intelligence Approach for Tuning Speech-Adaptive Watermarking using Higher-Order Statistics (HOS)

Circuits, Systems, and Signal Processing

Abstract

Artificial intelligence (AI) is increasingly used to optimize watermark embedding or to find the optimal location in the host image for inserting a watermark, which might solve some problems in this field. However, the main problem, finding the optimum trade-off point among several watermarking criteria, has not yet been investigated, especially for speech signals. This paper aims to find the best trade-off among watermarking requirements such as capacity, inaudibility, and robustness by applying an AI model. Moreover, a novel watermarking technique is proposed that modifies the probability density function (PDF) of the linear predictive (LP) residual and the wavelet detail coefficients. For this method, a mathematical model based on higher-order statistics is developed for embedding and extracting the watermark. A sinh-arcsinh transformation is used to shape the skewness and kurtosis of the normal distribution for the LP residual or the wavelet high-frequency sub-bands according to the watermark bits. Experimental results show that although the LP residual is not robust and behaves randomly when its PDF is modeled, the wavelet high-frequency band is quite robust and its PDF can be modeled. Moreover, it is demonstrated that AI is capable of finding a compromise among the watermarking criteria. Conclusions are drawn from theoretical (maximum likelihood) and AI (machine learning) approaches, both of which confirm the effectiveness of the proposed model. Finally, several potential areas for further exploration are discussed.
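As a rough illustration of the sinh-arcsinh idea mentioned in the abstract, the sketch below applies the transformation of Jones and Pewsey (2009) to standard normal samples and measures the resulting skewness and kurtosis. It is not the embedding scheme of the paper: the function names, the `strength` parameter, and the rule "bit 1 gives positive skew, bit 0 gives negative skew" are assumptions made only for this example, and synthetic Gaussian samples stand in for the LP residual or wavelet detail coefficients.

```python
import numpy as np


def sinh_arcsinh(z, epsilon=0.0, delta=1.0):
    """Sinh-arcsinh transform: x = sinh((arcsinh(z) + epsilon) / delta).

    epsilon controls skewness (its sign sets the direction of the skew);
    delta controls tail weight (delta < 1 gives heavier tails than normal).
    With epsilon = 0 and delta = 1 the input is returned unchanged.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    z = np.asarray(z, dtype=float)
    return np.sinh((np.arcsinh(z) + epsilon) / delta)


def skewness(x):
    """Sample skewness (third standardized moment)."""
    c = np.asarray(x, dtype=float) - np.mean(x)
    return np.mean(c**3) / np.mean(c**2) ** 1.5


def excess_kurtosis(x):
    """Sample excess kurtosis (fourth standardized moment minus 3)."""
    c = np.asarray(x, dtype=float) - np.mean(x)
    return np.mean(c**4) / np.mean(c**2) ** 2 - 3.0


def embed_bit(z, bit, strength=0.5):
    """Hypothetical rule (an assumption): bit 1 -> positive skew, bit 0 -> negative."""
    epsilon = strength if bit == 1 else -strength
    return sinh_arcsinh(z, epsilon=epsilon)


def detect_bit(x):
    """Hypothetical detector (an assumption): read the bit from the skewness sign."""
    return 1 if skewness(x) > 0 else 0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(100_000)  # stand-in for one frame of coefficients
    for bit in (0, 1):
        marked = embed_bit(frame, bit)
        print(bit, detect_bit(marked), skewness(marked), excess_kurtosis(marked))
```

The two parameters act independently, which is why one transformation can shape both statistics: `epsilon` moves the skewness while `delta` changes the tail weight and hence the kurtosis.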

Acknowledgements

The authors thank the anonymous reviewers for their useful and constructive comments on the manuscript. All comments have been considered, and the paper has been revised accordingly. This research was funded by the Ministry of Education Humanities and Social Sciences Planning Fund Research Project, "Internet+Alzheimers Disease Personnel Safety Protection" Implementation Countermeasure Integration Research, Item Number: 16YJAZH040.

Author information

Corresponding author

Correspondence to Mohammad Ali Nematollahi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, X., Nematollahi, M.A. Artificial Intelligence Approach for Tuning Speech-Adaptive Watermarking using Higher-Order Statistics (HOS). Circuits Syst Signal Process 43, 3297–3323 (2024). https://doi.org/10.1007/s00034-024-02618-0

  • DOI: https://doi.org/10.1007/s00034-024-02618-0
