Artificial Intelligence Approach for Tuning Speech-Adaptive Watermarking using Higher-Order Statistics (HOS)

Liu, Xin; Nematollahi, Mohammad Ali

doi:10.1007/s00034-024-02618-0


Published: 22 February 2024

Volume 43, pages 3297–3323, (2024)

Circuits, Systems, and Signal Processing


Abstract

The growing use of artificial intelligence (AI) to optimize watermark embedding, or to find the optimal location in a host image for inserting a watermark, may solve some problems in this field. However, the main problem, which is finding the optimum trade-off among several watermarking criteria, has not yet been investigated, especially for speech signals. This paper applies an AI model to find the best trade-off among watermarking requirements such as capacity, inaudibility, and robustness. It also proposes a novel watermarking technique that modifies the probability density function (PDF) of the linear predictive (LP) residual and of the wavelet detail coefficients. For this method, a mathematical model based on higher-order statistics is developed for embedding and extracting the watermark. The sinh-arcsinh transformation is used to shape the skewness and kurtosis of the normal distribution for the LP residual or the wavelet high-frequency sub-bands, respectively, according to the watermark bits. Experimental results show that the LP residual is not robust and behaves randomly when its PDF is modeled, whereas the wavelet high-frequency band is quite robust and its PDF can be modeled. Moreover, it is demonstrated that AI can balance the competing watermarking criteria. Conclusions are drawn from both theoretical (maximum likelihood) and AI (machine learning) approaches, which confirm the effectiveness of the proposed model. Finally, several potential areas for further exploration are discussed.
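The role of the sinh-arcsinh transformation can be illustrated with a minimal sketch. It uses the standard Jones and Pewsey form, X = sinh((arcsinh(Z) + epsilon) / delta), applied to standard-normal samples, where epsilon controls skewness and delta controls tail weight. The bit-to-skewness mapping, the `strength` value, the frame length, and the sign-based detector below are illustrative assumptions, not the embedding or extraction rules of the paper, and synthetic Gaussian samples stand in for real wavelet detail coefficients.

```python
import numpy as np

def sinh_arcsinh(z, epsilon=0.0, delta=1.0):
    """Jones-Pewsey sinh-arcsinh transform of standard-normal samples.

    epsilon shifts skewness (its sign sets the direction of the skew);
    delta < 1 produces heavier tails, delta > 1 lighter tails.
    """
    return np.sinh((np.arcsinh(z) + epsilon) / delta)

def sample_skewness(x):
    """Third standardized central moment of the samples."""
    c = x - x.mean()
    return np.mean(c**3) / np.mean(c**2) ** 1.5

def sample_excess_kurtosis(x):
    """Fourth standardized central moment minus 3 (zero for a Gaussian)."""
    c = x - x.mean()
    return np.mean(c**4) / np.mean(c**2) ** 2 - 3.0

def embed_bit(frame, bit, strength=0.3):
    """Hypothetical rule: bit 1 -> positive skew, bit 0 -> negative skew."""
    epsilon = strength if bit else -strength
    return sinh_arcsinh(frame, epsilon=epsilon)

def detect_bit(frame):
    """Hypothetical detector: read the bit from the sign of the skewness."""
    return int(sample_skewness(frame) > 0)

rng = np.random.default_rng(0)

# Tail-weight control: delta < 1 raises the kurtosis above the Gaussian value.
z = rng.standard_normal(20000)
heavy = sinh_arcsinh(z, delta=0.7)

# Toy embedding: one bit per synthetic frame of Gaussian "coefficients".
bits = [1, 0, 1, 1, 0]
frames = [embed_bit(rng.standard_normal(4096), b) for b in bits]
recovered = [detect_bit(f) for f in frames]

print("excess kurtosis (normal):", sample_excess_kurtosis(z))
print("excess kurtosis (delta=0.7):", sample_excess_kurtosis(heavy))
print("embedded bits: ", bits)
print("recovered bits:", recovered)
```

With epsilon = 0 and delta = 1 the transform is the identity, so the two parameters give independent handles on the third- and fourth-order statistics. This is the kind of control a detector based on higher-order statistics can exploit.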



Acknowledgements

The authors thank the anonymous reviewers for their useful and constructive comments on the manuscript. All comments have been considered, and the paper has been revised accordingly. This research was funded by the Ministry of Education Humanities and Social Sciences Planning Fund Research Project, "Internet+Alzheimers Disease Personnel Safety Protection" Implementation Countermeasure Integration Research, Item Number: 16YJAZH040.

Author information

Authors and Affiliations

Information Engineering Institute, Shanxi Vocational University of Engineering Science and Technology, Jinzhong City, Shanxi Province, China
Xin Liu & Mohammad Ali Nematollahi


Corresponding author

Correspondence to Mohammad Ali Nematollahi.


About this article

Cite this article

Liu, X., Nematollahi, M.A. Artificial Intelligence Approach for Tuning Speech-Adaptive Watermarking using Higher-Order Statistics (HOS). Circuits Syst Signal Process 43, 3297–3323 (2024). https://doi.org/10.1007/s00034-024-02618-0


Received: 17 July 2023
Revised: 07 January 2024
Accepted: 12 January 2024
Published: 22 February 2024
Issue Date: May 2024
DOI: https://doi.org/10.1007/s00034-024-02618-0
