An overview of digital speech watermarking

Nematollahi, Mohammad Ali; Al-Haddad, S. A. R.

doi:10.1007/s10772-013-9192-6

Published: 23 May 2013

International Journal of Speech Technology, Volume 16, pages 471–488 (2013)

Mohammad Ali Nematollahi¹ & S. A. R. Al-Haddad¹

Abstract

Digital speech watermarking is a robust way to hide data, and thus to protect content such as audio and video from intentional or unintentional manipulation during transmission. A digital speech signal differs from audio, music and other signals in several characteristics, including bandwidth, voice/non-voice segments and production model. Although various review articles on image, audio and video watermarking are available, there are still few review papers on digital speech watermarking. Therefore, this article presents an overview of digital speech watermarking, including the issues of robustness, capacity and imperceptibility. It also discusses the types of digital speech watermarking, its applications, models and masking methods. Finally, the article highlights related real-world challenges, research opportunities and future work in this area, which has yet to be fully explored.

Author information

Authors and Affiliations

Department of Computer & Communication Systems Engineering, Faculty of Engineering, University Putra Malaysia, UPM Serdang, 43400, Selangor Darul Ehsan, Malaysia
Mohammad Ali Nematollahi & S. A. R. Al-Haddad

Corresponding author

Correspondence to Mohammad Ali Nematollahi.

About this article

Cite this article

Nematollahi, M.A., Al-Haddad, S.A.R. An overview of digital speech watermarking. Int J Speech Technol 16, 471–488 (2013). https://doi.org/10.1007/s10772-013-9192-6

Received: 07 September 2012
Accepted: 06 March 2013
Published: 23 May 2013
Issue Date: December 2013
DOI: https://doi.org/10.1007/s10772-013-9192-6
