An overview of digital speech watermarking

International Journal of Speech Technology

Abstract

Digital speech watermarking is a robust way to hide data and thereby protect media such as audio and video against intentional or unintentional manipulation during transmission. Digital speech signals differ from audio, music and other signals in characteristics such as bandwidth, voice/non-voice content and production model. Although various review articles on image, audio and video watermarking are available, few review papers address digital speech watermarking. This article therefore presents an overview of digital speech watermarking, covering the issues of robustness, capacity and imperceptibility. It also discusses the types of digital speech watermarking, its applications, models and masking methods. Finally, it highlights related real-world challenges, research opportunities and directions for future work in this area that have yet to be fully explored.

Author information

Corresponding author

Correspondence to Mohammad Ali Nematollahi.

About this article

Cite this article

Nematollahi, M.A., Al-Haddad, S.A.R. An overview of digital speech watermarking. Int J Speech Technol 16, 471–488 (2013). https://doi.org/10.1007/s10772-013-9192-6

  • DOI: https://doi.org/10.1007/s10772-013-9192-6
