
Speech authentication by semi-fragile speech watermarking utilizing analysis by synthesis and spectral distortion optimization

Multimedia Tools and Applications


This paper proposes an improved semi-fragile speech watermarking scheme based on quantization of linear prediction (LP) parameters, specifically the inverse sine (IS) parameters. The spectral distortion caused by watermark embedding is controlled to meet the ‘transparency’ criterion used in speech coding. A modified bit allocation algorithm, combined with watermarking, determines the quantization step so that the ‘transparency’ requirement is satisfied. Because LP estimation is statistical in nature, the LP coefficients estimated from the watermarked speech signal differ from the watermarked LP coefficients even in the absence of attacks. This mismatch increases both the decoding error and the minimum authentication length. To tackle this problem, an Analysis by Synthesis (AbS) scheme is developed to reduce the difference between the estimated LP coefficients and the watermarked ones. The watermark detection threshold and the minimum authentication length are then derived from the probability of error and the signal-to-noise ratio (SNR) requirements. Experimental results show that the proposed AbS-based method effectively reduces the difference between the watermarked IS parameter and the extracted IS parameter when there are no attacks. In addition, the modified bit allocation algorithm automatically finds an appropriate quantization step for the odd-even modulation so that the transparency requirement is satisfied.
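The odd-even modulation mentioned in the abstract can be illustrated with a minimal sketch. This is a generic parity-based quantizer, not necessarily the paper's exact embedding rule: the function names `embed_bit` and `extract_bit`, the step size, and the sample parameter values are all assumptions introduced for illustration. The idea is that a parameter is quantized to the nearest multiple of the step whose index parity equals the watermark bit, and the detector recovers the bit from the parity of the nearest quantization index.

```python
import math


def embed_bit(value, bit, step):
    """Quantize `value` to the nearest multiple of `step` whose index parity equals `bit`.

    Hypothetical helper: a generic odd-even quantizer, not the paper's exact rule.
    """
    scaled = value / step
    index = math.floor(scaled + 0.5)  # nearest quantization index
    if index % 2 != bit:
        # Parity is wrong: move to the neighbouring index closest to the original value.
        index = index + 1 if scaled >= index else index - 1
    return index * step


def extract_bit(value, step):
    """Recover the embedded bit as the parity of the nearest quantization index."""
    return math.floor(value / step + 0.5) % 2


if __name__ == "__main__":
    step = 0.05                    # assumed quantization step
    params = [0.37, -0.12, 0.81]   # assumed IS-like parameter values
    bits = [0, 1, 1]               # assumed watermark bits
    marked = [embed_bit(p, b, step) for p, b in zip(params, bits)]
    recovered = [extract_bit(m, step) for m in marked]
    print(marked, recovered)
```

In this sketch the embedding changes each parameter by at most one step, and the bit survives any perturbation smaller than half a step. That trade-off is why the choice of quantization step matters: a larger step tolerates more estimation mismatch but introduces more spectral distortion.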



This work is supported by the "SUST Spring Bud" research project under grant number 2009AZZ155. The work of the second author is also supported by the National Natural Science Foundation of China (NSFC) under grant number 61071087. The authors would like to thank the anonymous reviewers for their constructive comments and suggestions, and for the valuable time they spent on the manuscript. The first author would like to thank Prof. Zhe-Ming Lu, Prof. Sheng-He Sun, Prof. Jeng-Shyang Pan and Prof. Xia-Mu Niu for their guidance and help in developing the basic algorithm upon which the extension in this paper is built.

Corresponding author

Correspondence to Bin Yan.


Cite this article

Yan, B., Guo, YJ. Speech authentication by semi-fragile speech watermarking utilizing analysis by synthesis and spectral distortion optimization. Multimed Tools Appl 67, 383–405 (2013).
