A triple-layer steganography scheme for low bit-rate speech streams

Yan, Shufan; Tang, Guangming; Sun, Yifeng; Gao, Zhanzhan; Shen, Liuqing

doi:10.1007/s11042-014-2265-y

Published: 18 September 2014

Multimedia Tools and Applications, Volume 74, pages 11763–11782 (2015)

Shufan Yan¹, Guangming Tang², Yifeng Sun², Zhanzhan Gao¹ & Liuqing Shen¹

Abstract

With the wide application of low bit-rate codecs in speech communication systems, low bit-rate speech streams have become a promising new cover medium for steganography. In this paper, by analyzing the pitch period prediction process in the G.729 codec, we find that the pitch parameter of the second speech subframe is suitable for embedding. We then propose a novel triple-layer steganography method for low bit-rate speech streams. In this method, the modification direction (adding or subtracting one) of the pitch parameter is selected adaptively to achieve a high embedding efficiency. Based on the “Hamming + 1” scheme, we apply the matrix encoding method twice to increase the hiding capacity. Experimental results show that the proposed method preserves good perceived quality of the synthetic speech while achieving good real-time performance and satisfactory steganographic security.

References

An4 database (1991) http://www.speech.cs.cmu.edu/databases/an4/.
Crandall R (1998) Some notes on steganography. http://os.inf.tu-dresden.de/~westfeld/crandall.pdf. Accessed 1 December 2009
Dittmann J, Hesse D, Hillert R (2005) Steganography and steganalysis in voice over IP scenarios: Operational aspects and first experiences with a new steganalysis tool set. Proc SPIE 5681:607–618, Security, Steganography, and Watermarking of Multimedia Contents VII, San Jose
Huang, Y., Xiao, B., Xiao, H (2008) Implementation of Covert Communication Based on Steganography. Intelligent Information Hiding and Multimedia Signal Processing:1512–1515, IIH-MSP, International Conference on. IEEE.
ITU-T Recommendation P.862 (2001) Perceptual evaluation of speech quality (PESQ): an objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs. International Telecommunication Union-Telecommunication Standardization Sector (ITU-T).
ITU-T Recommendation G.107 (2002) The E-model, a computational model for use in transmission planning.
Liu L, Li M, Li Q, Liang Y (2008) Perceptually Transparent Information Hiding in G.729 Bitstream. Intelligent Information Hiding and Multimedia Signal Processing:406–409, IIH-MSP, International Conference on. IEEE
Liu Qingzhong, Sung A H, Qiao Mengyu (2009) Temporal Derivative-based Spectrum and Mel-cepstrum Audio Steganalysis. Information Forensics and Security 4:359–368. IEEE Transactions on.
Mazurczyk W, Szczypiorski K (2008) Steganography of VoIP streams. On the Move to Meaningful Internet Systems: OTM 5332:1001–1018
Tian H, Zhou K, Jiang H, Liu J, Huang Y, Feng D (2009) An adaptive steganography scheme for voice over IP. IEEE International Symposium on Circuits and Systems (ISCAS), Taipei, Taiwan, 24–27.
Tian H, Zhou K, Feng D (2010) Dynamic Matrix encoding strategy for voice-over-IP steganography. J Cent S Univ Technol 17:1285–1292
Xiao B., Huang Y., Tang S. (2008) An approach to Information Hiding in Low bit-rate Speech Stream. Global Telecommunications Conference:1–5, IEEE GLOBECOM.
Yongfeng Huang, Chenghao Liu, Shanyu Tang, Sen Bai (2012) Steganography Integration Into a Low-Bit Rate Speech Codec. Information Forensics and Security 7:1865–1875, IEEE Transactions on.
Yu C, Huang L-S, Yang W (2012) A 3G Speech data hiding method based on pitch period. Journal of Chinese Computer Systems 33:1445–1449
Zhang Weiming, Wang Shuozhong, Zhang Xinpeng (2007) Improving embedding efficiency of covering codes for applications in steganography. Communications Letters 11:680–682. IEEE.

Author information

Authors and Affiliations

Zhengzhou Information Science and Technology Institute, Henan, People’s Republic of China
Shufan Yan, Zhanzhan Gao & Liuqing Shen
Department of Information Security, Zhengzhou Information Science and Technology Institute, Zhengzhou, People’s Republic of China
Guangming Tang & Yifeng Sun

Corresponding author

Correspondence to Shufan Yan.

Appendix

Assume that there are 2^k–1 speech frames. In the first layer, k secret bits can be embedded in 2^k–1 pitch parameters using the ME method. After calculating the value of r, there are two cases:

Case 1: r ≠ 0, P (Case 1) = (2^k–1)/2^k. In the third layer, we can embed another k secret bits into 2^k–1 L3-0 bits using the ME method. After calculating the value of h, there are also two cases:

Case 1.1: h ≠ 0, P (Case 1.1) = (2^k–1)/2^k. Choose the modification direction of p_2^r so that the result of Eq. (12) does not equal the secret bit to be embedded, namely λ ≠ β. Then flip the L3-0_h bit. The total number of cover bits changed is 2.

Case 1.2: h = 0, P (Case 1.2) = 1/2^k. Choose the modification direction of p_2^r so that the result of Eq. (12) equals the secret bit to be embedded, namely λ = β. The total number of cover bits changed is 1.

Case 2: r = 0, P (Case 2) = 1/2^k. According to the values of λ and β, there are also two cases:

Case 2.1: λ = β, P (Case 2.1) = 1/2.

Case 2.1.1: h ≠ 0, P (Case 2.1.1) = P (Case 1.1). Flip the L3-0_f and L3-0_g bits. The total number of cover bits changed is 2.

Case 2.1.2: h = 0, P (Case 2.1.2) = P (Case 1.2). All three layers are already satisfied, so the total number of cover bits changed is 0.

Case 2.2: λ ≠ β, P (Case 2.2) = 1/2.

Case 2.2.1: h ≠ 0, P (Case 2.2.1) = P (Case 1.1). Flip the L3-0_h bit. The total number of cover bits changed is 1.

Case 2.2.2: h = 0, P (Case 2.2.2) = P (Case 1.2). Flip the L3-0_h', L3-0_f' and L3-0_g' bits. The total number of cover bits changed is 3.
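The first-layer step at the start of this appendix (embedding k secret bits in 2^k–1 cover elements with the ME method) can be illustrated with textbook Hamming matrix encoding. This is a minimal sketch, not the authors' implementation: it assumes each cover element contributes one bit (for example, the least significant bit of a pitch parameter) and that r is the XOR of the current syndrome and the message, read as the 1-based index of the element to modify. The function names `syndrome`, `embed` and `extract` are hypothetical.

```python
from functools import reduce
from itertools import product
from operator import xor


def syndrome(bits):
    """XOR of the 1-based positions that hold a 1 (Hamming parity check)."""
    return reduce(xor, (i for i, b in enumerate(bits, start=1) if b), 0)


def embed(cover, message):
    """Embed `message` (0 <= message < 2**k) in 2**k - 1 cover bits.

    Returns the stego bits and r. r == 0 means no change is needed;
    otherwise exactly one bit, at 1-based position r, is flipped.
    (Assumption: in the scheme above the change would be a +/-1 step on
    the pitch parameter rather than a plain bit flip.)
    """
    r = syndrome(cover) ^ message
    stego = list(cover)
    if r != 0:
        stego[r - 1] ^= 1
    return stego, r


def extract(stego):
    """The receiver recovers the message as the syndrome of the stego bits."""
    return syndrome(stego)


if __name__ == "__main__":
    k = 3
    n = 2 ** k - 1
    nonzero = total = 0
    for cover in product((0, 1), repeat=n):
        for message in range(2 ** k):
            stego, r = embed(cover, message)
            assert extract(stego) == message
            nonzero += r != 0
            total += 1
    # Fraction of (cover, message) pairs that require a change.
    print(nonzero, "/", total)
```

For uniformly distributed messages exactly one message per cover gives r = 0, which is where the probabilities P (Case 1) = (2^k–1)/2^k and P (Case 2) = 1/2^k come from.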

In conclusion, when embedding 2k + 1 secret bits, the average number of cover bits changed is:

$$ \begin{aligned} D &= P(\mathrm{Case}\ 1)\times\left[P(\mathrm{Case}\ 1.1)\times 2+P(\mathrm{Case}\ 1.2)\times 1\right]\\ &\quad +P(\mathrm{Case}\ 2)\times P(\mathrm{Case}\ 2.1)\times\left[P(\mathrm{Case}\ 2.1.1)\times 2+P(\mathrm{Case}\ 2.1.2)\times 0\right]\\ &\quad +P(\mathrm{Case}\ 2)\times P(\mathrm{Case}\ 2.2)\times\left[P(\mathrm{Case}\ 2.2.1)\times 1+P(\mathrm{Case}\ 2.2.2)\times 3\right]\\ &= \frac{4^{k+1}-3\cdot 2^{k}+2}{2^{2k+1}} \end{aligned} $$

(22)
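The closed form in Eq. (22) can be checked by evaluating the case tree directly with exact rational arithmetic. The sketch below is only a verification aid; the function names are hypothetical, and the probabilities and change counts are copied from the cases listed above.

```python
from fractions import Fraction


def case_tree_D(k):
    """Expected number of changed cover bits, summed over the case tree."""
    m = 2 ** k
    p_nonzero = Fraction(m - 1, m)  # P(r != 0) and P(h != 0)
    p_zero = Fraction(1, m)         # P(r == 0) and P(h == 0)
    half = Fraction(1, 2)           # P(lambda == beta) = P(lambda != beta)

    case_1 = p_nonzero * (p_nonzero * 2 + p_zero * 1)
    case_2_1 = p_zero * half * (p_nonzero * 2 + p_zero * 0)
    case_2_2 = p_zero * half * (p_nonzero * 1 + p_zero * 3)
    return case_1 + case_2_1 + case_2_2


def closed_form_D(k):
    """Right-hand side of Eq. (22)."""
    return Fraction(4 ** (k + 1) - 3 * 2 ** k + 2, 2 ** (2 * k + 1))


if __name__ == "__main__":
    for k in range(2, 8):
        assert case_tree_D(k) == closed_form_D(k)
        print(k, case_tree_D(k))
```

For k = 2 both expressions give D = 27/16, that is, about 1.69 changed cover bits for every 5 embedded secret bits.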

Thus the bit-change rate (the average number of changes per cover bit) of the proposed method is:

$$ C=\frac{D}{N}=\frac{4^{k+1}-3\cdot {2}^k+2}{4\cdot \left({2}^{3k}-{2}^{2k}\right)} $$

(23)

where N = 2 · (2^k–1) is the total number of cover bits (the 2^k–1 pitch parameters plus the 2^k–1 L3-0 bits). The embedding rate R and the embedding efficiency E then follow:

$$ R=\frac{2k+1}{N}=\frac{2k+1}{2\cdot \left({2}^k-1\right)} $$

(24)

$$ E=\frac{R}{C}=\frac{\left(2k+1\right)\cdot {2}^{2k+1}}{4^{k+1}-3\cdot {2}^k+2} $$

(25)
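To get a feel for Eqs. (23) to (25), the quantities can be tabulated for a few values of k. This is a small sketch using exact fractions; the function name `metrics` is hypothetical, and the range of k shown is an arbitrary choice, not taken from the paper's experiments.

```python
from fractions import Fraction


def metrics(k):
    """Return (R, C, E) for a given k, following Eqs. (22)-(25)."""
    D = Fraction(4 ** (k + 1) - 3 * 2 ** k + 2, 2 ** (2 * k + 1))  # Eq. (22)
    N = 2 * (2 ** k - 1)          # total number of cover bits
    C = D / N                     # Eq. (23): bit-change rate
    R = Fraction(2 * k + 1, N)    # Eq. (24): embedding rate
    E = R / C                     # Eq. (25): embedding efficiency
    return R, C, E


if __name__ == "__main__":
    print(" k        R        C        E")
    for k in range(2, 7):
        R, C, E = metrics(k)
        print(f"{k:2d} {float(R):8.4f} {float(C):8.4f} {float(E):8.4f}")
```

Because N cancels in R/C, the efficiency is simply E = (2k + 1)/D: the number of secret bits embedded per changed cover bit. For k = 2 this gives R = 5/6, C = 9/32 and E = 80/27.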

About this article

Cite this article

Yan, S., Tang, G., Sun, Y. et al. A triple-layer steganography scheme for low bit-rate speech streams. Multimed Tools Appl 74, 11763–11782 (2015). https://doi.org/10.1007/s11042-014-2265-y

Received: 18 March 2014
Revised: 03 July 2014
Accepted: 26 August 2014
Published: 18 September 2014
Issue Date: December 2015
DOI: https://doi.org/10.1007/s11042-014-2265-y
