High-Performance Linguistic Steganalysis, Capacity Estimation and Steganographic Positioning

Digital Forensics and Watermarking (IWDW 2020)

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 12617)

Abstract

With the rapid development of natural language processing technology, a growing number of linguistic steganographic methods have been proposed, which may pose great challenges to the governance of cyberspace security. Previous neural-network-based linguistic steganalysis methods that rely on a word embedding layer can only extract context-independent word-level features, which are insufficient for capturing the complex semantic dependencies in sentences and may therefore limit the performance of text steganalysis. In this paper, we propose a novel linguistic steganalysis model. We first employ a BERT or GloVe component to extract the contextualized association relationships of words in the sentences. We then feed these extracted features into a BiLSTM to capture further context information, and use an attention mechanism to find local parts of the text that may be discordant. Finally, based on these extracted features, a softmax classifier decides whether the input sentence is cover or stego. Experimental results show that the proposed model achieves the best current performance in text steganalysis and hidden capacity estimation. Further experiments show that the proposed model can even locate, to a certain extent, where the secret information may be embedded in the text. To the best of our knowledge, this is the first attempt at text steganography positioning in the field of text steganalysis. Code and datasets are available at https://github.com/YangzlTHU/Linguistic-Steganography-and-Steganalysis.
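The last two stages described in the abstract (attention over per-token features, then a softmax decision) can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product scoring function, the hypothetical names `attention_pool` and `classify`, and all numeric values are assumptions chosen for illustration. In the paper the per-token vectors would come from a BiLSTM over BERT or GloVe features; here they are hand-written toy numbers.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_pool(hidden_states, query):
    """Weight each token vector by softmax(h_t . query) and sum them.

    The dot-product score is an assumed form, not taken from the paper.
    """
    weights = softmax([dot(h, query) for h in hidden_states])
    dim = len(hidden_states[0])
    pooled = [sum(w * h[i] for w, h in zip(weights, hidden_states))
              for i in range(dim)]
    return pooled, weights


def classify(pooled, class_weights, class_bias):
    """Linear layer followed by softmax; index 0 = cover, 1 = stego."""
    logits = [dot(w, pooled) + b for w, b in zip(class_weights, class_bias)]
    return softmax(logits)


# Toy inputs (all values are made up for illustration).
tokens = ["the", "cat", "sat", "oddly"]
hidden_states = [[0.1, 0.0], [0.2, 0.1], [0.0, 0.3], [1.5, 1.2]]
query = [1.0, 1.0]
class_weights = [[-1.0, -1.0], [1.0, 1.0]]
class_bias = [0.5, -0.5]

pooled, attn = attention_pool(hidden_states, query)
probs = classify(pooled, class_weights, class_bias)

# The token with the largest attention weight is a candidate embedding position.
suspect = tokens[max(range(len(tokens)), key=lambda i: attn[i])]
print("attention weights:", attn)
print("P(cover), P(stego):", probs)
print("most attended token:", suspect)
```

Reading the attention weights per token is one plausible way to connect the classifier to the positioning idea mentioned in the abstract: tokens that receive high weight are the ones the model treats as most relevant to its cover/stego decision.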

Supported by the National Key Research and Development Program of China under Grant No. 2018YFB0804103 and the National Natural Science Foundation of China (No. U1936208, No. 61862002 and No. U1936216).

Notes

  1. Pre-trained GloVe word embeddings can be downloaded from http://nlp.stanford.edu/projects/glove/.

Author information

Corresponding author

Correspondence to Zhongliang Yang.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zou, J., Yang, Z., Zhang, S., Rehman, S.u., Huang, Y. (2021). High-Performance Linguistic Steganalysis, Capacity Estimation and Steganographic Positioning. In: Zhao, X., Shi, YQ., Piva, A., Kim, H.J. (eds) Digital Forensics and Watermarking. IWDW 2020. Lecture Notes in Computer Science, vol 12617. Springer, Cham. https://doi.org/10.1007/978-3-030-69449-4_7

  • DOI: https://doi.org/10.1007/978-3-030-69449-4_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-69448-7

  • Online ISBN: 978-3-030-69449-4

  • eBook Packages: Computer Science, Computer Science (R0)
