Abstract
Most state-of-the-art methods for detecting synonym-substitution-based steganography extract features that capture statistical distortion. However, synonym substitution causes not only statistical distortion but also semantic distortion. In this paper, we propose a word embedding feature (WEF) to detect the semantic distortion. Furthermore, we design a fused feature, the word embedding and statistical feature set (WESF), which combines WEF with a statistical feature based on word frequency, to improve detection performance. Experiments show that WESF achieves lower detection error rates than previous methods.
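The abstract does not define WEF or WESF, so the following is only an illustrative sketch of the general idea of pairing an embedding-based semantic score with a frequency-based statistical score. Everything in it is an assumption made for illustration: the helper names (`cosine`, `context_fit`, `fused_feature`), the toy two-dimensional embeddings, the toy frequency counts, and the choice of mean cosine similarity as the semantic measure. It should not be read as the authors' feature definition.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def context_fit(word, context, emb):
    """Mean cosine similarity between a word and its context words."""
    sims = [cosine(emb[word], emb[c]) for c in context if c in emb]
    return sum(sims) / len(sims) if sims else 0.0

def fused_feature(word, synonyms, context, emb, freq):
    """Return [semantic, statistical] for the observed word (hypothetical feature).

    semantic:    context fit of the observed word minus the best fit in its
                 synonym set (0.0 if it fits best, negative otherwise).
    statistical: frequency of the observed word relative to its synonym set.
    """
    fits = {w: context_fit(w, context, emb) for w in synonyms}
    semantic = fits[word] - max(fits.values())
    total = sum(freq[w] for w in synonyms)
    statistical = freq[word] / total if total else 0.0
    return [semantic, statistical]

# Toy data, invented for illustration only.
emb = {
    "big": [1.0, 0.0], "large": [0.9, 0.1], "great": [0.2, 0.9],
    "house": [1.0, 0.1], "garden": [0.8, 0.2],
}
freq = {"big": 50, "large": 30, "great": 20}
synonyms = ["big", "large", "great"]
context = ["house", "garden"]

cover = fused_feature("big", synonyms, context, emb, freq)    # word as originally written
stego = fused_feature("great", synonyms, context, emb, freq)  # word after a substitution
print(cover, stego)
```

In this toy setup the substituted word scores lower on both components, which mirrors the abstract's point that a substitution can leave a semantic trace as well as a statistical one.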
Acknowledgements
This work was supported in part by the Natural Science Foundation of China under Grants U1636201 and 61572452.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Zuo, X., Hu, H., Zhang, W., Yu, N. (2018). Text Semantic Steganalysis Based on Word Embedding. In: Sun, X., Pan, Z., Bertino, E. (eds) Cloud Computing and Security. ICCCS 2018. Lecture Notes in Computer Science, vol 11066. Springer, Cham. https://doi.org/10.1007/978-3-030-00015-8_42
DOI: https://doi.org/10.1007/978-3-030-00015-8_42
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00014-1
Online ISBN: 978-3-030-00015-8
eBook Packages: Computer Science, Computer Science (R0)