Natural Language Watermarking Using Semantic Substitution for Chinese Text

  • Yuei-Lin Chiang
  • Lu-Ping Chang
  • Wen-Tai Hsieh
  • Wen-Chih Chen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2939)


Numerous schemes have been designed for watermarking multimedia content, and many of them are vulnerable to watermark-erasing attacks. Such schemes are ineffective on text unless the text is represented as a bitmap image, and in that case the watermark can easily be erased by using Optical Character Recognition (OCR) to convert the bitmap into a character encoding such as ASCII or EBCDIC. This study attempts to develop a method for embedding a watermark in text that is as successful as frequency-domain methods have been for images and audio. The proposed method embeds the watermark in the original text through various semantic replacements, producing a ciphertext that preserves the meaning of the original text.
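The general idea of hiding bits in word choice can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify; it is a generic synonym-substitution scheme under stated assumptions: the text is already segmented into words, and the synonym pairs in `SYNONYM_PAIRS` are a hypothetical, hand-picked table in which each pair is treated as interchangeable in context. Each substitutable word carries one bit, chosen by which member of its pair appears in the text.

```python
# Hypothetical synonym pairs: index 0 encodes bit 0, index 1 encodes bit 1.
SYNONYM_PAIRS = [
    ("高興", "快樂"),  # "happy"
    ("美麗", "漂亮"),  # "beautiful"
    ("立刻", "馬上"),  # "immediately"
]

# Map every word to (its pair, its position in the pair) for fast lookup.
LOOKUP = {word: (pair, i) for pair in SYNONYM_PAIRS for i, word in enumerate(pair)}


def embed(tokens, bits):
    """Return a copy of tokens in which each substitutable word carries one bit."""
    out, remaining = [], list(bits)
    for tok in tokens:
        if tok in LOOKUP and remaining:
            pair, _ = LOOKUP[tok]
            # Pick the pair member whose index equals the next watermark bit.
            out.append(pair[remaining.pop(0)])
        else:
            out.append(tok)
    if remaining:
        raise ValueError("text has too few substitutable words for the watermark")
    return out


def extract(tokens, n_bits):
    """Read back the first n_bits bits from the substitutable words."""
    bits = [LOOKUP[tok][1] for tok in tokens if tok in LOOKUP]
    return bits[:n_bits]


# Pre-segmented example sentence (word segmentation is assumed to be done elsewhere).
original = ["她", "很", "高興", "，", "立刻", "買", "了", "漂亮", "的", "花"]
marked = embed(original, [1, 0, 0])
recovered = extract(marked, 3)
```

Because the bits live in the choice of words rather than in pixel positions or spacing, retyping the text or passing it through OCR leaves them intact. A practical scheme would also need word-sense disambiguation and a secret key to select and order the substitution sites, both of which this sketch omits.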


Natural Language Processing; Synonym Substitution; Original Text; Optical Character Recognition; Digital Watermark
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Yuei-Lin Chiang (1)
  • Lu-Ping Chang (1)
  • Wen-Tai Hsieh (1)
  • Wen-Chih Chen (1)
  1. Advanced e-Commerce Technology Lab., Institute for Information Industry, Taipei, Taiwan, R.O.C.
