Natural Language Watermarking: Design, Analysis, and a Proof-of-Concept Implementation

  • Mikhail J. Atallah
  • Victor Raskin
  • Michael Crogan
  • Christian Hempelmann
  • Florian Kerschbaum
  • Dina Mohamed
  • Sanket Naik
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2137)


We describe a scheme for watermarking natural language text by embedding small portions of the watermark bit string in the syntactic structure of a number of selected sentences in the text, with both the selection and the embedding keyed (via quadratic residue) to a large prime number. Meaning-preserving transformations of sentences of the text (e.g., translation to another natural language) cannot damage the watermark. Meaning-modifying transformations have a probability of damaging the watermark that is proportional to the watermark length over the number of sentences. Having the key is all that is required for reading the watermark. The approach is best suited for longish, meaning-oriented rather than style-oriented "expository" texts (e.g., reports, directives, manuals), which governments and industry produce in abundance and which need protection more frequently than fiction or poetry, which are not so tolerant of the small meaning-preserving syntactic changes that the scheme implements.
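As a rough illustration of what "keyed via quadratic residue to a large prime" can mean in practice, the following Python sketch tests whether a number derived from each sentence is a quadratic residue modulo a secret prime and selects the sentences that pass. It is only a sketch under stated assumptions, not the scheme from the paper: the abstract does not say how sentences are mapped to integers, so the SHA-256 hash in `sentence_value`, the helper names, and the toy key `2**61 - 1` are all hypothetical. In particular, hashing the surface string is a stand-in for whatever sentence representation the scheme actually uses.

```python
import hashlib


def is_quadratic_residue(a: int, p: int) -> bool:
    """Euler's criterion: for an odd prime p and a not divisible by p,
    a is a quadratic residue mod p iff a^((p-1)/2) is congruent to 1 mod p."""
    a %= p
    if a == 0:
        return False  # assumption: multiples of p are simply not selected
    return pow(a, (p - 1) // 2, p) == 1


def sentence_value(sentence: str) -> int:
    """Hypothetical mapping from a sentence to an integer (assumption:
    a SHA-256 digest of the text; the abstract does not specify one)."""
    digest = hashlib.sha256(sentence.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def select_sentences(sentences: list[str], p: int) -> list[int]:
    """Return the indices of sentences whose value is a quadratic residue mod p."""
    return [
        i
        for i, s in enumerate(sentences)
        if is_quadratic_residue(sentence_value(s), p)
    ]


if __name__ == "__main__":
    secret_prime = 2**61 - 1  # toy key; a real key would be much larger
    text = [
        "The valve must be closed before maintenance begins.",
        "Operators shall record the pressure every hour.",
        "Any deviation is to be reported to the supervisor.",
    ]
    print(select_sentences(text, secret_prime))
```

Because about half of the nonzero values modulo an odd prime are quadratic residues, a test like this marks roughly half of the sentences for a given key, and someone without the prime cannot easily tell which ones were chosen.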


Machine Translation; Syntactic Structure; Secret Message; Lexical Entry; Information Hiding
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Mikhail J. Atallah (1)
  • Victor Raskin (2)
  • Michael Crogan (1)
  • Christian Hempelmann (2)
  • Florian Kerschbaum (1)
  • Dina Mohamed (2)
  • Sanket Naik (1)
  1. CERIAS and Dept. of Computer Science, Purdue University, West Lafayette, USA
  2. Natural Language Processing Lab., CERIAS, Interdepartmental Program in Linguistics, USA
