Robust Hash Algorithms for Text

  • Martin Steinebach
  • Peter Klöckner
  • Nils Reimers
  • Dominik Wienand
  • Patrick Wolf
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8099)

Abstract

We discuss and compare robust hash functions for natural text with respect to their performance regarding text modification and natural language watermark embedding. Our goal is to identify algorithms suitable for efficiently identifying watermarked copies of eBooks before watermark detection.

Keywords

Robust Hashing, Text Watermarking, Evaluation

Copyright information

© IFIP International Federation for Information Processing 2013

Authors and Affiliations

  • Martin Steinebach (1, 2, 3)
  • Peter Klöckner (2)
  • Nils Reimers (2)
  • Dominik Wienand (2)
  • Patrick Wolf (3)

  1. Fraunhofer SIT, Darmstadt, Germany
  2. CASED, Darmstadt, Germany
  3. CoSee GmbH, Darmstadt, Germany