Towards a Process Model for Hash Functions in Digital Forensics

  • Frank Breitinger
  • Huajian Liu
  • Christian Winter
  • Harald Baier
  • Alexey Rybalchenko
  • Martin Steinebach
Conference paper
Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 132)

Abstract

Forensic investigations are becoming increasingly difficult as the amount of data to be analyzed grows continuously. Hash functions are a common approach to automated file identification. The procedure is quite simple: a tool hashes all files on a seized device and compares the hashes against a database. Depending on the database, this allows investigators to discard non-relevant files (whitelisting) or to detect suspicious files (blacklisting).
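The hash-and-compare procedure described above can be sketched as follows. This is a minimal illustration, not the tooling used in the paper: the function names `sha256_of` and `triage`, the choice of SHA-256, and the use of in-memory sets as the "database" are all assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def triage(root, whitelist, blacklist):
    """Sort every file below `root` into one of three buckets.

    `whitelist` and `blacklist` are sets of hex digests standing in
    for a reference database (an assumption of this sketch).
    """
    result = {"known_good": [], "suspicious": [], "unknown": []}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        file_hash = sha256_of(path)
        if file_hash in blacklist:
            result["suspicious"].append(path)   # blacklisting: flag the file
        elif file_hash in whitelist:
            result["known_good"].append(path)   # whitelisting: discard the file
        else:
            result["unknown"].append(path)      # needs further analysis
    return result
```

Only exact byte-for-byte copies are matched this way; a file that differs from a database entry in a single byte ends up in the `unknown` bucket.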

One can distinguish three kinds of algorithms: (cryptographic) hash functions, bytewise approximate matching, and semantic approximate matching (a.k.a. perceptual hashing). The main difference is the level at which they operate: semantic approximate matching operates on the semantic level, while the other two approaches operate on the byte level. Hence, investigators have three different approaches at hand to analyze a device.
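To illustrate the difference between the two byte-level approaches, the sketch below compares a cryptographic hash with a deliberately simple similarity measure. The n-gram Jaccard similarity used here is an assumption chosen for brevity; it is not one of the approximate matching algorithms surveyed in the paper, and the names `ngrams` and `jaccard_similarity` are hypothetical. Semantic approximate matching is not covered by this example.

```python
import hashlib


def ngrams(data, n=4):
    """Return the set of all overlapping n-byte substrings of `data`."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard_similarity(a, b, n=4):
    """Share of n-grams common to both inputs, between 0.0 and 1.0."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0  # two inputs too short to yield n-grams count as identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)


original = b"The quick brown fox jumps over the lazy dog" * 20
modified = original.replace(b"lazy", b"hazy", 1)  # change a single byte

# A cryptographic hash only answers "identical or not".
same_digest = (hashlib.sha256(original).digest()
               == hashlib.sha256(modified).digest())

# A byte-level similarity measure still reports a large overlap.
similarity = jaccard_similarity(original, modified)

print("digests equal:", same_digest)
print("n-gram similarity:", round(similarity, 3))
```

The single changed byte makes the two digests differ, while the similarity score remains close to 1.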

First, this paper gives a comprehensive overview of existing approaches for bytewise and semantic approximate matching (for semantic approximate matching, we focus on image functions). Second, we compare implementations and summarize the strengths and weaknesses of all approaches. Third, we show, based on a sample use case, how to integrate these functions into an existing process model, the computer forensics field triage process model.

Keywords

Digital forensics, Hashing, Similarity hashing, Robust hashing, Perceptual hashing, Approximate matching, Process model

Copyright information

© Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2014

Authors and Affiliations

  • Frank Breitinger (1)
  • Huajian Liu (2)
  • Christian Winter (2)
  • Harald Baier (1)
  • Alexey Rybalchenko (1)
  • Martin Steinebach (2)

  1. da/sec - Biometrics and Internet Security Research Group, Hochschule Darmstadt, Darmstadt, Germany
  2. Fraunhofer Institute for Secure Information Technology, Darmstadt, Germany
