Similarity Preserving Hashing: Eligible Properties and a New Algorithm MRSH-v2

  • Frank Breitinger
  • Harald Baier
Conference paper

DOI: 10.1007/978-3-642-39891-9_11

Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 114)
Cite this paper as:
Breitinger F., Baier H. (2013) Similarity Preserving Hashing: Eligible Properties and a New Algorithm MRSH-v2. In: Rogers M., Seigfried-Spellar K.C. (eds) Digital Forensics and Cyber Crime. ICDF2C 2012. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 114. Springer, Berlin, Heidelberg

Abstract

Hash functions are a widespread class of functions in computer science and are used in several applications, e.g. in computer forensics to identify known files. One basic property of cryptographic hash functions is the avalanche effect, which causes a significantly different output if the input is changed slightly. As some applications also need to identify similar files (e.g. spam/virus detection), this raised the need for Similarity Preserving Hashing. In recent years, several approaches have emerged, each with different naming, properties, strengths and weaknesses, which is due to the lack of a common definition.
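
The avalanche effect mentioned above can be illustrated with a minimal sketch. It is not taken from the paper: it assumes SHA-256 from Python's standard `hashlib` module as a representative cryptographic hash function, and the helper `bit_difference` and the sample inputs are hypothetical names chosen for illustration only.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Two inputs that differ in a single character (illustrative sample data).
original = b"The quick brown fox jumps over the lazy dog"
modified = b"The quick brown fox jumps over the lazy cog"

# SHA-256 stands in for "a cryptographic hash function" (an assumption).
digest_original = hashlib.sha256(original).digest()
digest_modified = hashlib.sha256(modified).digest()

changed_bits = bit_difference(digest_original, digest_modified)
total_bits = len(digest_original) * 8
print(f"{changed_bits} of {total_bits} digest bits differ")
```

For a hash function with a good avalanche effect, roughly half of the digest bits are expected to change, so the two digests give no hint that the inputs are nearly identical. This is the behaviour that makes such functions unsuitable for detecting similar files.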

Based on the properties and use cases of traditional hash functions, this paper discusses uniform naming and properties, which is a first step towards a suitable definition of Similarity Preserving Hashing. Additionally, we extend the algorithm MRSH for Similarity Preserving Hashing to its successor MRSH-v2, which has three distinguishing features. First, it fulfills all our proposed defining properties; second, it outperforms existing approaches, especially with respect to run time performance; and third, it has two detection modes. The regular mode of MRSH-v2 is used to identify similar files, whereas the f-mode is optimal for fragment detection, i.e. to identify similar parts of a file.

Keywords

  • Digital forensics
  • Similarity Preserving Hashing
  • fuzzy hashing
  • MRSH-v2
  • properties of Similarity Preserving Hashing

Copyright information

© ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering 2013

Authors and Affiliations

  • Frank Breitinger (1)
  • Harald Baier (1)
  1. da/sec Biometrics and Internet Security Research Group, Hochschule Darmstadt, Darmstadt, Germany
