Reducing the Time Required for Hashing Operations

  • Frank Breitinger
  • Kaloyan Petrov
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 410)

Abstract

Due to the increasingly massive amounts of data that need to be analyzed in digital forensic investigations, it is necessary to automatically recognize suspect files and filter out non-relevant files. To achieve this goal, digital forensic practitioners employ hashing algorithms to classify files into known-good, known-bad and unknown files. However, a typical personal computer may store hundreds of thousands of files and the task becomes extremely time-consuming. This paper attempts to address the problem using a framework that speeds up processing by using multiple threads. Unlike a typical multithreading approach, where the hashing algorithm is performed by multiple threads, the proposed framework incorporates a dedicated prefetcher thread that reads files from a device. Experimental results demonstrate a runtime efficiency of nearly 40% over single threading.
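The prefetching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' framework: the names `prefetch`, `hash_files`, and `max_buffered` are hypothetical, SHA-256 is an arbitrary choice of algorithm, and each file is read into memory whole, which a real tool would likely replace with chunked reads. One dedicated thread reads files from the device into a bounded queue while the main thread hashes whatever has already been read, so device I/O and hashing can overlap.

```python
import hashlib
import queue
import threading
from pathlib import Path

_DONE = object()  # sentinel marking the end of the file stream


def prefetch(paths, buffer):
    """Dedicated reader: load each file and hand its bytes to the hashing side."""
    for path in paths:
        try:
            data = Path(path).read_bytes()
        except OSError:
            data = None  # unreadable file: record it instead of stalling the pipeline
        buffer.put((path, data))  # blocks while the buffer is full
    buffer.put(_DONE)


def hash_files(paths, algorithm="sha256", max_buffered=8):
    """Return {path: hex digest or None}, overlapping file reads with hashing."""
    buffer = queue.Queue(maxsize=max_buffered)  # bounded, so memory use stays limited
    reader = threading.Thread(
        target=prefetch, args=(list(paths), buffer), daemon=True
    )
    reader.start()

    digests = {}
    while True:
        item = buffer.get()
        if item is _DONE:
            break
        path, data = item
        if data is None:
            digests[path] = None
        else:
            digests[path] = hashlib.new(algorithm, data).hexdigest()
    reader.join()
    return digests
```

Calling `hash_files(["a.bin", "b.bin"])` returns a dictionary mapping each path to its digest, or to `None` for files that could not be read. The resulting digests could then be compared against a known-good or known-bad hash set; any speed-up from the overlap depends on the storage device and file sizes and is not measured here.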

Keywords

File hashing, runtime performance, file handling, prefetching

Copyright information

© IFIP International Federation for Information Processing 2013

Authors and Affiliations

  • Frank Breitinger (1, 2)
  • Kaloyan Petrov (3)
  1. University of Applied Sciences, Darmstadt, Germany
  2. Center for Advanced Security Research Darmstadt (CASED), Darmstadt, Germany
  3. Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Sofia, Bulgaria
