A Cloud Computing Platform for Large-Scale Forensic Computing

  • Vassil Roussev
  • Liqiang Wang
  • Golden Richard
  • Lodovico Marziale
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 306)


The timely processing of massive digital forensic collections demands the use of large-scale distributed computing resources and the flexibility to customize the processing performed on the collections. This paper describes MPI MapReduce (MMR), an open implementation of the MapReduce processing model that outperforms traditional forensic computing techniques. MMR provides linear scaling for CPU-intensive processing and super-linear scaling for indexing-related workloads.


Cluster computing, large-scale forensics, MapReduce



Copyright information

© IFIP International Federation for Information Processing 2009
