Content-Based Image Retrieval for Digital Forensics

  • Y. Chen
  • V. Roussev
  • G. Richard III
  • Y. Gao
Part of the IFIP — The International Federation for Information Processing book series (IFIPAICT, volume 194)


Digital forensic investigators are often faced with the task of manually examining a large number of (photographic) images to identify potential evidence. The task can be daunting and time-consuming if the target of the investigation is very broad, such as a web hosting service. Current forensic tools are woefully inadequate: they are largely confined to generating pages of thumbnail images and identifying known files through cryptographic hashes. This paper presents a new approach that significantly automates the examination process by relying on image analysis techniques. The strategy is to take previously identified content (e.g., contraband images) and perform feature extraction, which mathematically captures the essential properties of the images. Based on this analysis, a feature set database is constructed to facilitate automatic scanning of a target machine for images similar to the ones in the database. An important property of the approach is that it is not possible to recover the original image from the feature set. Therefore, it is possible to build a (potentially very large) database targeting known contraband images that investigators may be barred from collecting directly. The approach can be used to automatically search for case-specific images, contraband or otherwise, and to provide online monitoring of shared storage for early detection of specific images.
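
The workflow described above (extract features from known images, keep only the features, then scan a target for similar images) can be sketched roughly as follows. This is a minimal illustration and not the authors' method: it substitutes a coarse color histogram for the paper's feature extraction, treats an image as an in-memory list of RGB tuples, and every name in it (`extract_features`, `FeatureDatabase`, `scan`, the `threshold` value) is a hypothetical choice made for the example.

```python
from math import sqrt

BINS_PER_CHANNEL = 4  # assumption: 4 levels per RGB channel, giving a 64-bin histogram


def extract_features(pixels):
    """Return a normalized 64-bin color histogram for a list of (r, g, b) pixels.

    Many different images share the same histogram, so the original pixels
    cannot be reconstructed from this feature vector.
    """
    if not pixels:
        raise ValueError("image has no pixels")
    step = 256 // BINS_PER_CHANNEL
    hist = [0.0] * (BINS_PER_CHANNEL ** 3)
    for r, g, b in pixels:
        index = ((r // step) * BINS_PER_CHANNEL + (g // step)) * BINS_PER_CHANNEL + (b // step)
        hist[index] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist]


def distance(f1, f2):
    """Euclidean distance between two feature vectors of equal length."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))


class FeatureDatabase:
    """Stores feature vectors only; the source images are never retained."""

    def __init__(self):
        self._entries = []  # list of (label, feature vector)

    def add(self, label, pixels):
        self._entries.append((label, extract_features(pixels)))

    def scan(self, candidates, threshold=0.1):
        """Compare each candidate image (name -> pixels) against the database.

        Returns (name, label, distance) tuples for pairs whose distance is
        at most `threshold`, sorted from closest to farthest.
        """
        hits = []
        for name, pixels in candidates.items():
            features = extract_features(pixels)
            for label, known in self._entries:
                d = distance(features, known)
                if d <= threshold:
                    hits.append((name, label, d))
        return sorted(hits, key=lambda hit: hit[2])


if __name__ == "__main__":
    # Synthetic stand-ins for decoded images (hypothetical data).
    known = [(200, 30, 30)] * 90 + [(20, 20, 220)] * 10
    near_copy = [(205, 28, 35)] * 88 + [(25, 18, 215)] * 12
    unrelated = [(10, 240, 10)] * 100

    db = FeatureDatabase()
    db.add("case-001", known)
    for hit in db.scan({"a.jpg": near_copy, "b.jpg": unrelated}):
        print(hit)
```

In this toy run only the slightly altered copy falls under the threshold, whereas a cryptographic hash would treat it as an entirely different file. A practical system would need a far more discriminating feature set and a threshold tuned on real data.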


Digital forensics, image analysis, image retrieval



Copyright information

© International Federation for Information Processing 2006
