Abstract
This paper proposes a method, called Oscar, for determining the probable file type of binary data fragments. The Oscar method is based on building models, called centroids, of the mean and standard deviation of the byte frequency distribution of different file types. A weighted quadratic distance metric is then used to measure the distance between a centroid and a sample data fragment. If the distance falls below a threshold, the sample is categorized as probably belonging to the modelled file type. Oscar is tested using JPEG pictures and is shown to give a high categorization accuracy, i.e. a high detection rate and a low false-positive rate. Using a practical example, we demonstrate how the Oscar method can be used to prove the existence of known pictures based on fragments of them found in the RAM and the swap partition of a computer.
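The pipeline described in the abstract (build a centroid, measure a weighted quadratic distance, compare against a threshold) can be illustrated with a minimal Python sketch. The abstract does not give the exact weighting, so the form used here, squared deviation from the mean divided by the standard deviation plus a small smoothing constant, is an assumption and not necessarily the authors' metric. The function names, the smoothing constant alpha, and the threshold value are all hypothetical.

```python
from statistics import mean, pstdev


def byte_frequencies(fragment: bytes) -> list[float]:
    """Return the relative frequency of each of the 256 byte values."""
    if not fragment:
        raise ValueError("fragment must not be empty")
    counts = [0] * 256
    for value in fragment:
        counts[value] += 1
    return [count / len(fragment) for count in counts]


def build_centroid(fragments: list[bytes]) -> tuple[list[float], list[float]]:
    """Model a file type as per-byte-value mean and standard deviation."""
    distributions = [byte_frequencies(f) for f in fragments]
    columns = list(zip(*distributions))  # one column per byte value
    means = [mean(column) for column in columns]
    stds = [pstdev(column) for column in columns]
    return means, stds


def weighted_quadratic_distance(sample, centroid, alpha=1e-6):
    """Assumed weighting: (x - mean)^2 / (std + alpha), summed over byte values.

    alpha is a hypothetical smoothing constant that avoids division by zero.
    """
    means, stds = centroid
    return sum((x - m) ** 2 / (s + alpha) for x, m, s in zip(sample, means, stds))


def probably_same_type(fragment: bytes, centroid, threshold: float) -> bool:
    """Categorize a fragment as the modelled type if it lies below the threshold."""
    distance = weighted_quadratic_distance(byte_frequencies(fragment), centroid)
    return distance < threshold


# Toy data standing in for training fragments of one file type (4096 bytes each).
training = [bytes([0, 1, 2, 3]) * 1024, bytes([0, 1, 2, 3]) * 1024]
centroid = build_centroid(training)
similar = probably_same_type(bytes([3, 2, 1, 0]) * 1024, centroid, threshold=1.0)
different = probably_same_type(bytes([255]) * 4096, centroid, threshold=1.0)
```

In this toy run, a fragment with the same byte frequencies as the training data falls below the threshold, while a fragment consisting of a single repeated byte value does not. A real centroid would be built from many fragments of actual files, and the threshold would be tuned against measured detection and false-positive rates.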
Copyright information
© 2006 International Federation for Information Processing
About this paper
Cite this paper
Karresand, M., Shahmehri, N. (2006). Oscar — File Type Identification of Binary Data in Disk Clusters and RAM Pages. In: Fischer-Hübner, S., Rannenberg, K., Yngström, L., Lindskog, S. (eds) Security and Privacy in Dynamic Environments. SEC 2006. IFIP International Federation for Information Processing, vol 201. Springer, Boston, MA. https://doi.org/10.1007/0-387-33406-8_35
DOI: https://doi.org/10.1007/0-387-33406-8_35
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-33405-9
Online ISBN: 978-0-387-33406-6
eBook Packages: Computer Science, Computer Science (R0)