A Simple WordNet-Ontology Based Email Retrieval System for Digital Forensics

Because high-tech digital crime has a major impact on society, effective Information Retrieval (IR) tools are needed to support digital forensic investigations. In this paper, we propose an IR system for digital forensics that targets emails. Our system incorporates WordNet (i.e., a domain-independent ontology for vocabulary) into an Extended Boolean Model (EBM) by applying query expansion techniques. Structured Boolean queries in Backus-Naur Form (BNF) are used to help investigators express their information requirements effectively. We compare the performance of our system on several email datasets against a traditional Boolean IR system built on the Lucene keyword-only model. Experimental results show that our system yields a promising improvement in retrieval performance and does not require highly accurate query keywords to retrieve the most relevant emails.
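
To illustrate the general idea of combining query expansion with an Extended Boolean Model, the following is a minimal sketch and not the system described in the paper. It assumes the p-norm formulation of the EBM, a normalized term-frequency weight, and a small hand-written synonym table (`SYNONYMS`) standing in for WordNet lookups; all function names and the example email text are hypothetical.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical stand-in for WordNet synonym sets; a real system would
# look these up in WordNet rather than hard-code them.
SYNONYMS: Dict[str, List[str]] = {
    "money": ["cash", "funds"],
    "transfer": ["wire", "remit"],
}


def expand(term: str) -> List[str]:
    """Return the term followed by its (hypothetical) synonyms."""
    return [term] + SYNONYMS.get(term, [])


def term_weight(term: str, doc_tokens: List[str]) -> float:
    """Assumed weighting: term count divided by the largest count in the document."""
    if not doc_tokens:
        return 0.0
    counts = Counter(doc_tokens)
    return counts[term] / max(counts.values())


def p_or(weights: List[float], p: float = 2.0) -> float:
    """p-norm OR: high if any operand weight is high."""
    return (sum(w ** p for w in weights) / len(weights)) ** (1.0 / p)


def p_and(weights: List[float], p: float = 2.0) -> float:
    """p-norm AND: high only if all operand weights are high."""
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / len(weights)) ** (1.0 / p)


def score(query_terms: List[str], doc_tokens: List[str],
          use_expansion: bool = True, p: float = 2.0) -> float:
    """Score a conjunctive query; each term becomes an OR over its expansions."""
    per_term = []
    for term in query_terms:
        variants = expand(term) if use_expansion else [term]
        per_term.append(p_or([term_weight(v, doc_tokens) for v in variants], p))
    return p_and(per_term, p)


email = "please wire the cash today".split()
query = ["money", "transfer"]
expanded_score = score(query, email, use_expansion=True)
keyword_score = score(query, email, use_expansion=False)
```

In this toy case the email contains neither query keyword, so the keyword-only score is zero, while the expanded query still assigns the email a nonzero score through the synonyms "cash" and "wire". This mirrors the abstract's claim that relevant emails can be retrieved without exact query keywords, under the assumptions stated above.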

The authors thank the reviewers for their helpful comments. NICTA is funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy and the Australian Research Council through the ICT Centre of Excellence program.