A Distance Measure for Determining Similarity Between Criminal Investigations

  • Tim K. Cocx
  • Walter A. Kosters
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4065)


The information explosion has created both problems and possibilities in many areas of society, including law enforcement. By comparing individual criminal investigations for similarity, we exploit this surplus of information to determine which crimes may or may not have been committed by the same group of individuals.

For this purpose we introduce a new distance measure that is specifically suited to comparing investigations that differ greatly in the amount of available intelligence. It employs an adaptation of the probability density function of the normal distribution to define the distance between every pair of investigations.
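The abstract does not state the formula, so the following is only a minimal sketch of how a Gaussian-shaped curve could be turned into a distance between two investigations. Everything in it is an assumption made for illustration: the function name `gaussian_distance`, the representation of an investigation as a set of entities, the overlap ratio normalised by the smaller set, and the width parameter `sigma`. It should not be read as the measure proposed in the paper.

```python
import math


def gaussian_distance(a, b, sigma=0.4):
    """Hypothetical distance in [0, 1] between two sets of entities.

    Assumption: similarity is the share of the smaller investigation's
    entities that also occur in the larger one, passed through a
    bell-shaped (unnormalised Gaussian) curve.
    """
    a, b = set(a), set(b)
    if not a or not b:
        return 1.0  # assumption: no entities means no evidence of similarity
    # Normalise by the smaller set so that a sparsely documented
    # investigation is not pushed away merely for having less intelligence.
    overlap = len(a & b) / min(len(a), len(b))
    # The bell curve peaks at 1 when the overlap is complete.
    bell = math.exp(-((1.0 - overlap) ** 2) / (2.0 * sigma ** 2))
    return 1.0 - bell
```

Dividing by the size of the smaller set is one possible way to cope with investigations of very different sizes; a smaller `sigma` makes the distance rise more steeply as the overlap shrinks.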

We embed this distance measure in a four-step paradigm that extracts entities from a collection of documents and uses the measure to transform a high-dimensional vector table into input for a tool that police can operate. The resulting report is a two-dimensional representation of the distances between the various investigations and helps police officers in the field get a clearer picture of the current situation.
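The abstract does not enumerate the four steps, so the pipeline below is a sketch under stated assumptions rather than the authors' method. The split into steps, the fixed `VOCABULARY`, the substring-based `extract_entities`, the Jaccard stand-in distance, and the simple spring-style `layout_2d` are all hypothetical choices made to keep the example self-contained.

```python
import math
import random

# Hypothetical entity vocabulary; a real text miner would discover entities.
VOCABULARY = ["j. doe", "red van", "harbour street", "crowbar"]


def extract_entities(text):
    """Step 1 (assumed): find known entities in an investigation's text."""
    lowered = text.lower()
    return {e for e in VOCABULARY if e in lowered}


def to_vector_table(entity_sets):
    """Step 2 (assumed): one binary row per investigation, one column per entity."""
    return [[1 if e in s else 0 for e in VOCABULARY] for s in entity_sets]


def jaccard_distance(u, v):
    """Stand-in distance on binary rows; NOT the measure proposed in the paper."""
    both = sum(1 for x, y in zip(u, v) if x and y)
    either = sum(1 for x, y in zip(u, v) if x or y)
    return 1.0 - both / either if either else 1.0


def distance_matrix(table, distance=jaccard_distance):
    """Step 3 (assumed): distances between all pairs of investigations."""
    return [[0.0 if i == j else distance(u, v) for j, v in enumerate(table)]
            for i, u in enumerate(table)]


def layout_2d(dist, iterations=500, rate=0.05, seed=0):
    """Step 4 (assumed): place points so their gaps approximate dist[i][j]."""
    rng = random.Random(seed)
    n = len(dist)
    pos = [[rng.random(), rng.random()] for _ in range(n)]
    for _ in range(iterations):
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                dx = pos[j][0] - pos[i][0]
                dy = pos[j][1] - pos[i][1]
                gap = math.hypot(dx, dy) or 1e-9
                # Move i toward j if they are too far apart, away if too close.
                step = rate * (gap - dist[i][j]) / gap
                pos[i][0] += step * dx
                pos[i][1] += step * dy
    return pos


docs = [
    "J. Doe was seen near a red van on Harbour Street.",
    "A red van left Harbour Street; J. Doe is a suspect.",
    "A crowbar was recovered at the scene.",
]
table = to_vector_table([extract_entities(d) for d in docs])
points = layout_2d(distance_matrix(table))
```

In this toy run the first two investigations share all their entities and the third shares none, so the layout should draw the first two close together and keep the third apart. Any distance function on two rows can be passed to `distance_matrix` in place of the Jaccard stand-in.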


Distance Measure, Text Miner, Crime Scene, Individual Investigation, Common Entity





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Tim K. Cocx (1)
  • Walter A. Kosters (1)
  1. Leiden Institute of Advanced Computer Science (LIACS), Leiden University, The Netherlands
