Clustering Heterogeneous Semi-structured Social Science Datasets for Security Applications

  • D. B. Skillicorn
  • C. Leuprecht
Part of the Advanced Sciences and Technologies for Security Applications book series (ASTSA)


Social scientists have begun to collect large datasets that are heterogeneous and semi-structured, but the ability to analyze such data has lagged behind its collection. We design a process to map such datasets to a numerical form, apply singular value decomposition clustering, and explore the impact of individual attributes or fields by overlaying visualizations of the clusters. This provides a new path for understanding such datasets, which we illustrate with three real-world examples: the Global Terrorism Database, which records details of every terrorist attack since 1970; a Chicago police dataset, which records details of every drug-related incident over a period of approximately a month; and a dataset describing members of a Hezbollah crime/terror network in the U.S.
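The pipeline described above (map records to numbers, then apply singular value decomposition) can be sketched roughly as follows. The abstract does not give the chapter's actual encoding, so everything here is an assumption made for illustration: the field names, the sample records, the helper names `hash_to_unit`, `encode`, and `svd_embed`, the use of an MD5-based hash for non-numeric values, and z-scoring with missing values set to the column mean.

```python
import hashlib
import numpy as np

def hash_to_unit(value: str) -> float:
    """Deterministically map a string to a number in [0, 1] (illustrative choice)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def encode(records, fields):
    """Build a numeric matrix: numbers pass through, other values are hashed,
    and missing fields become NaN."""
    rows = []
    for rec in records:
        row = []
        for field in fields:
            value = rec.get(field)
            if value is None:
                row.append(np.nan)
            elif isinstance(value, (int, float)):
                row.append(float(value))
            else:
                row.append(hash_to_unit(str(value)))
        rows.append(row)
    return np.array(rows, dtype=float)

def svd_embed(matrix, k=2):
    """Z-score each column, replace missing entries with 0 (the column mean
    after z-scoring), and return each row's coordinates in the top-k SVD space."""
    mean = np.nanmean(matrix, axis=0)
    std = np.nanstd(matrix, axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    z = np.nan_to_num((matrix - mean) / std, nan=0.0)
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :k] * s[:k]

# Hypothetical records with mixed types and a missing value.
fields = ["attack_type", "country", "fatalities"]
records = [
    {"attack_type": "bombing", "country": "A", "fatalities": 3},
    {"attack_type": "bombing", "country": "A", "fatalities": 3},
    {"attack_type": "armed assault", "country": "B", "fatalities": 0},
    {"attack_type": "kidnapping", "country": "B"},  # fatalities missing
]

matrix = encode(records, fields)
coords = svd_embed(matrix, k=2)  # one 2-D point per record, ready to plot
print(coords.shape)
```

Each row of `coords` can be plotted as a point, and the points can then be coloured by any single field to see how that field aligns with the clusters. Note that hashing a categorical value to a single number imposes an arbitrary ordering on its categories; other encodings are possible, and this sketch makes no claim about which one the chapter uses.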


Keywords: Clustering, Hashing, Terrorism, Crime, Global terrorism database, Chicago policing, Hezbollah



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. School of Computing, Queen’s University, Kingston, Canada
  2. Political Science, Royal Military College of Canada, Kingston, Canada
  3. Flinders University of South Australia, Adelaide, Australia
