Data Mining and Knowledge Discovery, Volume 28, Issue 5–6, pp 1398–1428

Uncovering the plot: detecting surprising coalitions of entities in multi-relational schemas

  • Hao Wu
  • Jilles Vreeken
  • Nikolaj Tatti
  • Naren Ramakrishnan
Article

DOI: 10.1007/s10618-014-0370-1

Cite this article as:
Wu, H., Vreeken, J., Tatti, N. et al. Data Min Knowl Disc (2014) 28: 1398. doi:10.1007/s10618-014-0370-1

Abstract

Many application domains such as intelligence analysis and cybersecurity require tools for the unsupervised identification of suspicious entities in multi-relational/network data. In particular, there is a need for automated or semi-automated approaches to ‘uncover the plot’, i.e., to detect non-obvious coalitions of entities bridging many types of relations. We cast the problem of detecting such suspicious coalitions and their connections as one of mining surprisingly dense and well-connected chains of biclusters over multi-relational data. With this as our goal, we model data by the Maximum Entropy principle, such that in a statistically well-founded way we can gauge the surprisingness of a discovered bicluster chain with respect to what we already know. We design an algorithm for approximating the most informative multi-relational patterns, and provide strategies to incrementally organize discovered patterns into the background model. We illustrate how our method is adept at discovering the hidden plot in multiple synthetic and real-world intelligence analysis datasets. Our approach naturally generalizes traditional attribute-based maximum entropy models for single relations, and further supports iterative, human-in-the-loop knowledge discovery.

Keywords

Multi-relational data; Maximum entropy modeling; Subjective interestingness; Pattern mining; Biclusters

Copyright information

© The Author(s) 2014

Authors and Affiliations

  • Hao Wu (1, 2)
  • Jilles Vreeken (3, 4)
  • Nikolaj Tatti (5, 6)
  • Naren Ramakrishnan (2, 7)

  1. Department of Electrical and Computer Engineering, Virginia Tech, Arlington, USA
  2. Discovery Analytics Center, Virginia Tech, Arlington, USA
  3. Max Planck Institute for Informatics, Saarbrücken, Germany
  4. Cluster of Excellence MMCI, Saarland University, Saarbrücken, Germany
  5. HIIT, Department of Information and Computer Science, Aalto University, Aalto, Finland
  6. Department of Computer Science, KU Leuven, Leuven, Belgium
  7. Department of Computer Science, Virginia Tech, Arlington, USA
