Knowledge and Information Systems, Volume 50, Issue 2, pp 417–446

MiMAG: mining coherent subgraphs in multi-layer graphs with edge labels

  • Brigitte Boden
  • Stephan Günnemann
  • Holger Hoffmann
  • Thomas Seidl
Regular Paper

Abstract

Detecting dense subgraphs such as cliques or quasi-cliques is an important graph mining problem. While this task is established for simple graphs, today’s applications demand the analysis of more complex graphs: In this work, we consider a frequently observed type of graph where edges represent different types of relations. These multiple edge types can also be viewed as different “layers” of a graph, which is denoted as a “multi-layer graph”. Additionally, each edge might be annotated by a label characterizing the given relation in more detail. By simultaneously exploiting all this information, the detection of more interesting subgraphs can be supported. We introduce the multi-layer coherent subgraph model, which defines clusters of vertices that are densely connected by edges with similar labels in a subset of the graph layers. We avoid redundancy in the result by selecting only the most interesting, non-redundant subgraphs for the output. Based on this model, we introduce the best-first search algorithm MiMAG. In thorough experiments, we demonstrate the strengths of MiMAG in comparison with related approaches on synthetic as well as real-world data sets.
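The idea of a vertex set that is "densely connected by edges with similar labels in a subset of the graph layers" can be illustrated with a small sketch. This is not the MiMAG algorithm and not the paper's formal model: the function name `coherent_layers`, the parameters `gamma` (density threshold) and `width` (maximum label spread), the quasi-clique-style degree condition, and the toy data are all assumptions chosen for illustration.

```python
from math import ceil


def coherent_layers(layers, vertices, gamma=0.5, width=1.0):
    """Return the layers in which `vertices` is dense and label-coherent.

    layers: dict mapping layer name -> dict mapping frozenset({u, v}) -> numeric label
    gamma:  assumed density threshold; each vertex needs at least
            ceil(gamma * (|vertices| - 1)) neighbours inside the set
    width:  assumed maximum spread (max - min) of the edge labels inside the set
    """
    vertices = set(vertices)
    need = ceil(gamma * (len(vertices) - 1))
    result = []
    for name, edges in layers.items():
        # Keep only the edges that lie completely inside the candidate vertex set.
        inner = {e: label for e, label in edges.items() if e <= vertices}
        if not inner:
            continue
        # Label coherence (assumed form): all inner labels fit in a window of size `width`.
        labels = list(inner.values())
        if max(labels) - min(labels) > width:
            continue
        # Density (quasi-clique style): count each vertex's neighbours inside the set.
        degree = {v: 0 for v in vertices}
        for e in inner:
            for v in e:
                degree[v] += 1
        if all(d >= need for d in degree.values()):
            result.append(name)
    return result


def edge(u, v):
    """Build an undirected edge key (hypothetical helper)."""
    return frozenset((u, v))


# Toy three-layer graph; layer names and labels are made up for illustration.
layers = {
    "coauthor": {edge("a", "b"): 3.0, edge("b", "c"): 3.5,
                 edge("a", "c"): 3.2, edge("c", "d"): 9.0},
    "citation": {edge("a", "b"): 1.0, edge("a", "c"): 8.0, edge("b", "c"): 1.5},
    "email": {edge("a", "b"): 2.0},
}

print(coherent_layers(layers, {"a", "b", "c"}, gamma=1.0, width=1.0))
```

With these toy values, the set {a, b, c} qualifies only in the "coauthor" layer: the "citation" layer is dense but its labels differ too much, and the "email" layer has similar labels but is too sparse. A full approach would additionally have to search over vertex sets and remove redundant results, which this sketch does not attempt.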

Keywords

Clustering, Graph, Network, Subspace, Multi-layer graph, Labels

Copyright information

© Springer-Verlag London 2016

Authors and Affiliations

  • Brigitte Boden (1)
  • Stephan Günnemann (2)
  • Holger Hoffmann (1)
  • Thomas Seidl (1)
  1. Data Management and Data Exploration Group, RWTH Aachen University, Aachen, Germany
  2. Department of Informatics, Technical University of Munich, Munich, Germany
