Mining exceptional closed patterns in attributed graphs

  • Anes Bendimerad
  • Marc Plantevit
  • Céline Robardet
Regular Paper

Abstract

Geo-located social media provide a large amount of information describing urban areas based on user descriptions and comments. Such data make it possible to identify meaningful city neighborhoods on the basis of the footprints left by the large and diverse population that uses this type of media. In this paper, we present methods that reveal the predominant activities and their associated urban areas in order to automatically describe a whole city. Based on a suitable attributed graph model, our approach identifies neighborhoods with homogeneous and exceptional characteristics. We introduce the novel problem of exceptional subgraph mining in attributed graphs and propose a complete algorithm that takes advantage of closure operators, new upper bounds and pruning properties. We also define an approach to sample the space of closed exceptional subgraphs within a given time budget. Experiments performed on ten real datasets demonstrate the relevance of both approaches and also show their limits.
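
To make the notion of a closed pattern in an attributed graph more concrete, here is a minimal sketch. It is not the algorithm of the paper: the toy graph, the label sets and the function names (`neighbors`, `components`, `closed_patterns`) are all assumptions made for illustration, and no exceptionality measure, upper bound or pruning is modeled. The sketch only shows one common way a closure can be computed: select the vertices that match a description, split them into connected components, and take as closure the set of labels shared by every vertex of a component.

```python
from collections import deque

# Hypothetical toy data: each vertex is a city area carrying a set of
# activity labels; edges link adjacent areas.
LABELS = {
    "a": {"food", "nightlife"},
    "b": {"food", "nightlife", "shops"},
    "c": {"food", "nightlife"},
    "d": {"shops"},
    "e": {"food", "nightlife"},
}
EDGES = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")}


def neighbors(v):
    """Vertices adjacent to v in the undirected toy graph."""
    return {y if x == v else x for x, y in EDGES if v in (x, y)}


def components(vertices):
    """Connected components of the subgraph induced by `vertices`."""
    remaining, result = set(vertices), []
    while remaining:
        start = remaining.pop()
        comp, queue = {start}, deque([start])
        while queue:
            # Breadth-first expansion restricted to unvisited vertices.
            for n in neighbors(queue.popleft()) & remaining:
                remaining.discard(n)
                comp.add(n)
                queue.append(n)
        result.append(frozenset(comp))
    return result


def closed_patterns(description):
    """Return (closure, component) pairs for a set of required labels."""
    support = {v for v, labs in LABELS.items() if description <= labs}
    patterns = []
    for comp in components(support):
        # Closure: every label shared by all vertices of the component.
        closure = frozenset(set.intersection(*(LABELS[v] for v in comp)))
        patterns.append((closure, comp))
    return patterns
```

On this toy graph, `closed_patterns({"food"})` yields two components, {a, b, c} and {e}, and both close to the larger description {food, nightlife}: the label "nightlife" comes for free with "food" in each component, which is why an enumeration only needs to visit closed descriptions.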

Keywords

  • Exceptional subgraph mining
  • Pattern mining
  • Urban data analysis

Notes

Acknowledgements

This work was supported in part by the Group Image Mining (GIM), which brings together researchers from THALES Group and the LIRIS Lab. We especially thank Jérôme Kodjabachian and Bertrand Duqueroie of the AS&BSIM Lab. of THALES Group. This work was also partially supported by the EU FP7-PEOPLE-2013-IAPP project GRAISearch.

Copyright information

© Springer-Verlag London Ltd. 2017

Authors and Affiliations

  1. Univ Lyon, INSA Lyon, CNRS, LIRIS UMR5205, Lyon, France
  2. Univ Lyon, Université Lyon 1, CNRS, LIRIS UMR5205, Lyon, France
