Abstract
Directed acyclic graphs (DAGs) are used in many domains, from computer science and bioinformatics to industry and geoscience. They make it possible to model complex evolutions in which spatial objects (e.g., eroded soil areas) may move, appear, disappear, merge or split. We study a new graph-based representation, called attributed DAG (a-DAG), which captures both the interactions between objects and information on the objects themselves (e.g., characteristics or events). In this paper, we focus on pattern mining in such data. Our patterns, called weighted paths, offer a good trade-off between expressiveness and complexity. Frequency and compactness constraints are used to filter out uninteresting patterns. In the single-graph setting, these constraints lead to an exact condensed representation, i.e., one without loss of information. We propose a depth-first search strategy and an optimized data structure to discover weighted paths efficiently. The algorithm progressively extends patterns based on database projections. Relevance, scalability and genericity are illustrated by qualitative and quantitative results obtained when mining various real and synthetic datasets. In particular, we show how such an approach can be used to monitor soil erosion using remote sensing and geographical information system (GIS) data.
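The depth-first, projection-based extension mentioned in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' algorithm: patterns here are plain sequences of single attributes with no weights and no compactness constraint, and support is assumed to be the number of distinct start vertices of a pattern's occurrences, which keeps the frequency constraint anti-monotone under prefix extension. The names `mine_paths`, `attrs`, `edges` and `min_support` are hypothetical.

```python
from collections import defaultdict


def mine_paths(attrs, edges, min_support):
    """Depth-first mining of frequent attribute paths in one attributed DAG.

    attrs: dict mapping each vertex to its set of attributes.
    edges: iterable of (u, v) directed edges; the graph is assumed acyclic.
    Assumption: support = number of distinct start vertices of occurrences.
    Returns a dict mapping each frequent pattern (tuple) to its support.
    """
    succ = defaultdict(list)
    for u, v in edges:
        succ[u].append(v)

    results = {}

    def support(projection):
        # Count distinct start vertices among the (start, end) occurrences.
        return len({start for start, _ in projection})

    def extend(pattern, projection):
        # The projection stores only what is needed to grow the pattern:
        # where each occurrence starts and where it currently ends.
        results[pattern] = support(projection)
        candidates = defaultdict(set)
        for start, end in projection:
            for nxt in succ[end]:
                for a in attrs.get(nxt, ()):
                    candidates[a].add((start, nxt))
        for a in sorted(candidates):
            if support(candidates[a]) >= min_support:
                extend(pattern + (a,), candidates[a])

    # Length-1 patterns: every vertex carrying the attribute is an occurrence.
    initial = defaultdict(set)
    for v, items in attrs.items():
        for a in items:
            initial[a].add((v, v))
    for a in sorted(initial):
        if support(initial[a]) >= min_support:
            extend((a,), initial[a])
    return results


if __name__ == "__main__":
    attrs = {1: {"a"}, 2: {"a"}, 3: {"b"}, 4: {"b", "c"}, 5: {"c"}}
    edges = [(1, 3), (2, 4), (3, 5), (4, 5)]
    for pattern, sup in mine_paths(attrs, edges, min_support=2).items():
        print(" -> ".join(pattern), sup)
```

Each recursive call scans only the occurrences of the current pattern rather than the whole graph, which is the general idea behind projection-based pattern growth; because the graph is acyclic, the recursion always terminates.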
This research was supported by the Project FOSTER ANR-2010-COSI-012-01 funded by the French Ministry of Higher Education and Research.
Cite this article
Flouvat, F., Selmaoui-Folcher, N., Sanhes, J. et al. Mining evolutions of complex spatial objects using a single-attributed Directed Acyclic Graph. Knowl Inf Syst 62, 3931–3971 (2020). https://doi.org/10.1007/s10115-020-01478-9
DOI: https://doi.org/10.1007/s10115-020-01478-9