Frequent pattern mining in attributed trees: algorithms and applications

Pasquier, Claude; Sanhes, Jérémy; Flouvat, Frédéric; Selmaoui-Folcher, Nazha

doi:10.1007/s10115-015-0831-x

Regular Paper
Published: 28 March 2015

Volume 46, pages 491–514, (2016)
Knowledge and Information Systems

Claude Pasquier (1,2,3), Jérémy Sanhes (3), Frédéric Flouvat (3) & Nazha Selmaoui-Folcher (3)

Abstract

Frequent pattern mining is an important data mining task with a broad range of applications. Initially focused on the discovery of frequent itemsets, studies were later extended to mine structured forms such as sequences, trees or graphs. In this paper, we introduce a new domain of patterns, attributed trees (atrees), and a method to extract these patterns from a forest of atrees. Attributed trees are trees in which vertices are associated with itemsets. Mining this type of pattern (called asubtrees), which combines tree mining and itemset mining, requires the exploration of a huge search space. To make our approach scalable, we investigate the mining of condensed representations. For attributed trees, the classical concept of closure involves both itemset closure and structural closure. We present three algorithms for mining all patterns, closed patterns w.r.t. itemsets (content) and/or structure in attributed trees. We show that, for low support values, mining content-closed attributed trees is a good compromise between non-redundancy of solutions and execution time.
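
To make the notion of an attributed tree concrete, here is a minimal Python sketch. It is not the authors' algorithm. It assumes rooted, unordered trees, an induced (parent-child preserving) notion of inclusion, and a per-tree support count. The names `ANode`, `node`, `matches_at`, `contains` and `support` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass
class ANode:
    """A vertex of an attributed tree: an itemset plus child vertices."""
    items: FrozenSet[str]
    children: List["ANode"] = field(default_factory=list)


def node(items, *children):
    """Hypothetical helper: build a vertex from an iterable of items."""
    return ANode(frozenset(items), list(children))


def _assign(p_children, t_children, used):
    """Backtracking: map each pattern child to a distinct tree child."""
    if not p_children:
        return True
    first, rest = p_children[0], p_children[1:]
    return any(
        i not in used
        and matches_at(first, tc)
        and _assign(rest, t_children, used | {i})
        for i, tc in enumerate(t_children)
    )


def matches_at(p, t):
    """True if pattern p maps onto t with p's root placed on t's root.

    Assumption: each pattern itemset must be a subset of the itemset of
    the vertex it is mapped to, and parent-child edges are preserved.
    """
    if not p.items <= t.items:
        return False
    return _assign(p.children, t.children, frozenset())


def contains(p, t):
    """True if pattern p occurs at some vertex of tree t."""
    return matches_at(p, t) or any(contains(p, c) for c in t.children)


def support(p, forest):
    """Number of trees in the forest containing p (per-tree support)."""
    return sum(contains(p, t) for t in forest)


# A toy forest of three attributed trees; items are single characters.
forest = [
    node("abc", node("ab"), node("cd")),
    node("ac", node("a", node("cd"))),
    node("b", node("c")),
]

pattern = node("a", node("c"))       # {a} with one child {c}
richer = node("a", node("cd"))       # same shape, larger child itemset

print(support(pattern, forest))
print(support(richer, forest))
```

In this toy forest, `pattern` and `richer` occur in the same two trees. Under the usual definition of a closed pattern, `pattern` would therefore be redundant with respect to its content, which gives an intuition for why a condensed representation can shrink the result set. The exhaustive backtracking in `_assign` is only suitable for small examples.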

Author information

Authors and Affiliations

University of Nice Sophia Antipolis, I3S, UMR 7271, 06900, Sophia Antipolis, France
Claude Pasquier
CNRS, I3S, UMR 7271, 06900, Sophia Antipolis, France
Claude Pasquier
Pôle Pluridisciplinaire de la Matière et de l’Environnement (PPME), University of New Caledonia, 98851, Nouméa, New Caledonia
Claude Pasquier, Jérémy Sanhes, Frédéric Flouvat & Nazha Selmaoui-Folcher

Corresponding author

Correspondence to Claude Pasquier.

About this article

Cite this article

Pasquier, C., Sanhes, J., Flouvat, F. et al. Frequent pattern mining in attributed trees: algorithms and applications. Knowl Inf Syst 46, 491–514 (2016). https://doi.org/10.1007/s10115-015-0831-x

Received: 30 March 2014
Revised: 29 October 2014
Accepted: 17 March 2015
Published: 28 March 2015
Issue Date: March 2016
DOI: https://doi.org/10.1007/s10115-015-0831-x
