Incremental Mining of Closed Frequent Subtrees

Nguyen, Viet Anh; Yamamoto, Akihiro

doi:10.1007/978-3-642-16184-1_25

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6332)

Abstract

We study the problem of mining closed frequent subtrees from tree databases that are updated regularly over time. Closed frequent subtrees provide condensed and complete information for all frequent subtrees in the database. Although mining closed frequent subtrees is in general faster than mining all frequent subtrees, it is still a very time-consuming process, and thus it is undesirable to mine from scratch when the change to the database is small. The set of previously mined closed subtrees should be reused as much as possible to compute newly emerging subtrees. In this paper, we propose a novel and efficient incremental mining algorithm for closed frequent labeled ordered trees. We adopt a divide-and-conquer strategy and apply different mining techniques in different parts of the mining process. The proposed algorithm requires no additional scan of the whole database, while its memory usage is reasonable. Our experimental study on both synthetic and real-life datasets demonstrates the efficiency and scalability of our algorithm.
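
To make the notion of a closed frequent subtree concrete, the following is a minimal brute-force sketch and not the incremental algorithm proposed in the paper. It assumes induced subtrees of labeled ordered trees (parent-child edges and left-to-right sibling order are preserved), a reading the abstract does not spell out. The helper names (`node`, `contains`, `support`, `closed_frequent`), the toy database, and the hand-listed candidate patterns are all hypothetical. A frequent subtree is treated as closed when no proper supertree among the candidates has the same support.

```python
def node(label, *children):
    """Build a labeled ordered tree as a hashable (label, children) pair."""
    return (label, tuple(children))


def embeds(pattern_children, tree_children):
    """True if the pattern's children match an ordered subsequence of the tree's children."""
    if not pattern_children:
        return True
    if not tree_children:
        return False
    if matches_at(pattern_children[0], tree_children[0]) and embeds(
        pattern_children[1:], tree_children[1:]
    ):
        return True
    # Skip the first tree child and try again.
    return embeds(pattern_children, tree_children[1:])


def matches_at(pattern, tree):
    """True if the pattern occurs with its root mapped onto the root of `tree`."""
    return pattern[0] == tree[0] and embeds(pattern[1], tree[1])


def contains(tree, pattern):
    """True if the pattern occurs as an induced ordered subtree anywhere in `tree`."""
    return matches_at(pattern, tree) or any(contains(c, pattern) for c in tree[1])


def support(pattern, database):
    """Number of database trees that contain the pattern."""
    return sum(1 for t in database if contains(t, pattern))


def closed_frequent(candidates, database, min_support):
    """Keep frequent candidates that have no proper supertree with equal support."""
    sup = {p: support(p, database) for p in candidates}
    frequent = [p for p in candidates if sup[p] >= min_support]
    return [
        p
        for p in frequent
        if not any(q != p and sup[q] == sup[p] and contains(q, p) for q in frequent)
    ]


# Toy database (assumption): two copies of A(B, C) and one A(B).
A, B, C = node("A"), node("B"), node("C")
AB, AC, ABC = node("A", B), node("A", C), node("A", B, C)
database = [ABC, ABC, AB]

# Hand-listed candidates (assumption): every induced subtree of the toy database.
candidates = [A, B, C, AB, AC, ABC]
closed = closed_frequent(candidates, database, min_support=2)
```

In this toy database the six frequent patterns collapse to two closed ones, `A(B)` with support 3 and `A(B, C)` with support 2, and the support of every other frequent pattern can be read off its smallest closed supertree. That is the sense in which the closed set is condensed yet complete. Recomputing all supports after each update, as this sketch would, is exactly the from-scratch cost an incremental method aims to avoid.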

Author information

Authors and Affiliations

Graduate School of Informatics, Kyoto University, Yoshida Honmachi, Sakyo-ku, Kyoto, 606-8501, Japan
Viet Anh Nguyen & Akihiro Yamamoto

Editor information

Editors and Affiliations

Department of Computer Science, University of Waikato, Hamilton, New Zealand
Bernhard Pfahringer
Department of Computer Science, The University of Waikato, Private Bag 3105, 3240, Hamilton, New Zealand
Geoff Holmes
School of Computer Science and Engineering, The University of New South Wales, 2052, Sydney, Australia
Achim Hoffmann

About this paper

Cite this paper

Nguyen, V.A., Yamamoto, A. (2010). Incremental Mining of Closed Frequent Subtrees. In: Pfahringer, B., Holmes, G., Hoffmann, A. (eds) Discovery Science. DS 2010. Lecture Notes in Computer Science, vol 6332. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16184-1_25

DOI: https://doi.org/10.1007/978-3-642-16184-1_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16183-4
Online ISBN: 978-3-642-16184-1
eBook Packages: Computer Science, Computer Science (R0)