Discovering frequent subtrees from XML data using neural networks

Wei, Sun; Da-xin, Liu; Tong, Wang

doi:10.1007/BF02831715

Web Data Management Information Integration
Published: January 2006

Volume 11, pages 117–121, (2006)
Wuhan University Journal of Natural Sciences

Sun Wei¹, Liu Da-xin¹ & Wang Tong¹

Abstract

With the rapid progress of network and storage technologies, a huge amount of electronic data, such as Web pages and XML documents, has become available on the Internet. In this paper, we study the data-mining problem of discovering frequent ordered subtrees in a large collection of XML data, where both the patterns and the data are modeled as labeled ordered trees. We present an efficient algorithm, Ordered Subtree Miner (OSTMiner), based on two-layer neural networks with the Hebb rule, that computes all ordered subtrees appearing in a collection of XML trees with frequency above a user-specified threshold, using a special structure called the EM-tree. In this algorithm, the EM-tree serves as an extended merging tree that supplies scheme information for efficient pruning and mining of frequent subtrees. Experimental results show that OSTMiner has good response time and scales well.
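
To make the mining task concrete, the following is a minimal sketch of counting frequent ordered subtrees in a small XML collection. It is not OSTMiner: it uses no neural network, Hebb rule, or EM-tree, and it only considers complete (bottom-up) subtrees rooted at each node, a restricted pattern class chosen for brevity. The names `encode`, `frequent_bottom_up_subtrees`, and `min_count`, as well as the sample documents, are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def encode(node):
    """Serialize the complete subtree rooted at node, preserving child order."""
    children = ",".join(encode(child) for child in node)
    return f"{node.tag}({children})" if children else node.tag


def frequent_bottom_up_subtrees(xml_docs, min_count):
    """Return {encoded subtree: document count} for every complete subtree
    that occurs in at least min_count documents (hypothetical threshold)."""
    counts = Counter()
    for doc in xml_docs:
        root = ET.fromstring(doc)
        # Use a set so each pattern is counted at most once per document.
        seen = {encode(node) for node in root.iter()}
        counts.update(seen)
    return {pattern: c for pattern, c in counts.items() if c >= min_count}


# Illustrative documents, not taken from the paper.
docs = [
    "<book><title/><author/></book>",
    "<book><title/><author/><year/></book>",
    "<lib><book><title/><author/></book></lib>",
]
result = frequent_bottom_up_subtrees(docs, min_count=2)
print(result)
```

Because child order is kept in the encoding, `book(title,author)` and `book(author,title)` count as different patterns, which matches the ordered-tree setting described in the abstract. Only element labels are compared here; text content and attributes are ignored.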

Author information

Authors and Affiliations

College of Computer Science and Technology, Harbin Engineering University, 150001, Harbin, Heilongjiang, China
Sun Wei, Liu Da-xin & Wang Tong

Corresponding author

Correspondence to Liu Da-xin.

Additional information

Foundation item: Supported by the Key Science-Technology Project of Heilongjiang Province (GA010401-3)

Biography: SUN Wei (1978-), male, Ph.D. candidate, research direction: XML database, data mining and information management.

About this article

Cite this article

Wei, S., Da-xin, L. & Tong, W. Discovering frequent subtrees from XML data using neural networks. Wuhan Univ. J. Nat. Sci. 11, 117–121 (2006). https://doi.org/10.1007/BF02831715

Received: 28 May 2005
Issue Date: January 2006
DOI: https://doi.org/10.1007/BF02831715

Key words

CLC number

TP 311.13
