Abstract
In this paper, we present a FASST mining approach to extract frequently changing semantic structures (FASSTs), a subset of the semantic substructures that change frequently, from versions of unordered XML documents. We propose a data structure, H-DOM+, and a FASST mining algorithm that incorporates semantics and takes advantage of related domain knowledge. The distinctive feature of this approach is that the FASST mining process is guided by a user-defined concept hierarchy. Rather than mining all frequently changing structures, it extracts only those that are semantically meaningful. Our experimental results show that the H-DOM+ structure is compact and that the FASST algorithm is efficient and scales well. We also design a declarative FASST query language, FASSTQUEL, to make the FASST mining process interactive and flexible.
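To make the idea of concept-guided mining concrete, the following is a minimal illustrative sketch and not the H-DOM+ structure or the FASST algorithm from the paper. It assumes that each substructure can be identified by its tag path, that sibling tags are unique, that "change frequency" is the fraction of consecutive version pairs in which a subtree differs, and that the concept hierarchy can be flattened into a set of meaningful tags. The names `signature`, `subtree_signatures`, `frequently_changing`, `concept_tags`, and `min_freq` are hypothetical.

```python
import xml.etree.ElementTree as ET


def signature(elem):
    """Order-insensitive fingerprint of a subtree (child signatures are sorted)."""
    text = (elem.text or "").strip()
    return (elem.tag, text, tuple(sorted(signature(c) for c in elem)))


def subtree_signatures(root):
    """Map each tag path (e.g. 'shop/book') to the signature of its subtree.

    Assumption for brevity: sibling tags are unique, so a path names one node.
    """
    out = {}

    def walk(elem, prefix):
        path = f"{prefix}/{elem.tag}" if prefix else elem.tag
        out[path] = signature(elem)
        for child in elem:
            walk(child, path)

    walk(root, "")
    return out


def frequently_changing(versions, concept_tags, min_freq):
    """Return {path: change frequency} for subtrees that change often
    and whose tag belongs to the (flattened) concept hierarchy."""
    sigs = [subtree_signatures(ET.fromstring(v)) for v in versions]
    transitions = len(sigs) - 1
    counts = {}
    for old, new in zip(sigs, sigs[1:]):
        # A subtree counts as changed if it differs, appears, or disappears.
        for path in old.keys() | new.keys():
            if old.get(path) != new.get(path):
                counts[path] = counts.get(path, 0) + 1
    return {
        path: n / transitions
        for path, n in counts.items()
        if n / transitions >= min_freq
        and path.rsplit("/", 1)[-1] in concept_tags
    }


versions = [
    "<shop><book><price>10</price><title>A</title></book><banner>x</banner></shop>",
    "<shop><book><title>A</title><price>12</price></book><banner>y</banner></shop>",
    "<shop><book><price>15</price><title>A</title></book><banner>y</banner></shop>",
]
# Hypothetical concept hierarchy, flattened to the tags it covers.
concept_tags = {"book", "price"}
result = frequently_changing(versions, concept_tags, min_freq=0.6)
```

In this toy input, `shop` changes in every transition but is dropped because it is not in the concept set, and `banner` is dropped on both counts; only the `book` and `price` subtrees remain. Reordering `price` and `title` between versions does not count as a change, which reflects the unordered document model.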
Keywords
- Structure Dynamic
- Semantic Concept
- Version Dynamic
- Tree Representation
- Semantic Structure
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhao, Q., Bhowmick, S.S. (2005). FASST Mining: Discovering Frequently Changing Semantic Structure from Versions of Unordered XML Documents. In: Zhou, L., Ooi, B.C., Meng, X. (eds) Database Systems for Advanced Applications. DASFAA 2005. Lecture Notes in Computer Science, vol 3453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11408079_66
DOI: https://doi.org/10.1007/11408079_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25334-1
Online ISBN: 978-3-540-32005-0
eBook Packages: Computer Science, Computer Science (R0)
