Abstract
XML is becoming increasingly popular as a language for representing many types of electronic documents. Because XML describes document structure strictly, a relatively new task has emerged: mining documents based on structural and/or content information. In this paper we investigate (1) the suitability of new unsupervised machine learning methods for clustering XML documents, and (2) the importance of contextual information for the same task. These tasks are part of an international competition on XML clustering and categorization (INEX 2006). We show that the proposed approaches are a suitable tool for clustering structured data, as they yielded the best results in the INEX 2006 competition on clustering of XML data.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kc, M., Hagenbuchner, M., Tsoi, A.C., Scarselli, F., Sperduti, A., Gori, M. (2007). XML Document Mining Using Contextual Self-organizing Maps for Structures. In: Fuhr, N., Lalmas, M., Trotman, A. (eds) Comparative Evaluation of XML Information Retrieval Systems. INEX 2006. Lecture Notes in Computer Science, vol 4518. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73888-6_47
DOI: https://doi.org/10.1007/978-3-540-73888-6_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73887-9
Online ISBN: 978-3-540-73888-6
eBook Packages: Computer Science (R0)