Abstract
Information filtering systems are a critical component of modern information-seeking applications. As the number of users grows and the volume of available information increases, scalable and efficient representation and filtering techniques become essential. In this paper we propose an XML filtering system that clusters user profiles in order to reduce the filtering space and achieve sub-linear filtering time. The proposed system employs a unique sequence representation for user profiles and XML documents, based on a depth-first traversal of the XML tree, together with an appropriate distance metric that is used to compare and cluster the user profiles and to filter incoming XML documents. Experimental results show that the proposed system outperforms previous approaches to XML filtering and achieves sub-linear filtering time.
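To make the idea concrete, the following is a minimal sketch of sequence-based matching. The abstract does not specify the exact sequence encoding, the distance metric, or the clustering procedure, so everything here is an assumption for illustration: the sequence is taken to be the list of element tags in pre-order, the metric is a plain Levenshtein distance over those tags, and `dfs_sequence`, `edit_distance`, and `nearest_cluster` are hypothetical names, not the system's own.

```python
import xml.etree.ElementTree as ET


def dfs_sequence(xml_text):
    """Return the element tags of an XML string in depth-first (pre-order) order.

    Assumption: the tag sequence stands in for the paper's representation.
    """
    root = ET.fromstring(xml_text)
    # Element.iter() visits the element and its descendants in document order.
    return [elem.tag for elem in root.iter()]


def edit_distance(a, b):
    """Levenshtein distance between two tag sequences (stand-in metric)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def nearest_cluster(doc_seq, representatives):
    """Return the index of the cluster representative closest to the document."""
    return min(range(len(representatives)),
               key=lambda k: edit_distance(doc_seq, representatives[k]))


# Two hypothetical cluster representatives and one incoming document.
reps = [
    dfs_sequence("<book><title/><author/></book>"),
    dfs_sequence("<article><title/><journal/></article>"),
]
doc = dfs_sequence("<book><title/><author/><year/></book>")
best = nearest_cluster(doc, reps)
```

In this sketch an incoming document is compared only against one representative per cluster rather than against every profile, which is the intuition behind reducing the filtering space; a hierarchy of clusters would repeat the same comparison level by level.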
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Antonellis, P., Makris, C. (2008). XML Filtering Using Dynamic Hierarchical Clustering of User Profiles. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2008. Lecture Notes in Computer Science, vol 5181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85654-2_46
DOI: https://doi.org/10.1007/978-3-540-85654-2_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85653-5
Online ISBN: 978-3-540-85654-2
eBook Packages: Computer Science, Computer Science (R0)