XML Filtering Using Dynamic Hierarchical Clustering of User Profiles

  • Conference paper
Database and Expert Systems Applications (DEXA 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5181)

Abstract

Information filtering systems are a critical component of modern information-seeking applications. As the number of users and the volume of available information grow, scalable and efficient representation and filtering techniques become crucial. In this paper we propose a novel XML filtering system that clusters user profiles in order to reduce the filtering space and achieve sub-linear filtering time. The proposed system employs a unique sequence representation for user profiles and XML documents, based on the depth-first traversal of the XML tree, together with an appropriate distance metric that is used to compare and cluster the user profiles and to filter incoming XML documents. Experimental results show that the proposed system outperforms previous approaches to XML filtering and achieves sub-linear filtering time.
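
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the distance metric, the clustering algorithm, or the matching rule. The sketch assumes a Levenshtein edit distance over tag sequences, a single level of clusters that each carry a representative sequence, and a fixed distance threshold. The names `dfs_sequence`, `edit_distance`, `filter_document`, and `threshold` are hypothetical.

```python
import xml.etree.ElementTree as ET


def dfs_sequence(xml_text):
    """Return the element tags of an XML document in depth-first order."""
    root = ET.fromstring(xml_text)
    # Element.iter() walks the tree in document (depth-first, pre-order) order.
    return [element.tag for element in root.iter()]


def edit_distance(a, b):
    """Levenshtein distance between two tag sequences (assumed metric)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # delete x
                            curr[j - 1] + 1,          # insert y
                            prev[j - 1] + (x != y)))  # substitute or match
        prev = curr
    return prev[-1]


def filter_document(doc_seq, clusters, threshold):
    """Return the names of profiles within `threshold` of the document.

    `clusters` is a list of (representative_sequence, profiles) pairs, where
    `profiles` is a list of (name, sequence) pairs. A cluster is only opened
    when its representative is close enough to the document (assumed rule).
    """
    matches = []
    for representative, profiles in clusters:
        if edit_distance(doc_seq, representative) > threshold:
            continue  # prune the whole cluster without touching its profiles
        for name, seq in profiles:
            if edit_distance(doc_seq, seq) <= threshold:
                matches.append(name)
    return matches


# Hypothetical profiles, already grouped into two clusters.
clusters = [
    (["book", "title", "author"],
     [("p1", ["book", "title", "author", "name"]),
      ("p2", ["book", "author", "name"])]),
    (["play", "act", "scene"],
     [("p3", ["play", "act", "scene", "speech"])]),
]

doc = "<book><title/><author><name/></author></book>"
print(filter_document(dfs_sequence(doc), clusters, threshold=1))
```

In this sketch the document is compared against two representatives and only the two profiles of the first cluster; the second cluster is skipped entirely. Pruning of this kind is what would allow filtering cost to grow more slowly than the number of profiles, provided the clusters are well separated.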

Editor information

Sourav S. Bhowmick Josef Küng Roland Wagner

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Antonellis, P., Makris, C. (2008). XML Filtering Using Dynamic Hierarchical Clustering of User Profiles. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2008. Lecture Notes in Computer Science, vol 5181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85654-2_46

  • DOI: https://doi.org/10.1007/978-3-540-85654-2_46

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85653-5

  • Online ISBN: 978-3-540-85654-2

  • eBook Packages: Computer Science, Computer Science (R0)
