XML Clustering Based on Common Neighbor

  • Conference paper
Advanced Web and Network Technologies, and Applications (APWeb 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3842)

Abstract

Clustering XML documents is an important task, but it is difficult to select appropriate parameter values for clustering algorithms. By integrating outlier detection with clustering, the paper takes a new approach to analyzing XML documents by structural distance. After defining the XML tree distance, the paper proposes a new clustering algorithm that stops automatically by using the outlier information and needs only one parameter, whose appropriate value range can be determined during the outlier mining process. The algorithm is compared with other clustering algorithms on an XML dataset with differing structures and on other real-life datasets.
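The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general common-neighbor idea that the title points to, not the paper's method. It assumes a precomputed pairwise distance matrix (for example, an XML tree distance) and a single threshold parameter. The names `neighbor_sets`, `common_neighbor_counts`, `isolated_items`, and `theta`, and the rule "an item with no neighbor is a candidate outlier", are hypothetical choices made for illustration.

```python
from itertools import combinations


def neighbor_sets(dist, theta):
    """For each item i, return the set of other items within distance theta.

    `dist` is assumed to be a symmetric square matrix (list of lists);
    `theta` is a hypothetical single threshold parameter.
    """
    n = len(dist)
    return [
        {j for j in range(n) if j != i and dist[i][j] <= theta}
        for i in range(n)
    ]


def common_neighbor_counts(dist, theta):
    """Count, for every pair (i, j), how many neighbors the two items share."""
    nbrs = neighbor_sets(dist, theta)
    return {
        (i, j): len(nbrs[i] & nbrs[j])
        for i, j in combinations(range(len(dist)), 2)
    }


def isolated_items(dist, theta):
    """Items with no neighbor within theta (illustrative outlier rule only)."""
    return [i for i, s in enumerate(neighbor_sets(dist, theta)) if not s]


# Toy data: items 0-2 are mutually close, item 3 is far from everything.
toy = [
    [0.0, 0.1, 0.1, 0.9],
    [0.1, 0.0, 0.1, 0.9],
    [0.1, 0.1, 0.0, 0.9],
    [0.9, 0.9, 0.9, 0.0],
]
counts = common_neighbor_counts(toy, theta=0.2)
isolated = isolated_items(toy, theta=0.2)
```

In this toy setting, pairs that share many neighbors would be natural candidates for merging, and items left without neighbors would be set aside. How the paper actually uses outlier information to stop clustering is not described in the abstract.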

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lv, Ty., Zhang, Xz., Zuo, Wl., Wang, Zx. (2006). XML Clustering Based on Common Neighbor. In: Shen, H.T., Li, J., Li, M., Ni, J., Wang, W. (eds) Advanced Web and Network Technologies, and Applications. APWeb 2006. Lecture Notes in Computer Science, vol 3842. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11610496_18

  • DOI: https://doi.org/10.1007/11610496_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-31158-4

  • Online ISBN: 978-3-540-32435-5

  • eBook Packages: Computer Science, Computer Science (R0)
