TreeXP—An Instantiation of XPattern Framework

  • Conference paper
Data Science and Security

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 132)

Abstract

Most of the data generated by social media, the Internet of Things, and similar sources is semi-structured or unstructured. XML is a leading semi-structured data format that is commonly used across platforms. XML clustering is an active research area, but because of its complexity it remains a challenging problem in data analytics, especially when Big Data is considered. In this paper, we focus on clustering XML documents based on their structure. A novel method for representing XML documents, the Compressed Representation of XML Tree, is proposed following the concept of the frequent pattern tree structure. Clustering is then carried out on the proposed structure with a new algorithm, TreeXP, which follows the XPattern framework. The proposed representation and clustering algorithm are compared with the well-established PathXP algorithm and are found to give the same performance while requiring much less time.
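The abstract does not specify how the Compressed Representation of XML Tree is built, so the following is only a minimal sketch of the general idea it alludes to: an FP-tree-style prefix tree in which structurally identical path prefixes from many XML documents are stored once and annotated with a support count. All names here (`PrefixNode`, `root_to_leaf_paths`, `build_prefix_tree`) are hypothetical and are not taken from the paper, and the choice of root-to-leaf tag paths as the structural feature is an assumption.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class PrefixNode:
    """One tag in the shared prefix tree (hypothetical structure, not the paper's)."""
    tag: str
    count: int = 0  # number of documents that contain this path prefix
    children: dict = field(default_factory=dict)


def root_to_leaf_paths(xml_text):
    """Return the set of distinct root-to-leaf tag paths in one document."""
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, (root.tag,))]
    while stack:
        elem, path = stack.pop()
        kids = list(elem)
        if not kids:
            paths.add(path)  # leaf element: record the full tag path
        for kid in kids:
            stack.append((kid, path + (kid.tag,)))
    return paths


def build_prefix_tree(documents):
    """Merge the paths of all documents into one tree; shared prefixes are stored once."""
    top = PrefixNode(tag="")  # artificial root above all document roots
    for xml_text in documents:
        touched = set()  # ids of nodes already counted for this document
        for path in root_to_leaf_paths(xml_text):
            node = top
            for tag in path:
                node = node.children.setdefault(tag, PrefixNode(tag))
                if id(node) not in touched:
                    node.count += 1
                    touched.add(id(node))
    return top


docs = [
    "<a><b><c/></b><d/></a>",
    "<a><b><c/><c/></b></a>",
    "<x><b/></x>",
]
tree = build_prefix_tree(docs)
```

In this sketch the first two documents share the prefix `a/b/c`, so that branch is stored once with a count of 2, while `a/d` and `x/b` each have a count of 1. A pattern-based clustering step could then read frequent paths directly off the counts, although how TreeXP actually does this is not described in the abstract.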

Author information

Corresponding author

Correspondence to Thulasi Accottillam.

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Accottillam, T., Remya, K.T.V., Raju, G. (2021). TreeXP—An Instantiation of XPattern Framework. In: Jat, D.S., Shukla, S., Unal, A., Mishra, D.K. (eds) Data Science and Security. Lecture Notes in Networks and Systems, vol 132. Springer, Singapore. https://doi.org/10.1007/978-981-15-5309-7_7
