A Tree Distance Function Based on Multi-sets

  • Arnoldo José Müller-Molina
  • Kouichi Hirata
  • Takeshi Shinohara
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5433)

Abstract

We introduce a tree distance function based on multi-sets. We show that this function is a metric on tree spaces, and we design an algorithm to compute the distance between trees of size at most n in O(n²) time and O(n) space. In contrast to other tree distance functions, which require expensive memory allocations to maintain dynamic programming tables of forests, our function can be implemented over simple, static structures. Additionally, we present a case study in which we compare our function with two other distance functions.
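To make the idea of a multi-set based tree distance concrete, the following is a minimal sketch and not the function defined in the paper. It assumes ordered labeled trees encoded as nested tuples, and the names subtree_multiset and multiset_distance are hypothetical. The distance is taken to be the size of the symmetric difference between the multi-sets of complete subtrees of the two trees. Under that assumption it is zero only for identical trees, it is symmetric, and it satisfies the triangle inequality, so it is a metric. The sketch makes no attempt to reach the O(n²) time and O(n) space bounds stated above.

```python
from collections import Counter
from typing import Tuple

# Assumed encoding (not from the paper): a tree is (label, children),
# where children is a tuple of trees in left-to-right order.
Tree = Tuple[str, tuple]


def subtree_multiset(tree: Tree) -> Counter:
    """Return the multi-set of all complete subtrees of `tree`."""
    bag: Counter = Counter()
    stack = [tree]
    while stack:
        node = stack.pop()
        bag[node] += 1          # nested tuples are hashable, so a subtree is its own key
        stack.extend(node[1])   # visit every child subtree
    return bag


def multiset_distance(t1: Tree, t2: Tree) -> int:
    """Size of the multi-set symmetric difference of the two subtree bags."""
    a, b = subtree_multiset(t1), subtree_multiset(t2)
    # Counter subtraction keeps only positive counts, so (a - b) + (b - a)
    # is the multi-set symmetric difference.
    return sum(((a - b) + (b - a)).values())


if __name__ == "__main__":
    t1 = ("a", (("b", ()), ("c", ())))
    t2 = ("a", (("b", ()), ("d", ())))
    print(multiset_distance(t1, t2))
```

Because only two flat bags are compared, no dynamic programming table over forests is needed, which is the kind of simple, static structure the abstract refers to.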

Keywords

Tree edit distance; Program matching; Triangle inequality; Metric

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Arnoldo José Müller-Molina (1)
  • Kouichi Hirata (1)
  • Takeshi Shinohara (1)
  1. Department of Artificial Intelligence, Kyushu Institute of Technology, Iizuka, Japan