Sibling Distance for Rooted Labeled Trees

  • Taku Aratsu
  • Kouichi Hirata
  • Tetsuji Kuboyama
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5433)

Abstract

In this paper, we introduce a sibling distance $\delta_s$ for rooted labeled trees as an $L_1$-distance between their sibling histograms, which consist of the frequencies of every pair of the label of a node and the sequence of labels of its children. Then, we show that $\delta_s$ gives a constant-factor lower bound on the tree edit distance $\delta$, namely $\delta_s(T_1, T_2) \le 4\,\delta(T_1, T_2)$. Next, we design an algorithm to compute the sibling histogram in $O(n)$ time for ordered trees and in $O(gn)$ time for unordered trees, where $n$ and $g$ are the number of nodes and the degree of a tree, respectively. Finally, we give experimental results obtained by applying the sibling distance to glycan data.
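
As a rough illustration of the definition in the abstract, the following Python sketch builds a histogram keyed by (node label, sequence of child labels) and takes the L1 distance between two such histograms. It is a literal reading of the abstract only, not the authors' algorithm: the tree encoding as `(label, [children])` tuples and the names `sibling_histogram` and `sibling_distance` are assumptions made for this example, and the unordered case is handled here by simply sorting the child labels, which does not reproduce the O(gn) procedure mentioned above.

```python
from collections import Counter

# Assumed encoding (not from the paper): a tree is (label, [child, child, ...]).

def sibling_histogram(tree, ordered=True):
    """Count (node label, tuple of child labels) pairs over all nodes."""
    hist = Counter()
    stack = [tree]  # iterative traversal; each node is visited once
    while stack:
        label, children = stack.pop()
        child_labels = [child[0] for child in children]
        if not ordered:
            # Assumption: for unordered trees, treat the children as a multiset.
            child_labels.sort()
        hist[(label, tuple(child_labels))] += 1
        stack.extend(children)
    return hist

def sibling_distance(t1, t2, ordered=True):
    """L1 distance between the two sibling histograms."""
    h1 = sibling_histogram(t1, ordered)
    h2 = sibling_histogram(t2, ordered)
    return sum(abs(h1[key] - h2[key]) for key in h1.keys() | h2.keys())

# t2 differs from t1 by relabeling one leaf (c -> d), a single edit operation.
t1 = ("a", [("b", []), ("c", [])])
t2 = ("a", [("b", []), ("d", [])])
# t3 is t1 with its two children swapped.
t3 = ("a", [("c", []), ("b", [])])

print(sibling_distance(t1, t2))                 # 4
print(sibling_distance(t1, t3))                 # 2
print(sibling_distance(t1, t3, ordered=False))  # 0
```

In the first comparison a single relabeling changes both the leaf's own entry and its parent's entry, giving a distance of 4. Under this reading and unit edit costs, that is consistent with the stated bound of 4 times the edit distance.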

Keywords

Similarity Measure, Space Complexity, Edit Distance, Edit Operation, Label Tree
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Taku Aratsu (1)
  • Kouichi Hirata (2)
  • Tetsuji Kuboyama (3)

  1. Graduate School of Computer Science and Systems Engineering, Japan
  2. Department of Artificial Intelligence, Kyushu Institute of Technology, Iizuka, Japan
  3. Computer Center, Gakushuin University, Tokyo, Japan
