Frontiers of Computer Science in China

, Volume 5, Issue 1, pp 119–125 | Cite as

A metric normalization of tree edit distance

Research Article

Abstract

Traditional normalized tree edit distances do not satisfy the triangle inequality. We present a metric normalization method for tree edit distance, which results in a new normalized tree edit distance fulfilling the triangle inequality, under the condition that the weight function is a metric over the set of elementary edit operations with all costs of insertions/deletions having the same weight. We prove that the new distance, in the range [0, 1], is a genuine metric as a simple function of the sizes of two ordered labeled trees and the tree edit distance between them, which can be directly computed through tree edit distance with the same complexity. Based on an efficient algorithm to represent digits as ordered labeled trees, we show that the normalized tree edit metric can provide slightly better results than other existing methods in handwritten digit recognition experiments using the approximating and eliminating search algorithm (AESA) algorithm.

Keywords

metric normalization tree edit distance triangle inequality approximating and eliminating search algorithm (AESA) 

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Zhang K, Shasha D, Wang J T L. Approximate tree matching in the presence of variable length don’t cares. Journal of Algorithms, 1994, 16(1): 33–66MathSciNetMATHCrossRefGoogle Scholar
  2. 2.
    Tai K C. The tree-to-tree correction problem. Journal of the ACM, 1979, 26(3): 422–433MathSciNetMATHCrossRefGoogle Scholar
  3. 3.
    Zhang K, Shasha D. Simple fast algorithms for the editing distance between trees and related problems. SIAM Journal on Computing, 1989, 18(6): 1245–1262MathSciNetMATHCrossRefGoogle Scholar
  4. 4.
    Kilpeläinen P, Mannila H. Ordered and unordered tree inclusion. SIAM Journal on Computing, 1995, 24(2): 340–356MathSciNetMATHCrossRefGoogle Scholar
  5. 5.
    Klein P, Tirthapura S, Sharvit D, Kimia B. A tree-edit-distance algorithm for comparing simple, closed shapes. In: Proceedings of 11th Annual ACM-SIAM Symposium on Discrete Algorithms. 2000, 696–704Google Scholar
  6. 6.
    Hoffmann C M, O’Donnell M J. Pattern matching in trees. Journal of the ACM, 1982, 29(1): 68–95MathSciNetMATHCrossRefGoogle Scholar
  7. 7.
    Ramesh R, Ramakrishnan I V. Nonlinear pattern matching in trees. Journal of the ACM, 1992, 39(2): 295–316MathSciNetMATHCrossRefGoogle Scholar
  8. 8.
    Bille P. A survey on tree edit distance and related problems. Theoretical Computer Science, 2005, 337(1–3): 217–239MathSciNetMATHCrossRefGoogle Scholar
  9. 9.
    Levenshtein A. Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady, 1966, 10(8): 707–710MathSciNetGoogle Scholar
  10. 10.
    Sellers P H. On the theory and computation of evolutionary distances. SIAM Journal of Applied Mathematics, 1974, 26(4): 787–793MathSciNetMATHCrossRefGoogle Scholar
  11. 11.
    Wagner R A, Fischer M J. The string-to-string correction problem. Journal of the ACM, 1974, 21(1): 168–173MathSciNetMATHCrossRefGoogle Scholar
  12. 12.
    Weigel A, Fein F. Normalizing the weighted edit distance. In: Proceedings of 12th International Conference on Pattern Recognition. 1994, 399–402Google Scholar
  13. 13.
    Yianilos P N. Normalized Forms for Two Common Metrics. Report 91-082-9027-1, revision 7. July 2002, NEC Research Inst. http://www.pnylab.com/pny/
  14. 14.
    Rico-Juan J R, Mico L. Comparison of AESA and LAESA search algorithms using string and tree-edit-distance. Pattern Recognition Letters, 2003, 24(9–10): 1417–1426MATHCrossRefGoogle Scholar
  15. 15.
    Yujian L, Bo L. A normalized Levenshtein distance metric. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(6): 1091–1095CrossRefGoogle Scholar
  16. 16.
    Marzal A, Vidal E. Computation of normalized edit distance and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1993, 15(9): 926–932CrossRefGoogle Scholar
  17. 17.
    Schroeder M, Schweimeier R. Argument and misunderstandings: fuzzy unification for negotiating agents. Electronic Notes in Theoretical Computer Science, 2002, 70(5): 1–19CrossRefGoogle Scholar
  18. 18.
    Vidal E. New formulation and improvements of the nearest-neighbour approximating and eliminating search algorithm (AESA). Pattern Recognition Letters, 1994, 15(1): 1–7MathSciNetCrossRefGoogle Scholar
  19. 19.
    Lecun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998, 86(11): 2278–2324CrossRefGoogle Scholar
  20. 20.
    Carrasco R C, Forcada ML. A note on the Nagen-draprasad-Wang-Gupta thining algorithm. Pattern Recognition Letters, 1995, 16(5): 539–541MATHCrossRefGoogle Scholar
  21. 21.
    Kong L B, Tang S W, Yang D Q, Wang T J, Gao J. Querying Techniques for XML Data. Journal of Software, 2007, 18(6): 1400–1418 (in Chinese)MATHCrossRefGoogle Scholar

Copyright information

© Higher Education Press and Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  1. 1.College of Computer Science and TechnologyBeijing University of TechnologyBeijingChina
  2. 2.College of Information Science and TechnologyHainan UniversityHaikouChina

Personalised recommendations