Top-Down Tree Edit-Distance of Regular Tree Languages

Ko, Sang-Ki; Han, Yo-Sub; Salomaa, Kai

doi:10.1007/978-3-319-04921-2_38

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8370)

Included in the following conference series:

International Conference on Language and Automata Theory and Applications

Abstract

We study the edit-distance of regular tree languages. The edit-distance is a metric for measuring the similarity or dissimilarity between two objects, and a regular tree language is a set of trees accepted by a finite-state tree automaton or described by a regular tree grammar. Given two regular tree languages L and R, we define the edit-distance d(L,R) between L and R to be the minimum edit-distance between a tree t₁ ∈ L and a tree t₂ ∈ R. Based on tree automata for L and R, we present a polynomial-time algorithm that computes d(L,R). We also suggest how to use the edit-distance between two tree languages for identifying a special common string between two context-free grammars.
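
The definition of d(L,R) as a minimum over pairs of trees can be illustrated with a small sketch. It is not the algorithm of the paper, which works on tree automata and handles infinite languages; it is a brute-force illustration that assumes L and R are given as finite lists of trees. It also assumes unit costs and a Selkow-style top-down distance, in which roots are matched to each other and whole subtrees are inserted or deleted. The names `size`, `top_down_distance` and `language_distance`, and the nested-tuple tree encoding, are hypothetical choices made for this example.

```python
from functools import lru_cache
from itertools import product

# Assumed encoding: a tree is a nested tuple (label, (child, child, ...)).

def size(t):
    """Number of nodes in tree t."""
    _, children = t
    return 1 + sum(size(c) for c in children)

@lru_cache(maxsize=None)
def top_down_distance(t1, t2):
    """Unit-cost top-down edit distance between two ordered trees.

    Roots are always matched (relabelled if they differ). The child
    sequences are then aligned like strings, where deleting or inserting
    a child costs the size of its whole subtree and matching two
    children costs their recursive distance.
    """
    (a, kids1), (b, kids2) = t1, t2
    m, n = len(kids1), len(kids2)
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        D[i][0] = D[i - 1][0] + size(kids1[i - 1])
    for j in range(1, n + 1):
        D[0][j] = D[0][j - 1] + size(kids2[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            D[i][j] = min(
                D[i - 1][j] + size(kids1[i - 1]),          # delete subtree
                D[i][j - 1] + size(kids2[j - 1]),          # insert subtree
                D[i - 1][j - 1]
                + top_down_distance(kids1[i - 1], kids2[j - 1]),  # match
            )
    return (0 if a == b else 1) + D[m][n]

def language_distance(L, R):
    """Minimum distance over all pairs (t1, t2) with t1 in L, t2 in R.

    Brute force over finite collections only; an assumption of this sketch.
    """
    return min(top_down_distance(t1, t2) for t1, t2 in product(L, R))
```

For example, with `t1 = ("a", (("b", ()), ("c", ())))` and `t2 = ("a", (("b", ()),))`, the only edit needed is deleting the leaf `c`, so `language_distance([t1], [t2])` is 1.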

Ko and Han were supported by the Basic Science Research Program through NRF funded by MEST (2012R1A1A2044562), and Salomaa was supported by the Natural Sciences and Engineering Research Council of Canada Grant OGP0147224.

Author information

Authors and Affiliations

Department of Computer Science, Yonsei University, 50, Yonsei-Ro, Seodaemun-Gu, Seoul, 120-749, Republic of Korea
Sang-Ki Ko & Yo-Sub Han
School of Computing, Queen’s University, Kingston, Ontario, K7L 3N6, Canada
Kai Salomaa

Editor information

Editors and Affiliations

Research Group on Mathematical Linguistics, Rovira i Virgili University, Avinguda Catalunya, 35, 43002, Tarragona, Spain
Adrian-Horia Dediu & Carlos Martín-Vide
School of Computer Science, Department of Software Engineering and Artificial Intelligence, Complutense University of Madrid, Professor José Garcia Santesmases, 9, 28040, Madrid, Spain
José-Luis Sierra-Rodríguez
Fakultät für Informatik, Institut für Wissens- und Sprachverarbeitung, Otto-von-Guericke-Universität Magdeburg, Universitätsplatz 2, 39106, Magdeburg, Germany
Bianca Truthe

Cite this paper

Ko, SK., Han, YS., Salomaa, K. (2014). Top-Down Tree Edit-Distance of Regular Tree Languages. In: Dediu, AH., Martín-Vide, C., Sierra-Rodríguez, JL., Truthe, B. (eds) Language and Automata Theory and Applications. LATA 2014. Lecture Notes in Computer Science, vol 8370. Springer, Cham. https://doi.org/10.1007/978-3-319-04921-2_38

DOI: https://doi.org/10.1007/978-3-319-04921-2_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-04920-5
Online ISBN: 978-3-319-04921-2
eBook Packages: Computer Science (R0)