Top-Down Tree Edit-Distance of Regular Tree Languages

  • Conference paper
Language and Automata Theory and Applications (LATA 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8370)

Abstract

We study the edit-distance of regular tree languages. The edit-distance is a metric for measuring the similarity or dissimilarity between two objects, and a regular tree language is a set of trees accepted by a finite-state tree automaton or described by a regular tree grammar. Given two regular tree languages L and R, we define the edit-distance d(L,R) between L and R to be the minimum edit-distance between a tree t₁ ∈ L and a tree t₂ ∈ R. Based on tree automata for L and R, we present a polynomial-time algorithm that computes d(L,R). We also suggest how to use the edit-distance between two tree languages for identifying a special common string between two context-free grammars.
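
The definition of d(L,R) can be illustrated with a small brute-force sketch. This is not the automaton-based algorithm of the paper: it assumes finite sets of trees, unit edit costs, and a Selkow-style top-down distance in which insertions and deletions apply only to whole subtrees below an already matched node. The names `size`, `top_down_distance`, and `language_distance` are hypothetical and chosen here for illustration only.

```python
from functools import lru_cache
from itertools import product

# A tree is a nested tuple (label, (child, child, ...)), so it is hashable.


def size(tree):
    """Number of nodes in the tree (unit cost of inserting or deleting it)."""
    return 1 + sum(size(child) for child in tree[1])


@lru_cache(maxsize=None)
def top_down_distance(a, b):
    """Assumed Selkow-style top-down distance with unit costs.

    The roots are always matched (relabel cost 0 or 1); the child
    sequences are then aligned like strings, where deleting or inserting
    a child costs the size of its whole subtree.
    """
    relabel = 0 if a[0] == b[0] else 1
    xs, ys = a[1], b[1]
    m, n = len(xs), len(ys)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = dp[i - 1][0] + size(xs[i - 1])
    for j in range(1, n + 1):
        dp[0][j] = dp[0][j - 1] + size(ys[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dp[i][j] = min(
                dp[i - 1][j] + size(xs[i - 1]),          # delete subtree
                dp[i][j - 1] + size(ys[j - 1]),          # insert subtree
                dp[i - 1][j - 1]
                + top_down_distance(xs[i - 1], ys[j - 1]),  # match children
            )
    return relabel + dp[m][n]


def language_distance(L, R):
    """d(L, R) = min over t1 in L, t2 in R, for FINITE sets only."""
    return min(top_down_distance(t1, t2) for t1, t2 in product(L, R))


def leaf(label):
    return (label, ())


# Illustrative finite "languages": L = {a(b, c), a(b)}, R = {a(b, d, e), f(b)}
L = [("a", (leaf("b"), leaf("c"))), ("a", (leaf("b"),))]
R = [("a", (leaf("b"), leaf("d"), leaf("e"))), ("f", (leaf("b"),))]

print(language_distance(L, R))  # 1
```

In this toy input the minimum is attained by the pair a(b) and f(b), which differ only in the root label. Regular tree languages are in general infinite, so enumeration of this kind does not apply to them; the sketch only makes the minimum in the definition concrete.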

Ko and Han were supported by the Basic Science Research Program through NRF funded by MEST (2012R1A1A2044562), and Salomaa was supported by the Natural Sciences and Engineering Research Council of Canada Grant OGP0147224.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Ko, SK., Han, YS., Salomaa, K. (2014). Top-Down Tree Edit-Distance of Regular Tree Languages. In: Dediu, AH., Martín-Vide, C., Sierra-Rodríguez, JL., Truthe, B. (eds) Language and Automata Theory and Applications. LATA 2014. Lecture Notes in Computer Science, vol 8370. Springer, Cham. https://doi.org/10.1007/978-3-319-04921-2_38

  • DOI: https://doi.org/10.1007/978-3-319-04921-2_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-04920-5

  • Online ISBN: 978-3-319-04921-2

  • eBook Packages: Computer Science (R0)
