Algorithms for Finding a Most Similar Subforest

Jansson, Jesper; Peng, Zeshan

doi:10.1007/11780441_34

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4009)

Included in the following conference series:

Annual Symposium on Combinatorial Pattern Matching

Abstract

Given an ordered labeled forest F (“the target forest”) and an ordered labeled forest G (“the pattern forest”), the most similar subforest problem is to find a subforest F′ of F such that the distance between F′ and G is minimum over all possible F′. This problem generalizes several well-studied problems which have important applications in locating patterns in hierarchical structures such as RNA molecules’ secondary structures and XML documents. In this paper, we present efficient algorithms for the most similar subforest problem with forest edit distance for three types of subforests: simple substructures, sibling substructures, and closed subforests.
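
The problem statement above can be illustrated with a small brute-force sketch. It is not the paper's algorithm: it assumes unit costs for insert, delete, and relabel, uses the standard rightmost-root recursion for forest edit distance, and searches only one illustrative candidate family (runs of consecutive sibling subtrees). The names `forest_edit_distance`, `sibling_runs`, and `most_similar_sibling_run` are hypothetical, and the candidate family is an assumption that need not coincide with the paper's definitions of simple substructures, sibling substructures, or closed subforests.

```python
from functools import lru_cache

# Representation (an assumption for this sketch):
# a tree is (label, children), children is a tuple of trees,
# and a forest is a tuple of trees in left-to-right order.


def size(forest):
    """Number of nodes in a forest."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_edit_distance(f, g):
    """Unit-cost edit distance between two ordered labeled forests."""
    if not f:
        return size(g)  # insert every node of g
    if not g:
        return size(f)  # delete every node of f
    (f_label, f_children) = f[-1]  # rightmost root of f
    (g_label, g_children) = g[-1]  # rightmost root of g
    # Delete the rightmost root of f; its children take its place.
    delete = forest_edit_distance(f[:-1] + f_children, g) + 1
    # Insert the rightmost root of g; its children take its place.
    insert = forest_edit_distance(f, g[:-1] + g_children) + 1
    # Match the two rightmost roots, relabeling if the labels differ.
    match = (forest_edit_distance(f_children, g_children)
             + forest_edit_distance(f[:-1], g[:-1])
             + (0 if f_label == g_label else 1))
    return min(delete, insert, match)


def sibling_runs(forest):
    """Yield every non-empty run of consecutive sibling subtrees."""
    for i in range(len(forest)):
        for j in range(i + 1, len(forest) + 1):
            yield forest[i:j]
    for _, children in forest:
        yield from sibling_runs(children)


def most_similar_sibling_run(target, pattern):
    """Return (distance, candidate) minimizing the distance to pattern.

    Assumes target contains at least one node.
    """
    return min(
        ((forest_edit_distance(c, pattern), c) for c in sibling_runs(target)),
        key=lambda pair: pair[0],
    )


# Usage: the target is one tree a(b, c, d); the pattern is the forest b, c.
target = (("a", (("b", ()), ("c", ()), ("d", ()))),)
pattern = (("b", ()), ("c", ()))
dist, best = most_similar_sibling_run(target, pattern)
print(dist, best)
```

In this usage example the sibling run b, c occurs verbatim in the target, so the minimum distance is 0. The exhaustive search is only practical for very small inputs; it is meant to make the objective concrete, not to reflect the running times the paper achieves.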

Author information

Authors and Affiliations

Department of Computer Science and Communication Engineering, Kyushu University, 6-10-1 Hakozaki, Higashi-ku, Fukuoka, 812-8581, Japan
Jesper Jansson
Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong
Zeshan Peng

Editor information

Editors and Affiliations

Department of Computer Science, Bar-Ilan University, 52900, Ramat-Gan, Israel
Moshe Lewenstein
Department of Software, Technical University of Catalonia, 08034, Barcelona, Spain
Gabriel Valiente

About this paper

Cite this paper

Jansson, J., Peng, Z. (2006). Algorithms for Finding a Most Similar Subforest. In: Lewenstein, M., Valiente, G. (eds) Combinatorial Pattern Matching. CPM 2006. Lecture Notes in Computer Science, vol 4009. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11780441_34

DOI: https://doi.org/10.1007/11780441_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35455-0
Online ISBN: 978-3-540-35461-1
eBook Packages: Computer Science, Computer Science (R0)