Algorithms for Finding a Most Similar Subforest
Given an ordered labeled forest F (“the target forest”) and an ordered labeled forest G (“the pattern forest”), the most similar subforest problem is to find a subforest F′ of F such that the distance between F′ and G is minimum over all possible F′. This problem generalizes several well-studied problems that have important applications in locating patterns in hierarchical structures such as RNA secondary structures and XML documents. In this paper, we present efficient algorithms for the most similar subforest problem under the forest edit distance for three types of subforests: simple substructures, sibling substructures, and closed subforests.
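To make the problem statement concrete, the following is a minimal brute-force sketch, not the algorithms of the paper. It assumes unit costs for deletion, insertion, and relabeling, represents a tree as a nested tuple `(label, children)`, and takes the candidate subforests to be contiguous runs of sibling subtrees. That candidate set is an assumption made for illustration only, since the three subforest types are not defined in this abstract. The names `forest_dist`, `sibling_runs`, and `most_similar` are hypothetical.

```python
from functools import lru_cache

# A tree is (label, children); a forest is a tuple of trees.
# Everything is a nested tuple so that forests are hashable.

@lru_cache(maxsize=None)
def forest_dist(f, g):
    """Unit-cost edit distance between two ordered labeled forests."""
    if not f and not g:
        return 0
    if not f:
        # Insert the rightmost root of g; its children take its place.
        _, kids = g[-1]
        return forest_dist(f, g[:-1] + kids) + 1
    if not g:
        # Delete the rightmost root of f; its children take its place.
        _, kids = f[-1]
        return forest_dist(f[:-1] + kids, g) + 1
    (f_label, f_kids), (g_label, g_kids) = f[-1], g[-1]
    return min(
        forest_dist(f[:-1] + f_kids, g) + 1,           # delete root in f
        forest_dist(f, g[:-1] + g_kids) + 1,           # insert root of g
        forest_dist(f_kids, g_kids)                    # map root to root
        + forest_dist(f[:-1], g[:-1])
        + (f_label != g_label),                        # relabel cost 0 or 1
    )

def sibling_runs(forest):
    """Yield every non-empty contiguous run of sibling subtrees (assumed candidate set)."""
    for i in range(len(forest)):
        for j in range(i + 1, len(forest) + 1):
            yield forest[i:j]
    for _, kids in forest:
        yield from sibling_runs(kids)

def most_similar(target, pattern):
    """Return (candidate, distance) minimizing the distance to the pattern."""
    best = min(sibling_runs(target), key=lambda c: forest_dist(c, pattern))
    return best, forest_dist(best, pattern)

# Target forest F: a single tree a(b, c, d). Pattern forest G: two leaves b, c.
F = (("a", (("b", ()), ("c", ()), ("d", ()))),)
G = (("b", ()), ("c", ()))
best, dist = most_similar(F, G)
```

The sketch enumerates every candidate and recomputes the distance for each one, which is only practical for very small forests. It serves to fix the definitions rather than to suggest an efficient method.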
Keywords: Label Tree; Edit Mapping; Pattern Forest; Combinatorial Pattern Match; Alignment Distance