Counting, Generating and Sampling Tree Alignments

  • Cedric Chauve
  • Julien Courtiel
  • Yann Ponty
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9702)


Pairwise ordered tree alignments are combinatorial objects that appear in RNA secondary structure comparison. However, the usual representation of tree alignments as supertrees is ambiguous, i.e. two distinct supertrees may induce identical sets of matches between identical pairs of trees. This ambiguity is uninformative, and detrimental to any probabilistic analysis. In this work, we consider tree alignments up to equivalence. Our first result is a precise asymptotic enumeration of tree alignments, obtained from a context-free grammar by means of basic analytic combinatorics. Our second result focuses on alignments between two given ordered trees. By refining our grammar to align specific trees, we obtain a decomposition scheme for the space of alignments, and use it to design an efficient dynamic programming algorithm for sampling alignments under the Gibbs-Boltzmann probability distribution. This generalizes existing tree alignment algorithms, and opens the door to a probabilistic analysis of the space of suboptimal RNA secondary structure alignments.
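
To make the sampling idea concrete, the following is a minimal sketch on a deliberately simpler problem: aligning two sequences rather than two ordered trees. It is not the grammar or algorithm of the paper. An alignment is identified with its set of matches, so each equivalence class is counted once; the tables `Z` and `M`, the functions `partition_tables` and `sample_matches`, the `score` callback, and the choice of giving unmatched positions weight 1 are all assumptions made for illustration. The tables accumulate Boltzmann weights through an unambiguous decomposition, and a stochastic backtrack then draws a match set with probability proportional to its weight.

```python
import math
import random

def partition_tables(a, b, score, kT=1.0):
    """Fill the dynamic programming tables for sequences a and b.

    Z[i][j]: total Boltzmann weight of all match sets between a[:i] and b[:j].
    M[i][j]: same total, restricted to match sets in which b[j-1] is matched.
    A match (x, y) contributes a factor exp(score(x, y) / kT); unmatched
    positions contribute a factor 1 (an assumption of this sketch).
    """
    n, m = len(a), len(b)
    Z = [[1.0] * (m + 1) for _ in range(n + 1)]  # empty prefix: only the empty match set
    M = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            w = math.exp(score(a[i - 1], b[j - 1]) / kT)
            # b[j-1] is matched either to a[i-1], or to an earlier position of a.
            M[i][j] = w * Z[i - 1][j - 1] + M[i - 1][j]
            # b[j-1] is either unmatched or matched; the two cases are disjoint.
            Z[i][j] = Z[i][j - 1] + M[i][j]
    return Z, M

def sample_matches(a, b, score, kT=1.0, rng=random):
    """Draw one match set with probability proportional to its Boltzmann weight."""
    Z, M = partition_tables(a, b, score, kT)
    i, j = len(a), len(b)
    matches = []
    while i > 0 and j > 0:
        # b[j-1] is left unmatched with probability Z[i][j-1] / Z[i][j].
        if rng.random() * Z[i][j] < Z[i][j - 1]:
            j -= 1
            continue
        # Otherwise b[j-1] is matched: scan a from right to left for its partner.
        while True:
            w = math.exp(score(a[i - 1], b[j - 1]) / kT)
            if i == 1 or rng.random() * M[i][j] < w * Z[i - 1][j - 1]:
                matches.append((i - 1, j - 1))
                i, j = i - 1, j - 1
                break
            i -= 1  # a[i-1] stays unmatched; the partner of b[j-1] lies further left
    return matches[::-1]

if __name__ == "__main__":
    # Hypothetical scoring: reward identical symbols, penalize mismatches.
    toy_score = lambda x, y: 1.0 if x == y else -1.0
    print(sample_matches("GACU", "GCU", toy_score, kT=0.5))
```

With a constant score of 0, every match set has weight 1, so `Z[n][m]` simply counts the match sets between sequences of lengths n and m, and the sampler is uniform over them. Lowering `kT` concentrates the samples on high-scoring match sets.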


Tree alignment, RNA secondary structure, Dynamic programming



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Department of Mathematics, Simon Fraser University, Burnaby, Canada
  2. Pacific Institute for the Mathematical Sciences, Vancouver, Canada
  3. CNRS-LIX, Ecole Polytechnique, Palaiseau, France
