Normalized Similarity of RNA Sequences

  • Rolf Backofen
  • Danny Hermelin
  • Gad M. Landau
  • Oren Weimann
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3772)

Abstract

We introduce a normalized version of the LCS metric as a new local similarity measure for comparing two RNAs. An \(\mathcal{O}(n^{2}m \lg m)\) time algorithm is presented for computing the maximum normalized score of two RNA sequences, where n and m are the lengths of the sequences and n ≥ m. This algorithm has the same time complexity as the currently best known global LCS algorithm.
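
To make the idea of a normalized local LCS score concrete, the following is a minimal brute-force sketch on plain strings. It is not the algorithm of the paper: it ignores RNA arc annotations (secondary structure) entirely and runs far slower than the stated bound. The scoring formula, LCS length divided by the combined substring lengths plus a constant `L`, is an assumption borrowed from the normalized sequence alignment literature, and the names `lcs_length`, `best_normalized_lcs`, and `L` are hypothetical.

```python
from fractions import Fraction


def lcs_length(a: str, b: str) -> int:
    """Classic dynamic program for the LCS length of two plain strings."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def best_normalized_lcs(a: str, b: str, L: int = 0):
    """Try every pair of non-empty substrings and keep the best ratio.

    Assumed score: LCS(a', b') / (|a'| + |b'| + L).
    The constant L (an assumption, not taken from the abstract) penalizes
    very short matches, which would otherwise always win with score 1/2.
    """
    best = (Fraction(0), "", "")
    for i in range(len(a)):
        for j in range(i + 1, len(a) + 1):
            for k in range(len(b)):
                for l in range(k + 1, len(b) + 1):
                    sa, sb = a[i:j], b[k:l]
                    score = Fraction(lcs_length(sa, sb), len(sa) + len(sb) + L)
                    if score > best[0]:
                        best = (score, sa, sb)
    return best


score, sub_a, sub_b = best_normalized_lcs("ACGU", "ACGU", L=2)
print(score, sub_a, sub_b)
```

Exact fractions are used instead of floats so that ties and comparisons between candidate substring pairs are not affected by rounding. The exhaustive search is only practical for very short inputs and serves purely to illustrate what is being maximized.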

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Rolf Backofen (1)
  • Danny Hermelin (2)
  • Gad M. Landau (3, 4)
  • Oren Weimann (2)

  1. Institute of Computer Science, Friedrich-Schiller Universität Jena, Jena Center for Bioinformatics, Germany
  2. Department of Computer Science, University of Haifa, Israel
  3. Department of Computer Science, University of Haifa, Haifa, Israel
  4. Department of Computer and Information Science, Polytechnic University, New York, USA