Structural Alignment of Pseudoknotted RNA

  • Banu Dost
  • Buhm Han
  • Shaojie Zhang
  • Vineet Bafna
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3909)

Abstract

In this paper, we address the problem of discovering novel non-coding RNA (ncRNA) using primary sequence and secondary structure conservation, focusing on ncRNA families with pseudoknotted structures. Our main technical result is an efficient algorithm for computing an optimum structural alignment of an RNA sequence against a genomic substring. This algorithm has two applications. First, by scanning a genome, we can identify novel (homologous) pseudoknotted ncRNA; second, we can infer the secondary structure of the target aligned sequence. We test an implementation of our algorithm (Pal) and show that it has near-perfect behavior for predicting the structure of many known pseudoknots. Additionally, it can detect the true homologs with high sensitivity and specificity in controlled tests. We also use Pal to search entire viral genomes and the mouse genome for novel homologs of some viral and eukaryotic pseudoknots, respectively. In each case, we have found strong support for novel homologs.
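
To make the term "pseudoknotted" concrete: a secondary structure is commonly called pseudoknotted when at least two of its base pairs cross, that is, pairs (i, j) and (k, l) with i < k < j < l. The sketch below only checks that condition on a list of base-pair positions. It is not the Pal alignment algorithm described in the abstract, and the function names and the example structures are hypothetical, chosen purely for illustration.

```python
from itertools import combinations


def crossing_pairs(pairs):
    """Return every pair of base pairs ((i, j), (k, l)) with i < k < j < l.

    `pairs` is an iterable of (position, position) tuples; the representation
    is an assumption made for this illustration.
    """
    # Normalize each base pair so the smaller index comes first, then sort
    # so that in every combination the first pair opens no later than the second.
    normalized = sorted((min(a, b), max(a, b)) for a, b in pairs)
    crossings = []
    for (i, j), (k, l) in combinations(normalized, 2):
        # Crossing: the second pair opens inside the first and closes outside it.
        if i < k < j < l:
            crossings.append(((i, j), (k, l)))
    return crossings


def is_pseudoknotted(pairs):
    """True if the structure contains at least one pair of crossing base pairs."""
    return bool(crossing_pairs(pairs))


# Hypothetical toy structures (0-based positions), not taken from the paper.
nested = [(0, 9), (1, 8), (2, 7)]               # a plain hairpin stem
h_type = [(0, 10), (1, 9), (5, 15), (6, 14)]    # two stems that interleave

print(is_pseudoknotted(nested))   # nested pairs never cross
print(is_pseudoknotted(h_type))   # interleaved stems cross
```

The quadratic pairwise check is chosen for readability only; it says nothing about how the paper's algorithm represents or aligns structures.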

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Banu Dost (1)
  • Buhm Han (1)
  • Shaojie Zhang (1)
  • Vineet Bafna (1)
  1. Department of Computer Science and Engineering, University of California, San Diego, La Jolla, USA
