PSSMTS: position specific scoring matrices on tree structures

Journal of Mathematical Biology

Abstract

Identifying non-coding RNA regions in the genome by computational methods is currently receiving a great deal of attention. The problem is considerably more difficult than detecting protein-coding genes because non-coding RNA regions carry only weak statistical signals. On the other hand, most functional RNA families have conserved sequences and secondary structures that are characteristic of their molecular function in the cell; these are known as sequence motifs and consensus structures, respectively. In this paper, we propose an improved method that extends a pairwise structural alignment method for RNA sequences to handle position specific scoring matrices (PSSMs), which we use to model sequence motifs, and thereby incorporates motifs into the structural alignment of RNA sequences. Experimental results show that PSSMs enable us to find individual RNA families efficiently, especially when biological knowledge such as sequence motifs is available.
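To make the notion of a position specific scoring matrix concrete, the following is a minimal sketch of a generic log-odds PSSM for an RNA sequence motif. It is not the PSSMTS algorithm, which applies PSSMs to structural alignment on tree structures; the function names (`build_pssm`, `score_window`, `best_hit`), the uniform background distribution, the pseudocount of 1 and the toy motif instances are all assumptions made for illustration.

```python
import math

ALPHABET = "ACGU"  # RNA bases; input sequences are assumed to use only these


def build_pssm(aligned_motifs, pseudocount=1.0, background=None):
    """Build a log-odds PSSM from equal-length, gap-free motif instances.

    Returns a list with one dict per motif position, mapping each base to
    log2(observed frequency / background frequency).
    """
    if background is None:
        # Assumption: uniform background over the four bases.
        background = {base: 0.25 for base in ALPHABET}
    length = len(aligned_motifs[0])
    if any(len(motif) != length for motif in aligned_motifs):
        raise ValueError("all motif instances must have the same length")

    pssm = []
    for pos in range(length):
        # Pseudocounts keep unseen bases from producing log(0).
        counts = {base: pseudocount for base in ALPHABET}
        for motif in aligned_motifs:
            counts[motif[pos]] += 1.0
        total = sum(counts.values())
        column = {
            base: math.log2((counts[base] / total) / background[base])
            for base in ALPHABET
        }
        pssm.append(column)
    return pssm


def score_window(pssm, window):
    """Sum the per-position log-odds scores of a window as wide as the PSSM."""
    return sum(column[base] for column, base in zip(pssm, window))


def best_hit(pssm, sequence):
    """Slide the PSSM along the sequence and return (score, start) of the best window."""
    width = len(pssm)
    scored = [
        (score_window(pssm, sequence[start:start + width]), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(scored)


if __name__ == "__main__":
    # Hypothetical motif instances, invented for this example.
    pssm = build_pssm(["GUGA", "GUGA", "GCGA"])
    score, start = best_hit(pssm, "AAAGUGACCC")
    print(f"best window starts at {start} with log-odds score {score:.2f}")
```

A window that matches the conserved positions of the motif receives a positive log-odds score, while unrelated windows score near or below zero, which is what allows a PSSM to encode prior biological knowledge about a motif.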

Author information

Corresponding author

Correspondence to Kengo Sato.

Additional information

K. Sato and K. Morita contributed equally to this work.

About this article

Cite this article

Sato, K., Morita, K. & Sakakibara, Y. PSSMTS: position specific scoring matrices on tree structures. J. Math. Biol. 56, 201–214 (2008). https://doi.org/10.1007/s00285-007-0108-4

  • DOI: https://doi.org/10.1007/s00285-007-0108-4
