PSSMTS: position specific scoring matrices on tree structures

Sato, Kengo; Morita, Kensuke; Sakakibara, Yasubumi

doi:10.1007/s00285-007-0108-4

Published: 07 July 2007

Volume 56, pages 201–214, (2008)

Journal of Mathematical Biology

Kengo Sato¹,
Kensuke Morita² &
Yasubumi Sakakibara²

Abstract

Identifying non-coding RNA regions on the genome using computational methods is currently receiving a lot of attention. In general, it is essentially more difficult than the problem of detecting protein-coding genes because non-coding RNA regions have only weak statistical signals. On the other hand, most functional RNA families have conserved sequences and secondary structures which are characteristic of their molecular function in a cell. These are known as sequence motifs and consensus structures, respectively. In this paper, we propose an improved method which extends a pairwise structural alignment method for RNA sequences to handle position specific scoring matrices and hence to incorporate motifs into structural alignment of RNA sequences. To model sequence motifs, we employ position specific scoring matrices (PSSMs). Experimental results show that PSSMs enable us to find individual RNA families efficiently, especially if we have biological knowledge such as sequence motifs.
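
As a rough illustration of the PSSM concept the abstract relies on, the sketch below builds a log-odds matrix from a handful of ungapped motif instances and slides it along a sequence. It covers only the plain, sequence-level PSSM; it is not the tree-structured extension proposed in the paper, whose details are not given on this page. The function names, the uniform background, the pseudocount of 1, and the toy motif instances are all assumptions made for the example.

```python
import math

ALPHABET = "ACGU"  # RNA alphabet; assumed for this sketch


def build_pssm(instances, pseudocount=1.0, background=None):
    """Build a log-odds PSSM from equal-length, ungapped motif instances."""
    if background is None:
        # Assumption: uniform background base frequencies.
        background = {base: 0.25 for base in ALPHABET}
    length = len(instances[0])
    if any(len(seq) != length for seq in instances):
        raise ValueError("all motif instances must have the same length")
    pssm = []
    for pos in range(length):
        # Start every count at the pseudocount so no probability is zero.
        counts = {base: pseudocount for base in ALPHABET}
        for seq in instances:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        # One column: log2 of (observed frequency / background frequency).
        pssm.append({base: math.log2(counts[base] / total / background[base])
                     for base in ALPHABET})
    return pssm


def score_window(pssm, window):
    """Sum the per-position log-odds scores of a window of motif length."""
    return sum(column[base] for column, base in zip(pssm, window))


def best_hit(pssm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pssm)
    return max((score_window(pssm, sequence[i:i + width]), i)
               for i in range(len(sequence) - width + 1))


# Toy, invented motif instances; not data from the paper.
motif_instances = ["GGUUC", "GGUUC", "GAUUC", "GGUAC"]
pssm = build_pssm(motif_instances)
score, start = best_hit(pssm, "ACGGUUCGA")
```

Because each column scores a position independently, a window's score is a simple sum, which is what makes a PSSM cheap to evaluate inside a larger alignment procedure. How the paper combines such scores with structural alignment is not described here.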

Author information

Authors and Affiliations

Japan Biological Informatics Consortium, 2-45 Aomi, Koto-ku, Tokyo, 135-8073, Japan
Kengo Sato
Department of Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama, 223-8522, Japan
Kensuke Morita & Yasubumi Sakakibara

Corresponding author

Correspondence to Kengo Sato.

Additional information

K. Sato and K. Morita contributed equally to this work.

About this article

Cite this article

Sato, K., Morita, K. & Sakakibara, Y. PSSMTS: position specific scoring matrices on tree structures. J. Math. Biol. 56, 201–214 (2008). https://doi.org/10.1007/s00285-007-0108-4

Received: 01 January 2007
Revised: 15 May 2007
Published: 07 July 2007
Issue Date: January 2008
DOI: https://doi.org/10.1007/s00285-007-0108-4

Mathematics Subject Classification (2000)

92B05
