Integrating Thermodynamic and Observed-Frequency Data for Non-coding RNA Gene Search
Covariance models are among the most powerful and commonly used methods for finding new members of non-coding RNA gene families in genomic data. The parameters of these models are estimated from the observed position-specific frequencies of insertions, deletions, and mutations in a multiple alignment of known non-coding RNA family members. Because the vast majority of positions in the multiple alignment show no observed changes, yet there is no reason to rule such changes out, some form of prior is applied to the estimate. Currently, observed-frequency priors are generated from non-family members based on model node type and child node type, allowing for some differentiation between priors for loops versus helices and between internal segments of structures and edges of structures. This work shows that parameter estimates might be improved when thermodynamic data are combined with the consensus structure/sequence and observed-frequency priors to create more realistic position-specific priors.
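To make the role of the prior concrete, the following is a minimal sketch of how pseudocounts keep unobserved events from receiving zero probability. It is not the paper's method. It assumes a single Dirichlet prior whose posterior mean is (count + pseudocount) / (total count + total pseudocount). The function names `posterior_mean` and `blend_priors`, the convex-combination blending rule, and all numeric values are hypothetical and chosen only for illustration.

```python
def posterior_mean(counts, pseudocounts):
    """Posterior-mean estimate under a single Dirichlet prior:
    p_i = (c_i + a_i) / (sum(c) + sum(a))."""
    if len(counts) != len(pseudocounts):
        raise ValueError("counts and pseudocounts must have the same length")
    total = sum(counts) + sum(pseudocounts)
    if total <= 0:
        raise ValueError("total of counts and pseudocounts must be positive")
    return [(c + a) / total for c, a in zip(counts, pseudocounts)]


def blend_priors(generic, position_specific, weight):
    """Hypothetical blending rule (an assumption, not taken from the paper):
    a convex combination of a generic observed-frequency pseudocount vector
    and a position-specific pseudocount vector."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(generic) != len(position_specific):
        raise ValueError("prior vectors must have the same length")
    return [(1.0 - weight) * g + weight * p
            for g, p in zip(generic, position_specific)]


# One alignment column in which all 10 family members show 'A' (order: A, C, G, U).
counts = [10, 0, 0, 0]

# Illustrative generic pseudocounts, identical for every position.
generic_prior = [1.0, 1.0, 1.0, 1.0]

# Illustrative position-specific pseudocounts, standing in for values that
# might be derived from thermodynamic data (here 'G' is made more plausible).
thermo_prior = [0.5, 0.5, 2.5, 0.5]

probs_generic = posterior_mean(counts, generic_prior)
probs_blended = posterior_mean(
    counts, blend_priors(generic_prior, thermo_prior, 0.5)
)
```

With the generic prior alone, the three unobserved nucleotides receive equal probability. With the blended prior, the unobserved nucleotide favored by the position-specific pseudocounts receives more probability than the others, which is the kind of differentiation a position-specific prior is meant to provide.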
Keywords: Bioinformatics, Covariance models, Non-coding RNA gene search, RNA secondary structure, Database search