Abstract
A good scoring function is necessary for ab initio prediction of RNA tertiary structures. In this study, we explored the power of a machine-learning-based approach as a scoring function. Compared with traditional scoring functions, the present approach is more flexible in incorporating different kinds of features; it also avoids the difficult problem of choosing a reference state. Two multi-layer neural networks were constructed and trained. Each took an RNA structural candidate as input and output a likeness score that evaluates how closely the candidate resembles the native structure. The first network worked at the coarse-grained level of RNA structures, and the second at the all-atom level. We also built an RNA database and split it into training, validation, and testing sets containing 322, 70, and 70 RNAs, respectively. Each RNA was accompanied by 300 decoys generated by high-temperature molecular dynamics simulations. The networks were trained on the training set and optimized with an early-stop strategy based on the loss on the validation set. We then tested the performance of the networks on the testing set. The results were consistently better than those of a recent knowledge-based all-atom potential.
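The data split and early-stop strategy described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the 462 RNAs were assigned to the three sets or what stopping criterion was used, so the seeded random shuffle, the `patience` parameter, and the function names `split_dataset` and `early_stopping` are all assumptions made for illustration, not the authors' procedure.

```python
import random


def split_dataset(ids, n_train=322, n_val=70, n_test=70, seed=0):
    """Split RNA identifiers into training, validation, and testing lists.

    The set sizes follow the abstract; the seeded random shuffle is an
    assumption, since the abstract does not describe how RNAs were assigned.
    """
    if len(ids) != n_train + n_val + n_test:
        raise ValueError("split sizes must sum to the number of RNAs")
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def early_stopping(val_losses, patience=3):
    """Return (best_epoch, stop_epoch) for per-epoch validation losses.

    Training is considered stopped once the validation loss has failed to
    improve for `patience` consecutive epochs (a hypothetical criterion).
    The model saved at `best_epoch` would be the one kept for testing.
    """
    best_epoch, best_loss = -1, float("inf")
    stop_epoch = -1
    for epoch, loss in enumerate(val_losses):
        stop_epoch = epoch
        if loss < best_loss:
            # New best validation loss: remember this epoch.
            best_epoch, best_loss = epoch, loss
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop here.
            break
    return best_epoch, stop_epoch


if __name__ == "__main__":
    rna_ids = ["rna_%03d" % i for i in range(462)]  # placeholder identifiers
    train, val, test = split_dataset(rna_ids)
    print(len(train), len(val), len(test))
    # Made-up validation losses, only to exercise the stopping rule.
    print(early_stopping([0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.5], patience=3))
```

In a real training loop the validation loss would be computed after each epoch on the 70 validation RNAs and their decoys, and the network weights from the best epoch would be evaluated on the testing set.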
Additional information
The text was submitted by the author(s) in English.
These authors contributed equally.
About this article
Cite this article
Wang, Y.Z., Li, J., Zhang, S. et al. An RNA Scoring Function for Tertiary Structure Prediction Based on Multi-Layer Neural Networks. Mol Biol 53, 118–126 (2019). https://doi.org/10.1134/S0026893319010175
DOI: https://doi.org/10.1134/S0026893319010175