Template-Based and Template-Free Modeling of RNA 3D Structure: Inspirations from Protein Structure Modeling

  • Kristian Rother
  • Magdalena Rother
  • Michał Boniecki
  • Tomasz Puton
  • Konrad Tomala
  • Paweł Łukasz
  • Janusz M. Bujnicki
Chapter
Part of the Nucleic Acids and Molecular Biology book series (NUCLEIC, volume 27)

Abstract

In analogy to proteins, the function of RNA depends on its structure and dynamics, which are encoded in the linear sequence. While there are numerous methods for computational prediction of protein 3D structure from sequence, there have so far been very few such methods for RNA. This chapter discusses template-based and template-free approaches for macromolecular structure prediction, with special emphasis on comparison between the tried-and-tested methods for protein structure modeling and the very recently developed “protein-like” modeling methods for RNA. As examples, we briefly review our recently developed tools for RNA 3D structure prediction, including ModeRNA (template-based or comparative/homology modeling) and SimRNA (template-free or de novo modeling). ModeRNA requires as input the atomic 3D coordinates of a template RNA molecule and a user-specified sequence alignment between the target to be modeled and the template. It can model posttranscriptional modifications, a functionally important feature analogous to posttranslational modifications in proteins. It can model the structures of RNAs of essentially any length, provided that a starting template is known. SimRNA can fold RNA 3D structure starting from sequence alone. It is based on a coarse-grained representation of the polynucleotide chains (only three atoms per nucleotide) and uses a Monte Carlo sampling scheme to generate moves in 3D space, with a statistical potential to estimate the free energy. The current implementation, based on simulated annealing, is able to find native-like conformations for RNAs <100 nt in length, with multiple runs required to fold long sequences.
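
The sampling strategy named in the abstract (Monte Carlo moves accepted by an energy criterion, with the temperature lowered by simulated annealing) can be illustrated with a minimal sketch. This is not SimRNA's code: the one-bead-per-residue chain, the `toy_energy` function with its harmonic bond, clash, and contact terms, and every parameter value are hypothetical stand-ins for the three-atom representation and the statistical potential described above.

```python
import math
import random


def toy_energy(coords, bond_length=1.0, contact_dist=1.5):
    """Hypothetical energy: harmonic bonds, clash penalty, contact reward."""
    energy = 0.0
    n = len(coords)
    # Keep consecutive beads close to the ideal bond length.
    for i in range(n - 1):
        d = math.dist(coords[i], coords[i + 1])
        energy += 10.0 * (d - bond_length) ** 2
    # Non-bonded pairs: penalize clashes, reward close contacts.
    for i in range(n):
        for j in range(i + 3, n):
            d = math.dist(coords[i], coords[j])
            if d < 0.8:
                energy += 5.0
            elif d < contact_dist:
                energy -= 1.0
    return energy


def metropolis_accept(delta_e, temperature, rng):
    """Metropolis criterion: always accept downhill, sometimes uphill."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def anneal(n_beads=12, n_steps=5000, t_start=2.0, t_end=0.05, seed=0):
    """Simulated annealing with geometric cooling on the toy chain."""
    rng = random.Random(seed)
    # Start from a fully extended chain along the x axis.
    coords = [(float(i), 0.0, 0.0) for i in range(n_beads)]
    energy = toy_energy(coords)
    initial_energy = energy
    best_energy, best_coords = energy, list(coords)
    cooling = (t_end / t_start) ** (1.0 / max(n_steps - 1, 1))
    temperature = t_start
    for _ in range(n_steps):
        # Propose a random displacement of one bead.
        i = rng.randrange(n_beads)
        old = coords[i]
        coords[i] = tuple(c + rng.uniform(-0.3, 0.3) for c in old)
        new_energy = toy_energy(coords)
        if metropolis_accept(new_energy - energy, temperature, rng):
            energy = new_energy
            if energy < best_energy:
                best_energy, best_coords = energy, list(coords)
        else:
            coords[i] = old  # reject: restore the previous position
        temperature *= cooling
    return initial_energy, best_energy, best_coords


if __name__ == "__main__":
    start, best, _ = anneal()
    print(f"initial energy {start:.2f}, best energy found {best:.2f}")
```

Because each run follows a single stochastic trajectory, repeating `anneal` with different `seed` values and keeping the lowest-energy result mirrors, in spirit, the multiple runs the abstract mentions for longer sequences.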

Keywords

Structure Prediction; Global Energy Minimum; Protein Structure Modeling; Folding Simulation; Monte Carlo Dynamic
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgments

Our work on template-based modeling of RNA structures was supported by the Faculty of Biology, Adam Mickiewicz University (PBWB-03/2009 grant to M.R.), and by the Polish Ministry of Science (PBZ/MNiSW/07/2006 grant to M.B.). Our work on template-free modeling of RNA structures was supported by the Polish Ministry of Science (HISZPANIA/152/2006 grant to J.M.B.) and by the EU (6FP grant “EURASNET,” LSHG-CT-2005-518238). Software development in the Bujnicki laboratory in IIMCB has been supported by the EU structural funds (POIG.02.03.00-00-003/09). K.R. was independently supported by the German Academic Exchange Service (grant D/09/42768).

We thank present and former members of the Bujnicki laboratory in IIMCB and at the UAM, in particular Ewa Wywiał, Pawel Skiba, Piotr Byzia, Irina Tuszynska, Joanna Kasprzak, Jerzy Orlowski, Tomasz Osiński, Marcin Domagalski, Anna Czerwoniec, Stanisław Dunin-Horkawicz, Marcin Skorupski, and Marcin Feder, for their comments and constructive criticism during development of our software. The unit test framework was introduced to us by Sandra Smit, Rob Knight, and Gavin Huttley. Special thanks go to the group of Russ Altman, who provided us with their modeling example to test ModeRNA. We would also like to thank Neocles Leontis for critically reading the manuscript of this chapter, and both him and Magda Jonikas, Fabrice Jossinet, Samuel Flores, Alain Laederach, Francois Major, and Eric Westhof for stimulating discussions and helpful advice on various occasions.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Kristian Rother (1, 2)
  • Magdalena Rother (1, 2)
  • Michał Boniecki (1)
  • Tomasz Puton (1, 2)
  • Konrad Tomala (1)
  • Paweł Łukasz (1)
  • Janusz M. Bujnicki (1, 2)

  1. Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Warsaw, Poland
  2. Laboratory of Structural Bioinformatics, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, Poznan, Poland
