Journal of Biomolecular NMR, Volume 63, Issue 1, pp 39–52

Prediction of hydrogen and carbon chemical shifts from RNA using database mining and support vector regression

  • Joshua D. Brown
  • Michael F. Summers
  • Bruce A. Johnson


The Biological Magnetic Resonance Data Bank (BMRB) contains NMR chemical shift depositions for over 200 RNAs and RNA-containing complexes. We have analyzed the 1H NMR and 13C chemical shifts reported for non-exchangeable protons of 187 of these RNAs. Software was developed that downloads BMRB datasets and corresponding PDB structure files, and then generates residue-specific attributes based on the calculated secondary structure. Attributes represent properties present in each sequential stretch of five adjacent residues and include variables such as nucleotide type, base-pair presence and type, and tetraloop types. Attributes and 1H and 13C NMR chemical shifts of the central nucleotide are then used as input to train a predictive model using support vector regression. These models can then be used to predict shifts for new sequences. The new software tools, available as stand-alone scripts or integrated into the NMR visualization and analysis program NMRViewJ, should facilitate NMR assignment and/or validation of RNA 1H and 13C chemical shifts. In addition, our findings enabled the re-calibration of a ring-current shift model using published NMR chemical shifts and high-resolution X-ray structural data as guides.
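The five-residue attribute window described above can be illustrated with a minimal sketch. The encoding below is an assumption made for illustration only, not the authors' attribute set: the function names, the dot-bracket input, the padding symbol, and the seven flags per position (four nucleotide types, paired, Watson-Crick, G·U wobble) are all hypothetical, and tetraloop attributes are left out. Each resulting vector, paired with the measured shift of the central nucleotide, could serve as one training example for a support vector regression implementation.

```python
from typing import List

NUCLEOTIDES = "ACGU"
PAD = "-"  # hypothetical symbol for window positions past the sequence ends
WATSON_CRICK = {"GC", "CG", "AU", "UA"}
WOBBLE = {"GU", "UG"}


def pairing_partners(dot_bracket: str) -> List[int]:
    """Return the partner index of each residue, or -1 if it is unpaired."""
    partners = [-1] * len(dot_bracket)
    stack = []
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            j = stack.pop()
            partners[i], partners[j] = j, i
    return partners


def window_attributes(seq: str, dot_bracket: str, center: int,
                      half_width: int = 2) -> List[int]:
    """Encode the window around `center` as a flat list of 0/1 flags.

    With half_width=2 the window spans five residues, and each position
    contributes seven flags: A, C, G, U, paired, Watson-Crick, wobble.
    """
    partners = pairing_partners(dot_bracket)
    features: List[int] = []
    for pos in range(center - half_width, center + half_width + 1):
        inside = 0 <= pos < len(seq)
        nucleotide = seq[pos] if inside else PAD
        features.extend(1 if nucleotide == n else 0 for n in NUCLEOTIDES)
        partner = partners[pos] if inside else -1
        pair = seq[pos] + seq[partner] if partner >= 0 else ""
        features.append(1 if partner >= 0 else 0)
        features.append(1 if pair in WATSON_CRICK else 0)
        features.append(1 if pair in WOBBLE else 0)
    return features


if __name__ == "__main__":
    # Toy hairpin: three G-C pairs closing a three-residue loop.
    print(window_attributes("GGGAAACCC", "(((...)))", center=4))
```

Padding positions are encoded as all zeros, so residues near the 5' or 3' end still yield vectors of the same length as interior residues.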


Keywords: RNA; Chemical shift; Secondary structure; NMR signal assignment and validation



This research was supported by grants from the National Institute of General Medical Sciences of the National Institutes of Health (NIGMS, R01 GM42561 to MFS and P50 GM 103297 to BAJ), and JDB was supported by an NIGMS grant for maximizing student diversity, NIGMS R25 GM 055036. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Supplementary material

10858_2015_9961_MOESM1_ESM.pdf (790 kb)
Supplementary material 1 (PDF 789 kb)



Copyright information

© Springer Science+Business Media Dordrecht 2015

Authors and Affiliations

  1. Howard Hughes Medical Institute, University of Maryland Baltimore County, Baltimore, USA
  2. Department of Chemistry and Biochemistry, University of Maryland Baltimore County, Baltimore, USA
  3. One Moon Scientific, Inc., Westfield, USA
  4. CUNY Advanced Science Research Center, New York, USA
