Hash: a program to accurately predict protein Hα shifts from neighboring backbone shifts

  • Article
  • Journal of Biomolecular NMR

Abstract

Chemical shifts provide not only peak identities for analyzing nuclear magnetic resonance (NMR) data, but also an important source of conformational information for studying protein structures. Current structural studies that require Hα chemical shifts suffer from two limitations. (1) For large proteins, Hα chemical shifts can be difficult to assign using conventional triple-resonance NMR experiments, mainly because the fast transverse relaxation of Cα limits signal sensitivity. (2) Previous chemical shift prediction approaches either require homologous models with high sequence similarity or rely heavily on accurate backbone and side-chain structural coordinates. When neither sequence homologues nor structural coordinates are available, other information must be used to predict Hα chemical shifts. Predicting accurate Hα chemical shifts from other obtainable information, such as the chemical shifts of nearby backbone atoms (i.e., atoms adjacent in the sequence), can address both limitations and hence advance NMR-based structural studies of proteins. By specifically exploiting the dependencies on chemical shifts of nearby backbone atoms, we propose a novel machine learning algorithm, called Hash, to predict Hα chemical shifts. Hash combines a new fragment-based chemical shift search approach with a non-parametric regression model, the generalized additive model, to solve the prediction problem. We demonstrate that the chemical shifts of nearby backbone atoms provide a reliable source of information for predicting accurate Hα chemical shifts. Our test results on different combinations of input data indicate that Hash has a wide range of potential NMR applications in structural and biological studies of proteins.
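The abstract names a generalized additive model as the regression component of Hash but does not give its formulation. As a rough illustration only, the sketch below fits a generic additive model, y ≈ α + f1(x1) + f2(x2), by backfitting with a Gaussian kernel smoother. The smoother, the bandwidth, the function names and the synthetic data are all assumptions made for this example; none of them is taken from Hash, and the two predictors merely stand in for neighboring backbone shifts.

```python
import math
import random


def kernel_smooth(x, r, bandwidth):
    """Nadaraya-Watson smoother: Gaussian-weighted average of r at each x[i]."""
    fitted = []
    for xi in x:
        weights = [math.exp(-0.5 * ((xi - xk) / bandwidth) ** 2) for xk in x]
        total = sum(weights)
        fitted.append(sum(w * rk for w, rk in zip(weights, r)) / total)
    return fitted


def fit_additive_model(columns, y, bandwidth=0.3, n_iter=10):
    """Fit y ~ alpha + sum_j f_j(x_j) by backfitting (hypothetical helper).

    columns: list of predictor columns, each a list of floats.
    Returns alpha and, for each column j, the fitted values f_j(x_ij).
    """
    n = len(y)
    alpha = sum(y) / n
    f = [[0.0] * n for _ in columns]
    for _ in range(n_iter):
        for j, xj in enumerate(columns):
            # Partial residual: remove the intercept and all other components.
            partial = [
                y[i] - alpha - sum(f[k][i] for k in range(len(columns)) if k != j)
                for i in range(n)
            ]
            smoothed = kernel_smooth(xj, partial, bandwidth)
            mean = sum(smoothed) / n
            # Centre each component so the intercept stays identifiable.
            f[j] = [s - mean for s in smoothed]
    return alpha, f


def rmse(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)) / len(a))


# Synthetic data only: two made-up predictors and a made-up additive target.
# These are NOT real chemical shifts.
rng = random.Random(0)
n = 200
x1 = [rng.uniform(-2, 2) for _ in range(n)]
x2 = [rng.uniform(-2, 2) for _ in range(n)]
y = [math.sin(a) + 0.5 * b * b + rng.gauss(0, 0.05) for a, b in zip(x1, x2)]

alpha, f = fit_additive_model([x1, x2], y)
fitted = [alpha + f[0][i] + f[1][i] for i in range(n)]
baseline = [alpha] * n  # intercept-only model, for comparison

print("additive model RMSE:", rmse(fitted, y))
print("intercept-only RMSE:", rmse(baseline, y))
```

Each component is re-estimated from the residual left by the others, which is what lets an additive model capture a separate nonlinear effect per predictor without assuming a parametric form. A production fit would normally rely on an established package rather than this hand-written loop.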

Acknowledgments

We thank all members of the Donald and Zhou labs for helpful discussions and comments. This work is supported by the following grants from the National Institutes of Health: R01 GM-65982 to B.R.D. and R01 GM-079376 to P.Z.

Author information

Corresponding authors

Correspondence to Jianyang Zeng or Bruce Randall Donald.

Additional information

Hash is distributed as open source under the GNU Lesser General Public License (Gnu 2002). The source code is freely available and can be obtained by contacting the authors.

About this article

Cite this article

Zeng, J., Zhou, P. & Donald, B.R. Hash: a program to accurately predict protein Hα shifts from neighboring backbone shifts. J Biomol NMR 55, 105–118 (2013). https://doi.org/10.1007/s10858-012-9693-7

  • DOI: https://doi.org/10.1007/s10858-012-9693-7
