We have determined refined multidimensional chemical shift ranges for intra-residue correlations (13C–13C, 15N–13C, etc.) in proteins, which can be used to gain type-assignment and/or secondary-structure information from experimental NMR spectra. The chemical-shift ranges are the result of a statistical analysis of the PACSY database of >3000 proteins with 3D structures (1,200,207 13C chemical shifts and >3 million chemical shifts in total); these data were originally derived from the Biological Magnetic Resonance Data Bank. Using relatively simple non-parametric statistics to find peak maxima in the distributions of helix, sheet, coil and turn chemical shifts, and without the use of limited “hand-picked” data sets, we show that ~94 % of the 13C NMR data and almost all 15N data are quite accurately referenced and assigned, with smaller standard deviations (0.2 and 0.8 ppm, respectively) than recognized previously. On the other hand, approximately 6 % of the 13C chemical shift data in the PACSY database are shown to be clearly misreferenced, mostly by ca. −2.4 ppm. The removal of the misreferenced data and other outliers by this purging by intrinsic quality criteria (PIQC) allows for reliable identification of secondary maxima in the two-dimensional chemical-shift distributions already pre-separated by secondary structure. We demonstrate that some of these correspond to specific regions in the Ramachandran plot, including left-handed helix dihedral angles, reflect unusual hydrogen bonding, or are due to the influence of a following proline residue. With appropriate smoothing, significantly more tightly defined chemical shift ranges are obtained for each amino acid type in the different secondary structures. 
These chemical shift ranges, which may be defined at any statistical threshold, can be used for amino-acid type assignment and secondary-structure analysis of chemical shifts from intra-residue cross peaks by inspection or by using a provided command-line Python script (PLUQin), which should be useful in protein structure determination. The refined chemical shift distributions are utilized in a simple quality test (SQAT) that should be applied to new protein NMR data before deposition in a databank, and they could benefit many other chemical-shift based tools.
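The referencing check described above can be illustrated with a minimal sketch. This is not the PIQC or SQAT procedure itself: it assumes a small table of typical shifts per residue and atom type, ignores secondary structure, and flags a data set when the median deviation from that table exceeds a tolerance. The names `REFERENCE_PPM`, `estimate_offset`, `is_misreferenced`, the 1.0 ppm tolerance, and all numerical values in the table are hypothetical placeholders chosen for illustration, not values from this work.

```python
from statistics import median

# Illustrative placeholder reference shifts in ppm (assumed values, NOT taken
# from the paper or from PACSY). Keys are (residue type, atom name).
REFERENCE_PPM = {
    ("ALA", "CA"): 53.0,
    ("ALA", "CB"): 19.0,
    ("GLY", "CA"): 45.0,
    ("SER", "CA"): 58.5,
    ("SER", "CB"): 64.0,
}


def estimate_offset(shifts, reference=REFERENCE_PPM):
    """Return the median deviation (observed - reference) in ppm.

    `shifts` is an iterable of (residue, atom, ppm) tuples. The median is
    used because it is robust against a few misassigned peaks.
    """
    deviations = [
        ppm - reference[(res, atom)]
        for (res, atom, ppm) in shifts
        if (res, atom) in reference
    ]
    if not deviations:
        raise ValueError("no shifts overlap with the reference table")
    return median(deviations)


def is_misreferenced(shifts, tolerance_ppm=1.0, reference=REFERENCE_PPM):
    """Flag a data set whose median offset exceeds the assumed tolerance."""
    return abs(estimate_offset(shifts, reference)) > tolerance_ppm


# Synthetic example: a consistent data set and a copy shifted by -2.4 ppm,
# the most common misreferencing offset mentioned in the abstract.
good = [("ALA", "CA", 53.2), ("ALA", "CB", 18.9), ("GLY", "CA", 45.1)]
shifted = [(res, atom, ppm - 2.4) for (res, atom, ppm) in good]

print(is_misreferenced(good))     # consistent data set
print(is_misreferenced(shifted))  # data set offset by -2.4 ppm
```

A realistic check would compare against the secondary-structure-specific distributions rather than a single value per atom type, since helix and sheet shifts for the same atom can differ by several ppm.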
Keywords: Protein chemical shift; Databases; Protein secondary structure; Data mining; PIQC; PACSY; PLUQin; SQAT
K. S. R. gratefully acknowledges Brandeis University for support. This work was partly supported by NIH Grant GM066976 to M. H.