We describe a computer program, SHIFTX, that rapidly and accurately calculates the diamagnetic 1H, 13C and 15N chemical shifts of both backbone and sidechain atoms in proteins. The program uses a hybrid predictive approach that combines pre-calculated, empirically derived chemical shift hypersurfaces with classical or semi-classical equations (for ring current, electric field, hydrogen bond and solvent effects) to calculate 1H, 13C and 15N chemical shifts from atomic coordinates. The hypersurfaces capture dihedral angle, sidechain orientation, secondary structure and nearest-neighbor effects that cannot easily be translated into analytical formulae or predicted by classical means. They were generated from a database of IUPAC-referenced protein chemical shifts, RefDB (Zhang et al., 2003), and a corresponding set of high-resolution (<2.1 Å) X-ray structures. Data mining techniques were used to identify the largest pairwise contributors (from a list of ∼20 derived geometric, sequential and structural parameters), from which the hypersurfaces were built. SHIFTX is rapid (<1 CPU second for a complete shift calculation of 100 residues) and accurate. On test data sets, the program attained correlation coefficients (r) between observed and calculated shifts of 0.911 (1Hα), 0.980 (13Cα), 0.996 (13Cβ), 0.863 (13CO), 0.909 (15N), 0.741 (1HN), and 0.907 (sidechain 1H), with RMS errors of 0.23, 0.98, 1.10, 1.16, 2.43, 0.49, and 0.30 ppm, respectively. We further show that the agreement between observed and SHIFTX-calculated chemical shifts can be an extremely sensitive measure of the quality of protein structures. Our results suggest that if NMR-derived structures could be refined using heteronuclear chemical shifts calculated by SHIFTX, their precision could approach that of the highest resolution X-ray structures. SHIFTX is freely available as a web server at http://redpoll.pharmacy.ualberta.ca.
Keywords: calculation, chemical shift, data mining, NMR, prediction, protein
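The two agreement measures quoted in the abstract, the correlation coefficient (r) and the RMS error between observed and calculated shifts, can be illustrated with a minimal Python sketch. It is not taken from SHIFTX itself: the function names `pearson_r` and `rms_error` are hypothetical, and the shift values in the usage example are invented purely for illustration. The sketch assumes r is the ordinary Pearson correlation coefficient and that observed and calculated shifts have already been paired atom by atom.

```python
import math


def pearson_r(observed, calculated):
    """Pearson correlation coefficient between paired shift lists (ppm)."""
    n = len(observed)
    if n != len(calculated) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_o = sum(observed) / n
    mean_c = sum(calculated) / n
    # Covariance and variances, all left unnormalized (the 1/n factors cancel).
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(observed, calculated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    if var_o == 0 or var_c == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_o * var_c)


def rms_error(observed, calculated):
    """Root-mean-square difference between paired shift lists (ppm)."""
    n = len(observed)
    if n != len(calculated) or n == 0:
        raise ValueError("need two equal-length, non-empty lists")
    return math.sqrt(sum((o - c) ** 2 for o, c in zip(observed, calculated)) / n)


if __name__ == "__main__":
    # Invented 1H-alpha shifts (ppm), for illustration only; not real data.
    observed_ha = [4.32, 4.10, 4.75, 3.98, 4.51]
    calculated_ha = [4.25, 4.18, 4.60, 4.05, 4.40]
    print("r   =", round(pearson_r(observed_ha, calculated_ha), 3))
    print("RMS =", round(rms_error(observed_ha, calculated_ha), 3), "ppm")
```

In practice the two lists would be built separately for each nucleus type (for example 1Hα or 13Cα), since the abstract reports r and RMS error per nucleus.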