Automated error-tolerant macromolecular structure determination from multidimensional nuclear Overhauser enhancement spectra and chemical shift assignments: improved robustness and performance of the PASD algorithm
We report substantial improvements to the previously introduced automated NOE assignment and structure determination protocol known as PASD (Kuszewski et al. (2004) J Am Chem Soc 126:6258–6273). The improved protocol includes extensive analysis of input spectral data to create a low-resolution contact map of residues expected to be close in space. This map is used to obtain reasonable initial guesses of NOE assignment likelihoods, which are refined during subsequent structure calculations. Information in the contact map about which residues are predicted not to be close in space is applied via conservative repulsive distance restraints, which are used in the early phases of the structure calculations. Compared with the previous protocol, the new protocol requires significantly less computation time. We show results of running the new PASD protocol on six proteins and demonstrate that useful assignment and structural information is extracted for proteins of more than 220 residues. We also show that useful assignment information can be obtained even when a unique structure cannot be determined.
Keywords: Automated structure determination; Automated NOE assignment; Xplor-NIH
This work was supported by the CIT (to CDS) and NIDDK (to GMC) Intramural Research Programs of the NIH.
Glossary of terms and symbols
An NOE assignment which contributes to the linear (Pass 1) or quadratic (Pass 2) restraint terms. Whether an assignment is active or inactive is determined from its assignment likelihoods via the procedure described in Section “Determination of active peak assignments”.
An NOE peak with one or more active assignments.
The probability of the correctness of assignment j of peak i. \(\lambda_p\) is the previous likelihood of an assignment based on previously obtained information; in Pass 1 \(\lambda_p\) is denoted \(\lambda_p^n\) and is based on the network contact map, while in Pass 2 previous likelihoods \(\lambda_p^v\) are based on distance violations of the structures calculated in Pass 1. The violation likelihood \(\lambda_v\) is the probability of correctness of an assignment based on distance violations in the current structure. The overall peak assignment likelihood \(\lambda_o\) is a weighted average of previous and violation likelihoods. The assignment likelihood \(\lambda_a\) is used to determine which single assignment to use for a given peak during Pass 2.
The size of chemical shift bins used in the initial assignment procedure. [Section “Shift assignment stripe correction”]
NOE peaks corresponding to intraresidue or backbone sequential connectivities, used for stripe correction and network analysis. [Section “Shift assignment stripe correction”]
Distance used in determining assignment likelihood \(\lambda_v\). Smaller values reduce the likelihood of assignments with large violations. [Eq. 13]
Energy term used in Pass 1 which is linear in NOE violation. [Eq. 6]
The residue pair score between residues a and b, based on connectivities deduced from the initial collection of possible NOE assignments. R′(a,b) is the normalized score used for assigning initial likelihoods; associated assignments are specified as active for R′ > \(R_c\). Larger R′ corresponds to a larger number of connections. [Eqs. 1 and 2]
A specific NOE peak assignment relating a single peak to a pair of assigned chemical shifts.
Weight determining the contribution of \(\lambda_p\) and \(\lambda_v\) to \(\lambda_o\). [Eq. 14]
Energy term used in Pass 2 which is quadratic in NOE violation. [Eq. 10]
Energy term used in Pass 1 which repels atoms associated with shift assignments which are inactive. [Eq. 11]
The fraction of calibration peaks consistent with a particular chemical shift assignment. [Section “Shift assignment stripe correction”]
Two NOE peaks with from- and to- assignments reversed.
The size of chemical shift bins used during peak assignment after the stripe correction procedure. [Section “Shift assignment stripe correction”]
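The glossary refers to Eqs. 6, 10, 13 and 14 without reproducing them, so the following is only a minimal sketch of the relationships it describes. The Gaussian-like decay in `violation_likelihood`, the convention that the weight `w` multiplies the previous likelihood, and all function and parameter names are assumptions made for illustration; they are not taken from the PASD implementation.

```python
import math


def violation_likelihood(violation, d_char):
    """Likelihood that an assignment is correct given its distance violation.

    ASSUMPTION: a Gaussian-like decay. The glossary only states that a
    smaller characteristic distance lowers the likelihood of assignments
    with large violations; the exact form of Eq. 13 is not given here.
    """
    return math.exp(-((violation / d_char) ** 2))


def overall_likelihood(lam_prev, lam_viol, w):
    """Weighted average of the previous and violation likelihoods.

    ASSUMPTION: w multiplies the previous likelihood and (1 - w) the
    violation likelihood; the glossary does not say which is which.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * lam_prev + (1.0 - w) * lam_viol


def linear_penalty(violation):
    """Pass 1 style term: grows linearly with the NOE violation."""
    return abs(violation)


def quadratic_penalty(violation):
    """Pass 2 style term: grows quadratically with the NOE violation."""
    return violation ** 2
```

Under these assumptions, a linear term penalizes a grossly violated (and possibly wrong) assignment far less severely than a quadratic one, which is consistent with the glossary assigning the linear term to the early pass, when assignment likelihoods are still rough guesses.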
- Busam RD, Lehtio L, Arrowsmith CH, Collins R, Dahlgren LG, Edwards AM, Flodin S, Flores A, Graslund S, Hammarstrom M, Hallberg BM, Herman MD, Johansson A, Johansson I, Kallas A, Karlberg T, Kotenyova T, Moche M, Nilsson ME, Nordlund P, Nyman T, Persson C, Sagemark J, Sundstrom M, Svensson L, Thorsell AG, Tresaugues L, Van den Berg S, Weigelt J, Welin M, Berglund H. Crystal structure of human thiamine triphosphatase. To be published
- Billeter M, Braun W, Wüthrich K (1982) Sequential resonance assignments in protein 1H nuclear magnetic resonance spectra: computation of sterically allowed proton-proton distances and statistical analysis of proton-proton distances in single crystal protein conformations. J Mol Biol 155:321–346
- BMRB NMR-STAR Data Dictionary (2004) http://www.bmrb.wisc.edu/dictionary/htmldocs/nmr_star/dictionary.html
- Clore GM, Nilges M, Sukumaran DK, Brünger AT, Karplus M, Gronenborn AM (1986) The three-dimensional structure of α1-purothionin in solution: combined use of nuclear magnetic resonance, distance geometry and restrained molecular dynamics. EMBO J 5:2729–2735
- de Vlieg J, Boelens R, Scheek RM, Kaptein R, van Gunsteren WF (1986) Restrained molecular dynamics procedure for protein tertiary structure determination from NMR data: a lac repressor headpiece structure based on information on J-coupling and from presence and absence of NOEs. Isr J Chem 27:181–188
- Garrett DS, Powers R, Gronenborn AM, Clore GM (1991) A common sense approach to peak picking two-, three- and four-dimensional spectra using automatic computer analysis of contour diagrams. J Magn Reson 95:214–220
- Kuszewski J, Schwieters CD, Garrett DS, Byrd RA, Tjandra N, Clore GM (2004) Completely automated, highly error-tolerant macromolecular structure determination from multidimensional nuclear Overhauser enhancement spectra and chemical shift assignments. J Am Chem Soc 126:6258–6273
- Nilges M, Gronenborn AM, Brünger AT, Clore GM (1988) Determination of three-dimensional structures of proteins by simulated annealing with interproton distance restraints: application to crambin, potato carboxypeptidase inhibitor and barley serine proteinase inhibitor 2. Protein Eng 2:27–38
- Ramelot TA, Cort JR, Yee AA, Guido V, Lukin JA, Arrowsmith CH, Kennedy MA. To be published
- Rieping W, Habeck M, Bardiaux B, Bernard A, Malliavin TE, Nilges M (2007) ARIA2: automated NOE assignment and data integration in NMR structure calculation. Bioinformatics 23:381–382
- Shen Y, Lange O, Delaglio F, Rossi P, Aramini JM, Liu G, Eletsky A, Wu Y, Singarapu KK, Lemak A, Ignatchenko A, Arrowsmith CH, Szyperski T, Montelione GT, Baker D, Bax A (2008) Consistent blind protein structure generation from NMR chemical shift data. Proc Natl Acad Sci USA 105:4685–4690
- Tang C, Iwahara J, Clore GM (2005) Accurate determination of leucine and valine side-chain conformations using U-[15N/13C/2H]/[1H-(methyl/methine)-Leu/Val] isotope labeling, NOE pattern recognition and methine CγHγ/CβHβ residual dipolar couplings: application to the 34 kDa enzyme IIAChitobiose. J Biomol NMR 33:105–121
- Wilcox GR, Fogh RH, Norton RS (1993) Refined structure in solution of the sea anemone neurotoxin ShI. J Biol Chem 268:24707–24719
- Yee A, Chang X, Pineda-Lucena A, Wu B, Semesi A, Le B, Ramelot T, Lee GM, Bhattacharyya S, Gutierrez P, Denisov A, Lee CH, Cort JR, Kozlov G, Liao J, Finak G, Chen L, Wishart D, Lee W, McIntosh LP, Gehring K, Kennedy MA, Edwards AM, Arrowsmith CH (2002) An NMR approach to structural proteomics. Proc Natl Acad Sci USA 99:1825–1830