# Automated error-tolerant macromolecular structure determination from multidimensional nuclear Overhauser enhancement spectra and chemical shift assignments: improved robustness and performance of the PASD algorithm

## Abstract

We report substantial improvements to the previously introduced automated NOE assignment and structure determination protocol known as PASD (Kuszewski et al. (2004) J Am Chem Soc 126:6258–6273). The improved protocol includes extensive analysis of input spectral data to create a low-resolution contact map of residues expected to be close in space. This map is used to obtain reasonable initial guesses of NOE assignment likelihoods, which are refined during subsequent structure calculations. Information in the contact map about which residues are predicted not to be close in space is applied via conservative repulsive distance restraints, which are used in the early phases of the structure calculations. In comparison with the previous protocol, the new protocol requires significantly less computation time. We show results of running the new PASD protocol on six proteins and demonstrate that useful assignment and structural information is extracted for proteins of more than 220 residues. We also show that useful assignment information can be obtained even when a unique structure cannot be determined.

## Keywords

Automated structure determination, Automated NOE assignment, Xplor-NIH

## Notes

### Acknowledgements

This work was supported by the CIT (to CDS) and NIDDK (to GMC) Intramural Research Programs of the NIH.

## Glossary of terms and symbols

Active assignment

An NOE assignment which contributes to the linear (Pass 1) or quadratic (Pass 2) restraint terms. Whether an assignment is active or inactive is determined from its assignment likelihoods via the procedure described in Section “Determination of active peak assignments”.

Active peak

An NOE peak with one or more active assignments.

λ(*i*,*j*)

The probability of the correctness of assignment *j* of peak *i*. λ_{ p } is the previous likelihood of an assignment based on previously obtained information; in Pass 1 λ_{ p } is denoted \(\lambda_p^n\) and is based on the network contact map, while in Pass 2 previous likelihoods \(\lambda_p^v\) are based on distance violations of the structures calculated in Pass 1. The violation likelihood λ_{ v } is the probability of correctness of an assignment based on distance violations in the current structure. The overall peak assignment likelihood λ_{ o } is a weighted average of previous and violation likelihoods. The assignment likelihood λ_{ a } is used to determine which single assignment to use for a given peak during Pass 2.

_{B}

The size of chemical shift bins used in the initial assignment procedure. [Section “Shift assignment stripe correction”]

Calibration peaks

NOE peaks corresponding to intraresidue or backbone sequential connectivities, used for stripe correction and network analysis. [Section “Shift assignment stripe correction”]

*r*_{c}

Distance used in determining assignment likelihood λ_{ v }. Smaller values reduce the likelihood of assignments with large violations. [Eq. 13]

*E*_{lin}

Energy term used in Pass 1 which is linear in NOE violation. [Eq. 6]

*R*(*a*,*b*)

The residue pair score between residues *a* and *b*, based on connectivities deduced from the initial collection of possible NOE assignments. *R*′(*a*,*b*) is the normalized score used for assigning initial likelihoods; associated assignments are specified as active for *R*′ > *R* _{ c }. Larger *R*′ corresponds to a larger number of connections. [Eqs. 1 and 2]

A specific NOE peak assignment relating a single peak to a pair of assigned chemical shifts.

*w*_{p}

Weight determining the contribution of λ_{ p } and λ_{ v } to λ_{ o }. [Eq. 14]

*E*_{quad}

Energy term used in Pass 2 which is quadratic in NOE violation. [Eq. 10]

*E*_{repul}

Energy term used in Pass 1 which repels atoms associated with shift assignments which are inactive. [Eq. 11]

*C*

The fraction of calibration peaks consistent with a particular chemical shift assignment. [Section “Shift assignment stripe correction”]

Two NOE peaks with from- and to- assignments reversed.

_{T}

The size of chemical shift bins used during peak assignment after the stripe correction procedure. [Section “Shift assignment stripe correction”]
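Two of the glossary relations can be illustrated with a minimal Python sketch. The glossary states only that λ_{ o } is a weighted average of λ_{ p } and λ_{ v } with weight *w*_{p}, and that assignments are specified as active for *R*′ > *R*_{c}; Eqs. 2 and 14 themselves are not reproduced here. The convex form `w_p * lam_p + (1 - w_p) * lam_v` and the function and parameter names below are therefore assumptions made for illustration, not the Xplor-NIH implementation.

```python
def overall_likelihood(lam_p: float, lam_v: float, w_p: float) -> float:
    """Overall likelihood as a weighted average of the previous likelihood
    (lam_p) and the violation likelihood (lam_v).

    Assumption: the weights are w_p and (1 - w_p), so the result stays
    within [0, 1] whenever both inputs do.
    """
    for name, value in (("lam_p", lam_p), ("lam_v", lam_v), ("w_p", w_p)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return w_p * lam_p + (1.0 - w_p) * lam_v


def initially_active(r_norm: float, r_cut: float) -> bool:
    """Assignments associated with a residue pair are treated as active
    when the normalized residue pair score R' exceeds the cutoff R_c."""
    return r_norm > r_cut


if __name__ == "__main__":
    # Equal weighting of prior and violation information.
    print(overall_likelihood(lam_p=0.8, lam_v=0.2, w_p=0.5))
    # A residue pair whose normalized score is above the cutoff.
    print(initially_active(r_norm=0.6, r_cut=0.5))
```

With *w*_{p} = 1 the overall likelihood reduces to the previous likelihood alone, and with *w*_{p} = 0 it depends only on the distance violations in the current structure.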

## References

- Bartels C, Xia TH, Billeter M, Güntert P, Wüthrich K (1995) The program XEASY for computer-supported NMR spectral analysis of biological macromolecules. J Biomol NMR 6:1–10
- Bax A, Kontaxis G, Tjandra N (2001) Dipolar couplings in macromolecular structure determination. Meth Enzymol 339:127–174
- Busam RD, Lehtio L, Arrowsmith CH, Collins R, Dahlgren LG, Edwards AM, Flodin S, Flores A, Graslund S, Hammarstrom M, Hallberg BM, Herman MD, Johansson A, Johansson I, Kallas A, Karlberg T, Kotenyova T, Moche M, Nilsson ME, Nordlund P, Nyman T, Persson C, Sagemark J, Sundstrom M, Svensson L, Thorsell AG, Tresaugues L, Van den Berg S, Weigelt J, Welin M, Berglund H Crystal structure of human thiamine triphosphatase. To be published
- Bewley CA, Gustafson KR, Boyd MR, Covell DG, Bax A, Clore GM, Gronenborn AM (1998) Solution structure of cyanovirin-N, a potent HIV-inactivating protein. Nat Struct Biol 5:571–578
- Billeter M, Braun W, Wüthrich K (1982) Sequential resonance assignments in protein ^{1}H nuclear magnetic resonance spectra: computation of sterically allowed proton-proton distances and statistical analysis of proton-proton distances in single crystal protein conformations. J Mol Biol 155:321–346
- BMRB NMR-STAR Data Dictionary (2004) http://www.bmrb.wisc.edu/dictionary/htmldocs/nmr_star/dictionary.html
- Brüschweiler R, Blackledge M, Ernst RR (1991) Multi-conformational peptide dynamics derived from NMR data: a new search algorithm and its application to antamanide. J Biomol NMR 1:3–11
- Cavalli A, Salvatella X, Dobson CM, Vendruscolo M (2007) Protein structure determination from NMR chemical shifts. Proc Natl Acad Sci USA 104:9615–9620
- Clore GM, Gronenborn AM (1989) Determination of three-dimensional structures of proteins and nucleic acids in solution by nuclear magnetic resonance spectroscopy. Crit Rev Biochem Mol Biol 24:479–564
- Clore GM, Gronenborn AM (1991a) Applications of three- and four-dimensional heteronuclear NMR spectroscopy to protein structure determination. Progr Nucl Magn Reson Spectrosc 23:43–92
- Clore GM, Gronenborn AM (1991b) Two, three and four dimensional NMR methods for obtaining larger and more precise three-dimensional structures of proteins in solution. Ann Rev Biophys Biophys Chem 20:29–63
- Clore GM, Kuszewski J (2002) χ_{1} rotamer populations and angles of mobile surface side chains are accurately predicted by a torsion angle database potential of mean force. J Am Chem Soc 124:2866–2867
- Clore GM, Nilges M, Sukumaran DK, Brünger AT, Karplus M, Gronenborn AM (1986) The three-dimensional structure of α1-purothionin in solution: combined use of nuclear magnetic resonance, distance geometry and restrained molecular dynamics. EMBO J 5:2729–2735
- Clore GM, Gronenborn AM, Nilges M, Ryan CA (1987) The three-dimensional structure of potato carboxypeptidase inhibitor in solution: a study using nuclear magnetic resonance, distance geometry and restrained molecular dynamics. Biochemistry 26:8012–8023
- Cornilescu G, Delaglio F, Bax A (1999) Protein backbone angle restraints from searching a database for chemical shift and sequence homology. J Biomol NMR 13:289–302
- Delaglio F, Grzesiek S, Vuister GW, Zhu G, Pfeifer J, Bax A (1995) NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J Biomol NMR 6:277–293
- de Vlieg J, Boelens R, Scheek RM, Kaptein R, van Gunsteren WF (1986) Restrained molecular dynamics procedure for protein tertiary structure determination from NMR data: a lac repressor headpiece structure based on information on J-coupling and from presence and absence of NOEs. Isr J Chem 27:181–188
- Garrett DS, Powers R, Gronenborn AM, Clore GM (1991) A common sense approach to peak picking two-, three- and four-dimensional spectra using automatic computer analysis of contour diagrams. J Magn Reson 95:214–220
- Garrett DS, Seok Y-J, Liao DT, Peterkofsky A, Gronenborn AM, Clore GM (1997) Solution structure of the 30 kDa N-terminal domain of enzyme I of the *Escherichia coli* phosphoenolpyruvate:sugar phosphotransferase system by multidimensional NMR. Biochemistry 36:2517–2530
- Goto NK, Gardner KH, Mueller GA, Willis RC, Kay LE (1999) A robust and cost-effective method for the production of Val, Leu, Ile (δ1) methyl-protonated ^{15}N-, ^{13}C-, ^{2}H-labeled proteins. J Biomol NMR 13:369–374
- Grishaev A, Llinás M (2002) CLOUDS, a protocol for deriving a molecular proton density via NMR. Proc Natl Acad Sci USA 99:6707–6712
- Grishaev A, Wu J, Trewhella J, Bax A (2005) Refinement of multidomain structures by combination of solution small-angle X-ray scattering and NMR data. J Am Chem Soc 127:16621–16628
- Güntert P (2003) Automated NMR protein structure calculation. Prog Nucl Magn Reson Spectrosc 43:105–125
- Herrmann T, Güntert P, Wüthrich K (2002a) Protein NMR structure determination with automated NOE assignment using the new software CANDID and the torsion angle dynamics algorithm DYANA. J Mol Biol 319:209–227
- Herrmann T, Güntert P, Wüthrich K (2002b) Protein NMR structure determination with automated NOE-identification in the NOESY spectra using the new software ATNOS. J Biomol NMR 24:171–189
- Huang YJ, Tejero R, Powers R, Montelione GT (2006) A topology-constrained distance network algorithm for protein structure determination from NOESY data. Prot Struct Funct Bioinf 62:587–603
- Kuszewski J, Gronenborn AM, Clore GM (1996) Improving the quality of NMR and crystallographic protein structures by means of a conformational database potential derived from structure databases. Protein Sci 5:1067–1080
- Kuszewski J, Schwieters CD, Garrett DS, Byrd RA, Tjandra N, Clore GM (2004) Completely automated, highly error-tolerant macromolecular structure determination from multidimensional nuclear Overhauser enhancement spectra and chemical shift assignments. J Am Chem Soc 126:6258–6273
- McFeeters RL, Altieri AS, Cherry S, Tropea JE, Waugh DS, Byrd RA (2007) The high-precision solution structure of Yersinia modulating protein YmoA provides insight into interaction with H-NS. Biochemistry 46:13975–13982
- Nilges M (1993) A calculation strategy for the solution structure determination of symmetric dimers by ^{1}H-NMR. Proteins 17:297–309
- Nilges M, Gronenborn AM, Brünger AT, Clore GM (1988) Determination of three-dimensional structures of proteins by simulated annealing with interproton distance restraints: application to crambin, potato carboxypeptidase inhibitor and barley serine proteinase inhibitor 2. Protein Eng 2:27–38
- Nilges M, Macias MJ, O’Donoghue SI, Oschkinat H (1997) Automated NOESY interpretation with ambiguous distance restraints: the refined NMR solution structure of the Pleckstrin homology domain from β-spectrin. J Mol Biol 269:408–422
- Powers R, Garrett DS, March CJ, Frieden EA, Gronenborn AM, Clore GM (1993) The high-resolution, three-dimensional solution structure of human interleukin-4 determined by multidimensional heteronuclear magnetic resonance spectroscopy. Biochemistry 32:6744–6762
- Ramelot TA, Cort JR, Yee AA, Guido V, Lukin JA, Arrowsmith CH, Kennedy MA. To be published
- Rieping W, Habeck M, Bardiaux B, Bernard A, Malliavin TE, Nilges M (2007) ARIA2: automated NOE assignment and data integration in NMR structure calculation. Bioinformatics 23:381–382
- Schwieters CD, Clore GM (2001) Internal coordinates for molecular dynamics and minimization in structure determination and refinement. J Magn Reson 152:288–302
- Schwieters CD, Clore GM (2007) A physical picture of atomic motions within the Dickerson DNA dodecamer in solution derived from joint ensemble refinement against NMR and large angle X-ray scattering data. Biochemistry 46:1152–1166
- Schwieters CD, Kuszewski JJ, Tjandra N, Clore GM (2003) The Xplor-NIH NMR molecular structure determination package. J Magn Reson 160:66–74
- Schwieters CD, Kuszewski JJ, Clore GM (2006) Using Xplor-NIH for NMR molecular structure determination. Progr NMR Spectrosc 48:47–62
- Shen Y, Lange O, Delaglio F, Rossi P, Aramini JM, Liu G, Eletsky A, Wu Y, Singarapu KK, Lemak A, Ignatchenko A, Arrowsmith CH, Szyperski T, Montelione GT, Baker D, Bax A (2008) Consistent blind protein structure generation from NMR chemical shift data. Proc Natl Acad Sci USA 105:4685–4690
- Song J, Bettendorff L, Tonelli M, Markley JL (2008) Structural basis for the catalytic mechanism of mammalian 25 kDa thiamine triphosphatase. J Biol Chem 283:10939–10948
- Summers MF, South TL, Kim B, Hare DR (1990) High-resolution structure of an HIV zinc fingerlike domain via a new NMR-based distance geometry approach. Biochemistry 29:329–340
- Tang C, Iwahara J, Clore GM (2005) Accurate determination of leucine and valine side-chain conformations using U-[^{15}N/^{13}C/^{2}H]/[^{1}H-(methyl/methine)-Leu/Val] isotope labeling, NOE pattern recognition and methine CγHγ/CβHβ residual dipolar couplings: application to the 34 kDa enzyme IIA^{Chitobiose}. J Biomol NMR 33:105–121
- Theobald DL, Wuttke DS (2006a) Empirical Bayes hierarchical models for regularizing maximum likelihood estimation in the matrix Gaussian Procrustes problem. Proc Natl Acad Sci USA 103:18521–18527
- Theobald DL, Wuttke DS (2006b) THESEUS: Maximum likelihood superpositioning and analysis of macromolecular structures. Bioinformatics 22:2171–2172
- Tjandra N, Garrett DS, Gronenborn AM, Bax A, Clore GM (1997) Defining long range order in NMR structure determination from the dependence of heteronuclear relaxation times on rotational diffusion anisotropy. Nat Struct Biol 4:443–449
- Verlet L (1967) Computer “experiments” on classical fluids. I. Thermodynamical properties of Lennard-Jones molecules. Phys Rev 159:98–103
- Wilcox GR, Fogh RH, Norton RS (1993) Refined structure in solution of the sea anemone neurotoxin ShI. J Biol Chem 268:24707–24719
- Wlodawer A, Pavlovsky A, Gustchina A (1992) Crystal structure of human recombinant interleukin-4 at 2.25 Å resolution. FEBS Lett 309:59–64
- Yee A, Chang X, Pineda-Lucena A, Wu B, Semesi A, Le B, Ramelot T, Lee GM, Bhattacharyya S, Gutierrez P, Denisov A, Lee CH, Cort JR, Kozlov G, Liao J, Finak G, Chen L, Wishart D, Lee W, McIntosh LP, Gehring K, Kennedy MA, Edwards AM, Arrowsmith CH (2002) An NMR approach to structural proteomics. Proc Natl Acad Sci USA 99:1825–1830