Abstract
One bottleneck in NMR structure determination is the laborious and time-consuming process of side-chain resonance and NOE assignment. Compared with the well-studied backbone resonance assignment problem, automated side-chain resonance and NOE assignment has been less explored. Most NOE assignment algorithms require nearly complete side-chain resonance assignments from a series of through-bond experiments such as HCCH-TOCSY or HCCCONH. Unfortunately, these TOCSY experiments perform poorly on large proteins. To overcome this deficiency, we present a novel algorithm, called Nasca (NOE Assignment and Side-Chain Assignment), that automates both side-chain resonance and NOE assignments and performs high-resolution protein structure determination without any explicit through-bond experiment, such as HCCH-TOCSY, to facilitate side-chain resonance assignment. After casting the assignment problem into a Markov Random Field (MRF), Nasca extends and applies combinatorial protein design algorithms to compute optimal assignments that best interpret the NMR data. The MRF captures the contact map information of the protein derived from NOESY spectra, exploits the backbone structural information determined by RDCs, and considers all possible side-chain rotamers. The complexity of the combinatorial search is reduced by a dead-end elimination (DEE) algorithm, which prunes side-chain resonance assignments that are provably not part of the optimal solution. An A* search algorithm is then employed to find a set of optimal side-chain resonance assignments that best fit the NMR data. These side-chain resonance assignments are then used to resolve the NOE assignment ambiguity and compute high-resolution protein structures. Tests on five proteins show that Nasca assigns resonances for more than 90% of side-chain protons and achieves about 80% correct assignments. The final structures computed using the NOE distance restraints assigned by Nasca have a backbone RMSD of 0.8–1.5 Å from the reference structures determined by traditional NMR approaches.
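The DEE-then-A* search described in the abstract can be illustrated with a minimal sketch on a toy pairwise MRF. Everything here is an assumption made for illustration: the energy tables `E1` and `E2`, the three sites and their candidate labels, the function names, the choice of a Goldstein-style pruning criterion, and the A* lower bound. It is not Nasca's scoring function or implementation.

```python
import heapq
import itertools

def pair(E2, i, a, j, b):
    """Pairwise term between label a at site i and label b at site j (stored for i < j)."""
    return E2[(i, j)][a][b] if i < j else E2[(j, i)][b][a]

def total_energy(E1, E2, assign):
    """Sum of singleton and pairwise terms for a complete assignment."""
    n = len(assign)
    e = sum(E1[i][assign[i]] for i in range(n))
    return e + sum(pair(E2, i, assign[i], j, assign[j])
                   for i in range(n) for j in range(i + 1, n))

def dee_prune(E1, E2, cands):
    """Goldstein-style DEE: drop label a at site i if some other label b is
    strictly better no matter what the other sites choose. Mutates cands."""
    n, changed = len(cands), True
    while changed:
        changed = False
        for i in range(n):
            for a in list(cands[i]):
                for b in cands[i]:
                    if b == a:
                        continue
                    gap = E1[i][a] - E1[i][b] + sum(
                        min(pair(E2, i, a, j, c) - pair(E2, i, b, j, c)
                            for c in cands[j])
                        for j in range(n) if j != i)
                    if gap > 0:  # a can never be part of an optimal assignment
                        cands[i].remove(a)
                        changed = True
                        break
    return cands

def astar(E1, E2, cands):
    """A* over partial assignments (sites labelled in order 0..n-1)."""
    n = len(cands)

    def bound(assign):
        # Admissible lower bound on the energy still to be added.
        k = len(assign)
        return sum(
            min(E1[j][c]
                + sum(pair(E2, i, assign[i], j, c) for i in range(k))
                + sum(min(pair(E2, j, c, m, d) for d in cands[m])
                      for m in range(j + 1, n))
                for c in cands[j])
            for j in range(k, n))

    heap = [(bound(()), 0.0, ())]  # entries are (f = g + h, g, partial assignment)
    while heap:
        _, g, assign = heapq.heappop(heap)
        k = len(assign)
        if k == n:
            return g, assign
        for c in cands[k]:
            g2 = g + E1[k][c] + sum(pair(E2, i, assign[i], k, c) for i in range(k))
            nxt = assign + (c,)
            heapq.heappush(heap, (g2 + bound(nxt), g2, nxt))

def brute_force(E1, E2, cands):
    """Exhaustive reference search, feasible only for toy problems."""
    return min((total_energy(E1, E2, a), a) for a in itertools.product(*cands))

# Hypothetical toy problem: 3 sites with 3, 2 and 2 candidate labels.
cands = [[0, 1, 2], [0, 1], [0, 1]]
E1 = [[0.0, 1.0, 5.0], [0.5, 0.0], [0.2, 0.3]]
E2 = {
    (0, 1): [[0.0, 0.8], [0.6, 0.1], [0.3, 0.3]],
    (0, 2): [[0.4, 0.0], [0.0, 0.9], [0.2, 0.2]],
    (1, 2): [[0.7, 0.1], [0.0, 0.5]],
}

pruned = dee_prune(E1, E2, [list(c) for c in cands])
best_energy, best_assign = astar(E1, E2, pruned)
print(pruned, best_energy, best_assign)
```

Because pruning removes only labels that are strictly dominated, and the A* bound never overestimates the remaining energy, the assignment returned on the pruned candidates has the same energy as an exhaustive search over the original candidates; `brute_force` is included only to check this on small inputs.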
Abbreviations
- NMR: Nuclear magnetic resonance
- ppm: Parts per million
- RMSD: Root mean square deviation
- HSQC: Heteronuclear single quantum coherence spectroscopy
- NOE: Nuclear Overhauser effect
- NOESY: Nuclear Overhauser and exchange spectroscopy
- TOCSY: Total correlation spectroscopy
- TROSY: Transverse relaxation-optimized spectroscopy
- RDC: Residual dipolar coupling
- PDB: Protein Data Bank
- BMRB: Biological Magnetic Resonance Bank
- pol η UBZ: Ubiquitin-binding zinc finger domain of the human Y-family DNA polymerase Eta
- hSRI: Human Set2-Rpb1 interacting domain
- FF2: FF Domain 2 of human transcription elongation factor CA150
- GB1: B1 domain of Protein G
- CH: Cα−Hα
- SSE: Secondary structure element
- \(\hbox{C}^{\prime}\): Carbonyl carbon
- MRF: Markov Random Field
- DEE: Dead-end elimination
- GMEC: Global minimum energy conformation
- SA: Simulated annealing
- MD: Molecular dynamics
- \({\mathbb{R}}^3\): 3-Dimensional Euclidean space
Acknowledgements
We thank Dr. C. Bailey-Kellogg, Dr. M.S. Apaydin and Mr. J. Martin for reading our draft and providing valuable comments. We thank all members of the Donald and Zhou Labs for helpful discussions and comments. We are grateful to Ms. M. Bomar for helping us with the pol η UBZ NMR data. We thank Dr. J. Liu for helping us check the side-chain resonance assignments of FF2. We thank the anonymous reviewers for their helpful comments and suggestions. This work is supported by the following grants from the National Institutes of Health: R01 GM-65982 and R01 GM-78031 to B.R.D. and R01 GM-079376 to P.Z.
Cite this article
Zeng, J., Zhou, P. & Donald, B.R. Protein side-chain resonance assignment and NOE assignment using RDC-defined backbones without TOCSY data. J Biomol NMR 50, 371–395 (2011). https://doi.org/10.1007/s10858-011-9522-4
DOI: https://doi.org/10.1007/s10858-011-9522-4