A Markov Random Field Framework for Protein Side-Chain Resonance Assignment

  • Jianyang Zeng
  • Pei Zhou
  • Bruce Randall Donald
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6044)

Abstract

Nuclear magnetic resonance (NMR) spectroscopy plays a critical role in structural genomics and serves as a primary tool for determining protein structures, dynamics, and interactions in physiologically relevant solution conditions. The current speed of protein structure determination via NMR is limited by the lengthy time required for resonance assignment, which maps spectral peaks to specific atoms and residues in the primary sequence. Although numerous algorithms have been developed to address the backbone resonance assignment problem [68, 2, 10, 37, 14, 64, 1, 31, 60], little work has been done to automate side-chain resonance assignment [43, 48, 5]. Most previous attempts at assigning side-chain resonances depend on a set of NMR experiments that record through-bond interactions with side-chain protons for each residue. Unfortunately, these NMR experiments have low sensitivity and limited performance on large proteins, which makes it difficult to obtain enough side-chain resonance assignments. At the same time, obtaining almost all of the side-chain resonance assignments is a prerequisite for high-resolution structure determination. To overcome this deficiency, we present a novel side-chain resonance assignment algorithm based on alternative NMR experiments that measure through-space interactions between protons in the protein. These experiments also provide crucial distance restraints and are normally required in high-resolution structure determination. We cast the side-chain resonance assignment problem into a Markov Random Field (MRF) framework, and extend and apply combinatorial protein design algorithms to compute the optimal solution that best interprets the NMR data. Our MRF framework captures the contact map information of the protein derived from NMR spectra, and exploits the structural information available from the backbone conformations determined by orientational restraints and a set of discretized side-chain conformations (i.e., rotamers). A Hausdorff-based computation is employed in the scoring function to evaluate the probability that a set of side-chain resonance assignments generates the observed NMR spectra. The complexity of the assignment problem is first reduced by a dead-end elimination (DEE) algorithm, which prunes side-chain resonance assignments that are provably not part of the optimal solution. An A* search algorithm is then used to find a set of optimal side-chain resonance assignments that best fit the NMR data. We have tested our algorithm on NMR data for five proteins: the FF Domain 2 of human transcription elongation factor CA150 (FF2), the B1 domain of Protein G (GB1), human ubiquitin, the ubiquitin-binding zinc finger domain of the human Y-family DNA polymerase Eta (pol η UBZ), and the human Set2-Rpb1 interacting domain (hSRI). Our algorithm assigns resonances for more than 90% of the protons in the proteins, and achieves about 80% correct side-chain resonance assignments. The final structures computed using distance restraints resulting from the set of assigned side-chain resonances have backbone RMSD 0.5–1.4 Å and all-heavy-atom RMSD 1.0–2.2 Å from the reference structures that were determined by X-ray crystallography or traditional NMR approaches. These results demonstrate that our algorithm can be successfully applied to automate side-chain resonance assignment and high-quality protein structure determination. Since our algorithm does not require any specific NMR experiments for measuring the through-bond interactions with side-chain protons, it can save a significant amount of both experimental cost and spectrometer time, and hence accelerate the NMR structure determination process.
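The prune-then-search pipeline described in the abstract can be illustrated with a minimal Python sketch on a toy pairwise MRF. Everything in it is an assumption made for illustration: the energies in `E1` and `E2` are invented numbers (the paper's Hausdorff-based scoring function is not reproduced), the pruning rule is the Goldstein form of DEE [19], which may differ from the criterion the authors actually use, and the A* heuristic follows the general style of side-chain placement searches [39]. The helper names `pair`, `total_energy`, `goldstein_dee`, and `astar` are hypothetical.

```python
import heapq

def pair(E2, i, a, j, b):
    """Pairwise energy of label a at node i with label b at node j (0 if no edge)."""
    if (i, j) in E2:
        return E2[(i, j)][a][b]
    if (j, i) in E2:
        return E2[(j, i)][b][a]
    return 0.0

def total_energy(E1, E2, x):
    """Energy of a complete labelling x: unary terms plus all pairwise terms."""
    n = len(E1)
    unary = sum(E1[i][x[i]] for i in range(n))
    pairwise = sum(pair(E2, i, x[i], j, x[j]) for i in range(n) for j in range(i + 1, n))
    return unary + pairwise

def goldstein_dee(E1, E2):
    """Repeatedly prune label a at node i when some rival b is better in every context."""
    n = len(E1)
    alive = [set(range(len(E1[i]))) for i in range(n)]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for a in list(alive[i]):
                for b in alive[i] - {a}:
                    gap = E1[i][a] - E1[i][b] + sum(
                        min(pair(E2, i, a, j, c) - pair(E2, i, b, j, c) for c in alive[j])
                        for j in range(n) if j != i)
                    if gap > 0:  # a can never beat b, so a is not in any optimum
                        alive[i].discard(a)
                        changed = True
                        break
    return alive

def astar(E1, E2, alive):
    """A* over partial labellings (nodes assigned in index order) with a lower-bound heuristic."""
    n = len(E1)

    def g(x):  # exact energy of the nodes assigned so far
        k = len(x)
        return (sum(E1[i][x[i]] for i in range(k))
                + sum(pair(E2, i, x[i], j, x[j]) for i in range(k) for j in range(i + 1, k)))

    def h(x):  # optimistic bound on the energy still to be added
        k = len(x)
        return sum(
            min(E1[j][c]
                + sum(pair(E2, i, x[i], j, c) for i in range(k))
                + sum(min(pair(E2, j, c, m, d) for d in alive[m]) for m in range(j + 1, n))
                for c in alive[j])
            for j in range(k, n))

    heap = [(h(()), ())]
    while heap:
        f, x = heapq.heappop(heap)
        if len(x) == n:  # first complete labelling popped is optimal
            return list(x), f
        for c in sorted(alive[len(x)]):
            y = x + (c,)
            heapq.heappush(heap, (g(y) + h(y), y))
    return None, float("inf")

# Toy problem: 3 nodes, 3 candidate labels each, edges (0,1) and (1,2). Values are invented.
E1 = [[0.0, 1.0, 5.0], [2.0, 0.5, 4.0], [1.0, 1.5, 0.2]]
E2 = {
    (0, 1): [[1.0, 0.0, 2.0], [0.5, 2.0, 1.0], [3.0, 3.0, 3.0]],
    (1, 2): [[0.0, 1.0, 2.0], [1.0, 0.0, 0.3], [2.0, 2.0, 2.0]],
}

alive = goldstein_dee(E1, E2)
best, energy = astar(E1, E2, alive)
print("labels kept after DEE:", alive)
print("lowest-energy labelling:", best, "energy:", energy)
```

Pruning uses a strict inequality, so a label that ties with its rival is kept and at least one optimal labelling always survives. The search then only needs to expand the labels that remain, which is the reason for running DEE before A*.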

Keywords

  • Nuclear Magnetic Resonance
  • Markov Random Field
  • Resonance Assignment
  • Nuclear Magnetic Resonance Data
  • Nuclear Overhauser Effect
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Bailey-Kellogg, C., Chainraj, S., Pandurangan, G.: A Random Graph Approach to NMR Sequential Assignment. Journal of Computational Biology 12(6), 569–583 (2005)
  2. Bailey-Kellogg, C., Widge, A., Kelley, J.J., Berardi, M.J., Bushweller, J.H., Donald, B.R.: The NOESY jigsaw: automated protein secondary structure and main-chain assignment from sparse, unassigned NMR data. Journal of Computational Biology 7(3-4), 537–558 (2000)
  3. Baker, D., Sali, A.: Protein structure prediction and structural genomics. Science 294, 93–96 (2001)
  4. Ball, G., Meenan, N., Bromek, K., Smith, B.O., Bella, J., Uhrín, D.: Measurement of one-bond 13Cα-1Hα residual dipolar coupling constants in proteins by selective manipulation of CαHα spins. Journal of Magnetic Resonance 180, 127–136 (2006)
  5. Baran, M.C., Huang, Y.J., Moseley, H.N., Montelione, G.T.: Automated analysis of protein NMR assignments and structures. Chem. Rev. 104, 3456–3541 (2004)
  6. Besag, J.: Spatial interaction and the statistical analysis of lattice systems. J. Royal Stat. Soc. B 36 (1974)
  7. Bomar, M.G., Pai, M., Tzeng, S., Li, S., Zhou, P.: Structure of the ubiquitin-binding zinc finger domain of human DNA Y-polymerase η. EMBO reports 8, 247–251 (2007)
  8. Boykov, Y., Veksler, O., Zabih, R.: Markov random fields with efficient approximations. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, p. 648 (1998)
  9. Chen, C.Y., Georgiev, I., Anderson, A.C., Donald, B.R.: Computational structure-based redesign of enzyme activity. Proc. Natl. Acad. Sci. USA 106, 3764–3769 (2009)
  10. Coggins, B.E., Zhou, P.: PACES: Protein sequential assignment by computer-assisted exhaustive search. Journal of Biomolecular NMR 26, 93–111 (2003)
  11. Cornilescu, G., Marquardt, J.L., Ottiger, M., Bax, A.: Validation of Protein Structure from Anisotropic Carbonyl Chemical Shifts in a Dilute Liquid Crystalline Phase. Journal of the American Chemical Society 120, 6836–6837 (1998)
  12. Desmet, J., Maeyer, M.D., Hazes, B., Lasters, I.: The dead-end elimination theorem and its use in protein side-chain positioning. Nature 356, 539–542 (1992)
  13. Donald, B.R., Martin, J.: Automated NMR assignment and protein structure determination using sparse dipolar coupling constraints. Progress in NMR Spectroscopy 55, 101–127 (2009)
  14. Eghbalnia, H.R., Bahrami, A., Wang, L.Y., Assadi, A., Markley, J.L.: Probabilistic identification of spin systems and their assignments including coil-helix inference as output (PISTACHIO). J. Biomol. NMR 32, 219–233 (2005)
  15. Fiorito, F., Herrmann, T., Damberger, F.F., Wüthrich, K.: Automated amino acid side-chain NMR assignment of proteins using (13)C- and (15)N-resolved 3D [(1)H, (1)H]-NOESY. J. Biomol. NMR 42, 23–33 (2008)
  16. Fiorito, F., Hiller, S., Wider, G., Wüthrich, K.: Automated resonance assignment of proteins: 6D APSY-NMR. J. Biomol. NMR 35, 27–37 (2006)
  17. Fowler, C.A., Tian, F., Al-Hashimi, H.M., Prestegard, J.H.: Rapid determination of protein folds using residual dipolar couplings. Journal of Molecular Biology 304, 447–460 (2000)
  18. Georgiev, I., Lilien, R.H., Donald, B.R.: The minimized dead-end elimination criterion and its application to protein redesign in a hybrid scoring and search algorithm for computing partition functions over molecular ensembles. Journal of Computational Chemistry 29, 1527–1542 (2008)
  19. Goldstein, R.F.: Efficient rotamer elimination applied to protein side-chains and related spin glasses. Biophysical Journal 66, 1335–1340 (1994)
  20. Grishaev, A., Llinás, M.: CLOUDS, a protocol for deriving a molecular proton density via NMR. Proc. Natl. Acad. Sci. USA 99, 6707–6712 (2002)
  21. Grishaev, A., Llinás, M.: Protein structure elucidation from NMR proton densities. Proc. Natl. Acad. Sci. USA 99, 6713–6718 (2002)
  22. Güntert, P.: Automated NMR Protein Structure Determination. Progress in Nuclear Magnetic Resonance Spectroscopy 43, 105–125 (2003)
  23. Güntert, P.: Automated NMR protein structure calculation with CYANA. Meth. Mol. Biol. 278, 353–378 (2004)
  24. Herrmann, T., Güntert, P., Wüthrich, K.: Protein NMR Structure Determination with Automated NOE Assignment Using the New Software CANDID and the Torsion Angle Dynamics Algorithm DYANA. Journal of Molecular Biology 319(1), 209–227 (2002)
  25. Hiller, S., Joss, R., Wider, G.: Automated NMR assignment of protein side chain resonances using automated projection spectroscopy (APSY). J. Am. Chem. Soc. 130(36), 12073–12079 (2008)
  26. Huang, Y.J., Tejero, R., Powers, R., Montelione, G.T.: A topology-constrained distance network algorithm for protein structure determination from NOESY data. Proteins: Structure Function and Bioinformatics 62(3), 587–603 (2006)
  27. Huttenlocher, D.P., Jaquith, E.W.: Computing visual correspondence: Incorporating the probability of a false match. In: Proceedings of the Fifth International Conference on Computer Vision (ICCV 1995), pp. 515–522 (1995)
  28. Huttenlocher, D.P., Kedem, K.: Distance Metrics for Comparing Shapes in the Plane. In: Donald, B.R., Kapur, D., Mundy, J. (eds.) Symbolic and Numerical Computation for Artificial Intelligence, pp. 201–219. Academic Press, London (1992)
  29. Huttenlocher, D.P., Klanderman, G.A., Rucklidge, W.: Comparing Images Using the Hausdorff Distance. IEEE Trans. Pattern Anal. Mach. Intell. 15(9), 850–863 (1993)
  30. Kuszewski, J., Gronenborn, A.M., Clore, G.M.: Improving the Packing and Accuracy of NMR Structures with a Pseudopotential for the Radius of Gyration. Journal of the American Chemical Society 121, 2337–2338 (1999)
  31. Kamisetty, H., Bailey-Kellogg, C., Pandurangan, G.: An efficient randomized algorithm for contact-based NMR backbone resonance assignment. Bioinformatics 22(2), 172–180 (2006)
  32. Kamisetty, H., Xing, E.P., Langmead, C.J.: Free Energy Estimates of All-atom Protein Structures Using Generalized Belief Propagation. Journal of Computational Biology 15, 755–766 (2008)
  33. Kindermann, R., Snell, J.L.: Markov Random Fields and Their Applications. American Mathematical Society, Providence (1980)
  34. Kuszewski, J., Schwieters, C.D., Garrett, D.S., Byrd, R.A., Tjandra, N., Clore, G.M.: Completely automated, highly error-tolerant macromolecular structure determination from multidimensional nuclear overhauser enhancement spectra and chemical shift assignments. J. Am. Chem. Soc. 126(20), 6258–6273 (2004)
  35. Langmead, C.J., Donald, B.R.: 3D structural homology detection via unassigned residual dipolar couplings. In: Proceedings of 2003 IEEE Comput. Syst. Bioinform. Conf., pp. 209–217 (2003)
  36. Langmead, C.J., Donald, B.R.: High-throughput 3D structural homology detection via NMR resonance assignment. In: Proceedings of 2004 IEEE Comput. Syst. Bioinform. Conf., pp. 278–289 (2004)
  37. Langmead, C.J., Yan, A.K., Lilien, R.H., Wang, L., Donald, B.R.: A polynomial-time nuclear vector replacement algorithm for automated NMR resonance assignments. In: Proceedings of the seventh annual international conference on Research in computational molecular biology, pp. 176–187 (2003)
  38. Langmead, C.J., Donald, B.R.: An expectation/maximization nuclear vector replacement algorithm for automated NMR resonance assignments. J. Biomol. NMR 29(2), 111–138 (2004)
  39. Leach, A.R., Lemon, A.P.: Exploring the conformational space of protein side chains using dead-end elimination and the A* algorithm. Proteins 33(2), 227–239 (1998)
  40. Li, K.B., Sanctuary, B.C.: Automated extracting of amino acid spin systems in proteins using 3D HCCH-COSY/TOCSY spectroscopy and constrained partitioning algorithm (CPA). J. Chem. Inf. Comput. Sci. 36, 585–593 (1996)
  41. Li, K.B., Sanctuary, B.C.: Automated resonance assignment of proteins using heteronuclear 3D NMR. 2. Side chain and sequence-specific assignment. J. Chem. Inf. Comput. Sci. 37, 467–477 (1997)
  42. Li, M., Phatnani, H.P., Guan, Z., Sage, H., Greenleaf, A.L., Zhou, P.: Solution structure of the Set2-Rpb1 interacting domain of human Set2 and its interaction with the hyperphosphorylated C-terminal domain of Rpb1. Proceedings of the National Academy of Sciences 102, 17636–17641 (2005)
  43. Lin, Y., Wagner, G.: Efficient side-chain and backbone assignment in large proteins: Application to tGCN5. J. Biomol. NMR 15, 227–239 (1999)
  44. Linge, J.P., Habeck, M., Rieping, W., Nilges, M.: ARIA: Automated NOE assignment and NMR structure calculation. Bioinformatics 19(2), 315–316 (2003)
  45. Looger, L.L., Hellinga, H.W.: Generalized dead-end elimination algorithms make large-scale protein side-chain structure prediction tractable: implications for protein design and structural genomics. J. Mol. Biol. 307(1), 429–445 (2001)
  46. Marin, A., Malliavin, T.E., Nicolas, P., Delsuc, M.A.: From NMR chemical shifts to amino acid types: investigation of the predictive power carried by nuclei. Journal of Biomolecular NMR 30, 47 (2004)
  47. Masse, J.E., Keller, R., Pervushin, K.: SideLink: automated side-chain assignment of biopolymers from NMR data by relative-hypothesis-prioritization-based simulated logic. Journal of Magnetic Resonance 181(1), 45–67 (2006)
  48. Montelione, G.T., Moseley, H.N.B.: Automated analysis of NMR assignments and structures for proteins. Curr. Opin. Struct. Biol. 9, 635–642 (1999)
  49. Mumenthaler, C., Güntert, P., Braun, W., Wüthrich, K.: Automated combined assignment of NOESY spectra and three-dimensional protein structure determination. Journal of Biomolecular NMR 10(4), 351–362 (1997)
  50. Ottiger, M., Delaglio, F., Bax, A.: Measurement of J and dipolar couplings from simplified two-dimensional NMR spectra. Journal of Magnetic Resonance 138, 373–378 (1998)
  51. Prestegard, J.H., Bougault, C.M., Kishore, A.I.: Residual Dipolar Couplings in Structure Determination of Biomolecules. Chemical Reviews 104, 3519–3540 (2004)
  52. Rieping, W., Habeck, M., Nilges, M.: Inferential Structure Determination. Science 309, 303–306 (2005)
  53. Ruan, K., Briggman, K.B., Tolman, J.R.: De novo determination of internuclear vector orientations from residual dipolar couplings measured in three independent alignment media. Journal of Biomolecular NMR 41, 61–76 (2008)
  54. Russell, S., Norvig, P.: Artificial Intelligence: A Modern Approach. Prentice Hall, Englewood Cliffs (2002)
  55. Schwieters, C.D., Kuszewski, J.J., Tjandra, N., Clore, G.M.: The Xplor-NIH NMR molecular structure determination package. J. Magn. Reson. 160, 65–73 (2003)
  56. Sun, X., Druzdzel, M.J., Yuan, C.: Dynamic Weighting A* Search-Based MAP Algorithm for Bayesian Networks. In: Proceedings of the 20th International Joint Conference on Artificial Intelligence, pp. 2385–2390 (2007)
  57. Tjandra, N., Bax, A.: Direct measurement of distances and angles in biomolecules by NMR in a dilute liquid crystalline medium. Science 278, 1111–1114 (1997)
  58. Tolman, J.R., Flanagan, J.M., Kennedy, M.A., Prestegard, J.H.: Nuclear magnetic dipole interactions in field-oriented proteins: Information for structure determination in solution. Proc. Natl. Acad. Sci. USA 92, 9279–9283 (1995)
  59. Ulrich, E.L., Akutsu, H., Doreleijers, J.F., Harano, Y., Ioannidis, Y.E., Lin, J., Livny, M., Mading, S., Maziuk, D., Miller, Z., Nakatani, E., Schulte, C.F., Tolmie, D.E., Wenger, R.K., Yao, H., Markley, J.L.: BioMagResBank. Nucleic Acids Research 36, D402–D408 (2007)
  60. Vitek, O., Bailey-Kellogg, C., Craig, B., Vitek, J.: Inferential backbone assignment for sparse data. J. Biomolecular NMR 35, 187–208 (2006)
  61. Wang, L., Donald, B.R.: Exact solutions for internuclear vectors and backbone dihedral angles from NH residual dipolar couplings in two media, and their application in a systematic search algorithm for determining protein backbone structure. Jour. Biomolecular NMR 29(3), 223–242 (2004)
  62. Wang, L., Mettu, R., Donald, B.R.: A Polynomial-Time Algorithm for De Novo Protein Backbone Structure Determination from NMR Data. Journal of Computational Biology 13(7), 1276–1288 (2006)
  63. Wei, Z., Li, H.: A Markov random field model for network-based analysis of genomic data. Bioinformatics 23, 1537–1544 (2007)
  64. Wu, K.-P., Chang, J.-M., Chen, J.-B., Chang, C.-F., Wu, W.-J., Huang, T.-H., Sung, T.-Y., Hsu, W.-L.: RIBRA-an Error-Tolerant Algorithm for the NMR Backbone Assignment Problem. In: Proceedings of the International conference on Research in Computational Molecular Biology (RECOMB 2005), pp. 229–244 (2005)
  65. Xu, Y., Xu, D., Uberbacher, E.C.: An efficient computational method for globally optimal threading. J. Comput. Biol. 5(3), 597–614 (1998)
  66. Zeng, J., Boyles, J., Tripathy, C., Wang, L., Yan, A., Zhou, P., Donald, B.R.: High-Resolution Protein Structure Determination Starting with a Global Fold Calculated from Exact Solutions to the RDC Equations. Journal of Biomolecular NMR 45, 265–281 (2009)
  67. Zeng, J., Zhou, P., Donald, B.R.: A Markov Random Field Framework for Protein Side-Chain Resonance Assignment – Supplementary Material. Department of Computer Science, Duke University (January 2010), http://www.cs.duke.edu/donaldlab/Supplementary/recomb10/
  68. Zimmerman, D.E., Kulikowski, C.A., Feng, W., Tashiro, M., Chien, C.-Y., Ríos, C.B., Moy, F.J., Powers, R., Montelione, G.T.: Automated analysis of protein NMR assignments using methods from artificial intelligence. J. Mol. Biol. 269, 592–610 (1997)

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Jianyang Zeng (1)
  • Pei Zhou (2)
  • Bruce Randall Donald (1, 2)
  1. Department of Computer Science, Duke University, Durham, USA
  2. Department of Biochemistry, Duke University Medical Center, Durham, USA
