Unique opportunities for NMR methods in structural genomics
This Perspective, arising from a workshop held in July 2008 in Buffalo, NY, provides an overview of the role NMR has played in the United States Protein Structure Initiative (PSI), and a vision of how NMR will contribute to the forthcoming PSI-Biology program. NMR has contributed in key ways to structure production by the PSI, and new methods have been developed that are impacting the broader protein NMR community.
Keywords: Future of structural genomics; Functional genomics; NMR; Crystallography; NMR methods; Protein Structure Initiative (PSI)
The mission of structural coverage of most protein domain families, pioneered in PSI phases 1 and 2, is well on its way to completion. NMR has played an integral role in this endeavor [35, 43]. The goal of structural coverage at a sequence identity level of ~30% for most protein domains in nature will represent a monumental achievement for humankind, contributing in many ways toward our understanding of the relationships between protein sequence, structure, and function. As we ponder the future contributions of structural genomics (SG) for biomedical research, we envision many future opportunities beyond structure production that have been created by these high-throughput structural biology platforms.
In the coming years, target selection strategies likely will go beyond the current sparse sampling of representative members of protein families to strategies aimed at providing extensive structural coverage of functional biological systems at high resolution. These systems could include (i) signaling networks and metabolic pathways, (ii) proteomes of medically important species, particularly humans, (iii) human disease-related proteins including infectious diseases, (iv) the human and environmental microbiomes (‘metagenomics’), and (v) comparative analysis of structure, dynamics, and biochemical function across protein families. The application of SG platforms to one or more of these biological systems would leverage NIH’s investment in SG pipelines to further our understanding of fundamental mechanisms of protein function, molecular evolution, biological processes, and human disease at a reduced cost. Alternatively, SG centers could be redefined to focus on increasing the range and types of structures that presently cannot be routinely determined or modeled; for example, membrane proteins, higher order protein complexes, and eukaryotic proteins with extensive natively disordered regions and/or posttranslational modifications.
In considering future efforts, we note that the purified proteins themselves are among the most valuable products of SG efforts. The largest expense in SG is the preparation of pure, soluble protein. Much more could be done with these proteins, particularly the large fraction that does not readily yield structures. Given that all proteins carry out their biochemical function through their interactions with other molecules, we propose that the full realization of the potential of SG platforms must integrate studies of functionally relevant interacting molecules for each protein target. Therefore, we envision that a key element of future SG projects or platforms would be a systematic attempt to integrate experimental protein binding and/or biochemical information with structural data. Examples of such strategies, which would include high-throughput (HTP) biochemical characterization of proteins, are (i) screening of ligand binding coupled with 3D structure analysis of functional protein-ligand complexes (see, for example, [23, 37]), (ii) screening or characterization of enzymatic activity coupled with 3D structures of relevant protein substrate/cofactor/inhibitor complexes, and (iii) identification of protein-protein interaction partners coupled with 3D structures of relevant multiprotein complexes. A particularly powerful application of such integrated SG/functional studies would be the systematic and comprehensive characterization of proteins with related, but distinct, binding profiles, so as to understand the structural basis of their ligand (or substrate) binding specificity. Here we define “ligand” as any small molecule or macromolecule that interacts functionally with a protein. By adopting this approach, SG would have stronger synergy with functional genomics activities, and better integration with systems biology.
These studies would also identify complexes that stabilize protein structures, and enable structures to be determined for otherwise refractory proteins.
NMR spectroscopy has a unique and valuable role in SG
During the course of PSI phases 1 and 2, we have shown that NMR is a highly complementary approach to X-ray crystallography for protein structure determination [32, 44]. Many proteins that provide good NMR spectra have not been successfully crystallized. In particular, in contrast to X-ray crystallography, NMR is about equally successful for prokaryotic and eukaryotic proteins. Therefore, comprehensive structural coverage of any protein system involving small to medium sized proteins would benefit from an NMR component.
NMR data provide the basis for extending the static structural view of proteins, through the rapid identification of natively unfolded proteins and residue-specific characterization of disordered protein segments, including functionally important flexible surface loops. NMR is also an essential tool for characterizing alternative conformations and allosteric states. In some cases, the minor conformational states that can only be characterized by NMR studies are critically important for biological function. NMR can also be used to measure the rates of transitions between these conformational states. As such, future SG efforts seeking to understand the evolution of structural, functional, and dynamic diversity across a protein family will require NMR studies to provide dynamic information.
NMR is also a powerful method for screening of functional protein-ligand, protein-protein, and protein-nucleic acid interactions. While other biophysical techniques are also capable of identifying such interactions, NMR is uniquely able to identify even transient, but functionally important, interactions. The protein samples, and most of the instrumentation and techniques required for rapid NMR screening studies, are the same as those already used in PSI NMR structure determination pipelines, allowing easy integration of functional screening techniques. NMR methods are also valuable for validating initial ‘hits’ identified in HTP screening. It is important to recognize that the use of NMR as an HTP screening tool is not limited by protein size, since one may monitor either the protein or the ligand to detect the interaction.
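As an illustration of protein-observed screening, one common way to flag a binding event is to compare backbone amide peak positions in spectra recorded with and without ligand. The following is a minimal sketch of that idea, not a description of any PSI pipeline: the function names, the 15N weighting factor of 0.14, the 0.05 ppm threshold, and the peak positions are all assumptions chosen for illustration, and the weighting conventions used in practice vary between laboratories.

```python
import math


def combined_csp(d_h_ppm, d_n_ppm, n_weight=0.14):
    """Combined 1H/15N chemical shift perturbation, in ppm.

    n_weight scales the 15N difference onto the 1H scale; 0.14 is an
    assumed, commonly quoted value, not one taken from the article.
    """
    return math.sqrt(d_h_ppm ** 2 + (n_weight * d_n_ppm) ** 2)


def perturbed_residues(free, bound, threshold_ppm=0.05):
    """Return {residue: perturbation} for residues above the threshold.

    free and bound map residue number -> (1H shift, 15N shift) in ppm.
    Residues observed in only one of the two spectra are skipped.
    """
    hits = {}
    for res in sorted(free.keys() & bound.keys()):
        d_h = bound[res][0] - free[res][0]
        d_n = bound[res][1] - free[res][1]
        csp = combined_csp(d_h, d_n)
        if csp > threshold_ppm:
            hits[res] = csp
    return hits


# Hypothetical peak positions for three residues (illustrative only).
free = {10: (8.20, 120.0), 11: (7.95, 118.5), 12: (8.60, 125.1)}
bound = {10: (8.21, 120.1), 11: (8.10, 119.5), 12: (8.60, 125.1)}

hits = perturbed_residues(free, bound)
for res, csp in hits.items():
    print(f"residue {res}: combined perturbation {csp:.3f} ppm")
```

In this toy input only residue 11 moves by more than the assumed threshold. A real screen would also need to handle peaks that broaden or disappear on binding, which a simple position comparison such as this one does not capture.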
Finally, NMR data are used to generate new functional hypotheses, and to confirm functional annotations, interactions, or biochemical reaction rates revealed in other ‘omics’ projects (e.g., functional genomics, transcriptomics, or metabonomics). Hence, we envision that NMR will play a key role in connecting SG with these ‘omics’ approaches, thereby better integrating SG into systems biology.
Accomplishments of NMR SG groups during PSI
NESG, CESG, and JCSG have also developed new methodology for lowering the costs per NMR structure, including (i) protocols for HTP preparation of 13C/15N- and 13C/15N/2H-enriched samples using novel eukaryotic wheat-germ based cell-free expression systems [39, 40] and bacterial single protein production (SPP) systems [29, 33, 34], (ii) HTP NMR screening platforms using microprobe robotics for buffer and construct optimization, (iii) GFT NMR [2, 3, 19, 20, 36], and related HIFI and APSY [13, 14, 15] NMR experiments for reducing NMR measurement times by more than an order of magnitude, (iv) software for semi-automated data analysis and structure calculations [4, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 21, 25, 26, 41, 46], (v) software and protocols for structure validation and refinement based on residual dipolar couplings (RDCs) and chemical shifts [31, 38, 42], and (vi) software and servers for comprehensive structure quality assessment [5, 17] and refinement. These methods have reduced the average time required per structure to 2–3 weeks for small to medium sized proteins; in favorable cases, NMR structures are determined in only a few days. Although not in the original charge to the PSI NMR groups, recent efforts in technology development have focused on addressing larger proteins, oligomeric structures, and protein-protein complexes. For example, the NYCOMPS and CSMP have made significant advances in developing new methods for sample preparation and NMR analysis of membrane protein structures [27, 45].
A promising future for NMR contributions to SG and the larger biomedical community
NMR’s role in structural biology is still rapidly evolving. Unlike X-ray crystallography, which has matured to a state in which almost all aspects can be highly automated, NMR is still approaching this goal. We are very optimistic that over the next decade NMR will continue to make gains analogous to those seen for crystallography over the past few decades. For example, recent advances demonstrate that sparse constraints, such as chemical shift, residual dipolar coupling data, and/or small numbers of long-range distance constraints, can be combined with conformational energy calculations to provide good quality protein structures. These emerging technologies will expand the range of proteins that can be addressed at high resolution by NMR, as well as the speed with which this can be done. The new avenues of biological research opened by SG platforms will be tremendously enhanced by these NMR technologies. Clearly, NMR approaches offer tremendous opportunities for SG projects, and will be required in order to extract the greatest knowledge and understanding of whichever biological systems are targeted in the next phase of SG research.
We thank B. Mao and J. Everett for their assistance in statistical analyses. This work was supported by National Institutes of Health Grants U54-GM074958 (Northeast Structural Genomics Consortium), U54-GM75026 (New York Consortium on Membrane Protein Structure), U54-GM074901 (Center for Eukaryotic Structural Genomics), and P41-RR02301 (National Magnetic Resonance Facility at Madison).
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
- 7. Davis IW, Leaver-Fay A, Chen VB, Block JN, Kapral GJ, Wang X, Murray LW, Arendall WB 3rd, Snoeyink J, Richardson JS, Richardson DC (2007) MolProbity: all-atom contacts and structure validation for proteins and nucleic acids. Nucleic Acids Res 35:W375–W383. doi: 10.1093/nar/gkm216
- 22. Liu G, Li Z, Chiang Y, Acton T, Montelione GT, Murray D, Szyperski T (2005) High-quality homology models derived from NMR and X-ray structures of E. coli proteins YgdK and Suf E suggest that all members of the YgdK/Suf E protein family are enhancers of cysteine desulfurases. Protein Sci 14:1597–1608. doi: 10.1110/ps.041322705
- 32. Snyder DA, Chen Y, Denissova NG, Acton T, Aramini JM, Ciano M, Karlin R, Liu J, Manor P, Rajan PA et al (2005) Comparisons of NMR spectral quality and success in crystallization demonstrate that NMR and X-ray crystallography are complementary methods for small protein structure determination. J Am Chem Soc 127:16505–16511. doi: 10.1021/ja053564h
- 35. Szyperski T (2008) On NMR-based structural proteomics. In: Sussman JL, Silman I (eds) Structural proteomics. World Scientific Publishing, Hackensack, NJ
- 37. Vedadi M, Niesen FH, Allali-Hassani A, Fedorov OY, Finerty PJ Jr, Wasney GA, Yeung R, Arrowsmith C, Ball LJ, Berglund H et al (2006) Chemical screening methods to identify ligands that promote protein stability, protein crystallization, and structure determination. Proc Natl Acad Sci USA 103:15835–15840. doi: 10.1073/pnas.0605224103
- 38. Vila JA, Aramini JM, Rossi P, Kuzin A, Su M, Seetharaman J, Xiao R, Tong L, Montelione GT, Scheraga HA (2008) Quantum chemical 13C(alpha) chemical shift calculations for protein NMR structure determination, refinement, and validation. Proc Natl Acad Sci USA 105:14389–14394. doi: 10.1073/pnas.0807105105