A simplified recipe for assigning amide NMR signals using combinatorial 14N amino acid inverse-labeling
- Hiroaki, H., Umetsu, Y., Nabeshima, Y. et al. J Struct Funct Genomics (2011) 12: 167. doi:10.1007/s10969-011-9116-0
Assignment of backbone amide proton resonances is one of the most time-consuming stages of any protein NMR study when the protein samples behave non-ideally. A robust and convenient NMR procedure for analyzing spectra of marginal-to-low quality is helpful for high-throughput structure determination. The 14N selective- and inverse-labeling method is a candidate solution. Here, we present a simplified protocol for assigning protein backbone amide NMR signals. When 14N inversely labeled residues are present in a protein, their backbone NH cross peaks vanish from the protein’s 1H–15N HSQC spectrum, and thus, their chemical shifts can be readily identified by a process of elimination. Some metabolically related amino acids, for example, Ile, Leu, and Val, cannot be individually incorporated but can be inversely labeled together. We optimized and simplified the protocol and M9-based medium formula for the 14N selective- and inverse-labeling method without any additives. Our approach should be cost-effective, because the method could be additively applied stepwise, even when the proteins of interest were found to be non-ideal.
Keywords: Combinatorial inverse-labeling; Aβ(1–40) peptide; NMR sample preparation; Isotope labeling
Abbreviations:
HSQC: Heteronuclear single quantum coherence spectroscopy
SOFAST-HMQC: Band-selective optimized flip-angle short-transient heteronuclear multiple quantum coherence spectroscopy
BEST: Band-selective excitation short-transient
TROSY: Transverse relaxation optimized correlation spectroscopy
The development of a robust and cost-effective NMR method for assigning the backbone amide proton resonances of proteins with non-ideal properties is a challenge in structural genomics research. Usually, assignment of the backbone resonances is one of the most time-consuming but indispensable stages of any NMR study. A 1H–15N HSQC spectrum (and its variants, for example, SOFAST-HMQC and TROSY) provides a fingerprint of the protein of interest and is therefore important as a reference spectrum for other triple resonance experiments. The information obtained from a 1H–15N HSQC spectrum is not only necessary for 3D structure determination but also indispensable for examining protein dynamics, H/D exchange, and protein–ligand interactions. Assignment of the backbone resonances of proteins with molecular weights of up to 20 kDa routinely starts with preparation of a uniformly 13C/15N-labeled non-deuterated protein. Using this sample, several pairs of 3D NMR experiments are recorded, for example, HNCA/HN(CO)CA, HNCACB/CBCA(CO)NH, and HNCO/HN(CA)CO [3, 4, 5]. Within each pair, one experiment provides information on the intraresidual spin systems while the other provides details of the interresidual connectivities. Thus, analysis of these data sets leads to sequence-specific assignment of all main chain signals along the peptide sequence.
Nevertheless, non-ideal properties of a protein sample often hamper the analysis of a protein’s NMR data set because some key resonances are often missing. Such undesired behaviors include limited solubility, nonspecific self-association, chemical exchange, and internal motion of the protein. These properties often result in low signal-to-noise ratios of certain cross-peaks in 3D NMR experiments, for example, those arising from 13Cβs in HNCACB and CBCA(CO)NH. As a result, assignment of the backbone resonances becomes difficult or even impossible because connectivities between residues are not uniquely identified. In addition, some non-ideal experimental conditions provide similar results. Examples of non-ideal NMR conditions of physiological or biophysical interest include highly viscous conditions, NMR analysis of a protein inside a living cell, presence of lipid micelles, and a high level of denaturants or salts.
At present, there are several options for solving the problem of missing resonances: (a) use a SOFAST-/BEST-NMR pulse scheme to gain signal intensity per unit of experimental time [1, 6]; (b) use a nonlinear, sparse sampling method for the indirect dimensions of a 3D/4D NMR experiment to minimize the measurement time and to maximize S/N [7, 8, 9]; and (c) use protein samples that contain selectively 15N-labeled residues [10, 11, 12]. However, these solutions have drawbacks in terms of cost. For example, with the third strategy, the cost and effort of sample preparation are a major concern in projects such as structural genomics, which require massive protein preparation and NMR data acquisition. The cost and effort required to introduce selectively labeled amino acids by using cell-free translation systems have been reduced, but remain significant.
In this study, we describe a simple method to obtain the information necessary for residue-type assignments that uses “combinatorial inverse-labeling” of sets of specific amino acids, an idea that was initially proposed by Shortle. In the original method, MOPS minimal media, modified such that glycerol was the sole carbon source, was employed with the combinatorial inverse-labeling method. A matrix of four combinations of amino acids redundantly covering 17 amino acids was designed and inversely labeled. The resulting five HSQC data sets (four inversely labeled and one uniformly labeled) were processed with the “labeling pattern matrix” to identify each amino acid type independently of the other 3D experiments. One major drawback of this method is the requirement of a special processing application.
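The idea behind a labeling pattern matrix can be illustrated with a minimal Python sketch. The three-sample design below is hypothetical and much smaller than the four-combination, 17-amino-acid matrix of the original method; the sample names and functions `pattern_for` and `decode` are assumptions made for illustration only. Each residue type is identified by the set of samples in which its cross peak vanishes.

```python
# Hypothetical labeling design: the amino acid types supplied as unlabeled
# (14N) in each inversely labeled sample. This is NOT the published matrix.
SAMPLES = {
    "S1": {"Ala", "Gly", "Lys"},
    "S2": {"Ala", "Phe", "Lys"},
    "S3": {"Gly", "Phe", "Lys"},
}

def pattern_for(residue_type, samples=SAMPLES):
    """Tuple of booleans, one per sample: True where the peak should vanish."""
    return tuple(residue_type in members for members in samples.values())

def decode(observed_pattern, samples=SAMPLES):
    """Return the residue types whose expected pattern matches the observation."""
    types = set().union(*samples.values())
    return sorted(t for t in types
                  if pattern_for(t, samples) == tuple(observed_pattern))

# A cross peak that vanishes in S1 and S2 but persists in S3:
print(decode((True, True, False)))  # ['Ala']
```

With n samples, at most 2^n distinct vanish/persist patterns exist, so a design covering more residue types than that necessarily leaves some types sharing a pattern.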
Here, we further optimized the method for high-throughput structure determination. Using this method, we prepared 15N uniformly labeled and 14N selectively and inversely labeled protein samples as recombinant proteins expressed in Escherichia coli BL21(DE3) grown in M9 minimal media containing 15NH4Cl as the sole nitrogen source, with or without supplementation of unlabeled (14N) amino acids. In addition, we propose some new combinations of metabolically related amino acids, which were found to be useful.
Materials and methods
Table 1. The procedure for 14N selective- and inverse-labeling of a protein
1. Prepare 15N M9 medium containing glucose (4 g/L), 15N-ammonium chloride (0.7 g/L), NaH2PO4–12H2O (15 g/L), K2HPO4 (3 g/L), NaCl (0.5 g/L), MgSO4 (0.24 g/L), CaCl2-H2O (15 mg/L), vitamins, nucleobases (each at 100 mg/L), and ampicillin (50 mg/L)
2. Pick three to five colonies of E. coli BL21(DE3) to add into 1 mL LB medium and grow the culture overnight at 30–37°C
3. Add 1 mL of the bacterial culture to 50 mL M9 medium containing 15NH4Cl and incubate at 30–37°C until the OD600 of the culture is 0.8–1.0 (pre-culture medium)
4. Add the pre-culture medium to 450 mL of M9 medium containing 15NH4Cl and incubate at 30–37°C until the OD600 of the culture is 0.2–0.4 (main culture medium)
5. Add 100 mg/L (final concentration) of each 14N selected-labeled amino acid* and incubate the culture at 30–37°C for an additional 30–60 min
6. Induce protein expression with the addition of 0.5–1.0 mM IPTG (final concentration)
7. Incubate the culture at 30–37°C for an additional 2–4 h
8. Harvest the cells
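The per-litre amounts in the procedure above scale linearly with culture volume. The following Python sketch shows that arithmetic for a subset of the medium components, assuming a 0.5 L main culture (50 mL pre-culture plus 450 mL medium). The 1 M IPTG stock concentration and the function names are assumptions made for illustration, not part of the protocol.

```python
# Per-litre amounts (g/L) copied from the medium recipe; only a subset is shown.
M9_G_PER_L = {
    "glucose": 4.0,
    "15N-ammonium chloride": 0.7,
    "NaCl": 0.5,
    "MgSO4": 0.24,
}
AMINO_ACID_MG_PER_L = 100.0  # final concentration of each unlabeled amino acid

def scale_medium(volume_l, recipe=M9_G_PER_L):
    """Grams of each component needed for the given culture volume."""
    return {name: round(g_per_l * volume_l, 4) for name, g_per_l in recipe.items()}

def amino_acid_mg(volume_l, n_amino_acids):
    """Total milligrams of unlabeled amino acids for one inverse-labeling set."""
    return AMINO_ACID_MG_PER_L * volume_l * n_amino_acids

def iptg_stock_ml(volume_l, final_mm, stock_mm=1000.0):
    """Millilitres of IPTG stock from C1*V1 = C2*V2 (1 M stock is an assumption)."""
    return volume_l * 1000.0 * final_mm / stock_mm

culture_l = 0.5
print(scale_medium(culture_l))
print(amino_acid_mg(culture_l, 3))     # e.g. Ile, Leu, and Val added together
print(iptg_stock_ml(culture_l, 1.0))   # mL of stock for 1 mM final
```

Keeping the recipe as data rather than hard-coded numbers makes it easy to rescale to the smaller fermentations (for example 0.1 to 0.2 L) used for the inversely labeled samples.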
The E. coli expression plasmid for human Aβ(1–40) peptide, which encodes an N-terminal maltose-binding protein tag and a linker containing a tobacco etch virus protease cleavage site followed by Aβ(1–40), was generously provided by Dr. D. Hamada. 15N-labeled recombinant Aβ(1–40) and 14N selectively and inversely labeled Aβ(1–40)s were expressed in E. coli BL21(DE3) that had been transformed with the plasmid. Cells were grown at 37°C in 0.1 L M9 minimal media containing 15NH4Cl as the sole nitrogen source with or without unlabeled amino acid(s). Protein production was induced by adding IPTG to a final concentration of 1 mM when the OD600 of a culture reached 0.45. 14N-amino acids were added 45 min prior to IPTG induction. Cells were harvested after 18 h at 20°C in order to prevent aggregation of Aβ(1–40). Each cell-free extract was applied to an amylose resin column (New England Biolabs), and the fusion protein was eluted with maltose. The fusion protein was treated with tobacco etch virus protease (New England Biolabs), and the resultant peptide was finally purified by reversed-phase HPLC using a linear acetonitrile gradient containing dilute ammonium hydroxide.
NMR samples contained approximately 0.5 mM fully 13C/15N-labeled protein or 0.1 mM inversely labeled protein in 5% D2O–95% H2O and 20 mM sodium phosphate (pH 5.5). For assignment of the backbone resonances, HNCA, HN(CO)CA, HNCACB, CBCA(CO)NH, HNCO, HN(CA)CO, and 3D 15N-edited NOESY-HSQC spectra of fully 13C/15N-labeled proteins were first recorded at 25°C using a 600 MHz Bruker DMX600 spectrometer or a 600 MHz Bruker AVANCE-III spectrometer. Then, 1H–15N HSQC spectra or 1H–15N SOFAST-HMQC spectra of several inversely labeled proteins were recorded for residue-type assignment, as required by the state of the backbone resonance assignment. Data were processed using NMRPipe. IL1β backbone signal assignments were taken from the literature. Aβ(1–40) backbone signal assignments were taken from the literature and confirmed by this method.
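Before recording spectra of an inversely labeled sample, the number of backbone amide cross peaks expected to vanish can be counted from the sequence. The sketch below uses the Aβ(1–40) sequence and the combined Ile/Leu/Val set as an illustration. It assumes that the N-terminal residue and prolines give no backbone amide cross peak, and it ignores any extra residues left after protease cleavage; the function name is hypothetical.

```python
from collections import Counter

# Human Abeta(1-40) sequence in one-letter code.
ABETA_1_40 = "DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVV"

def expected_vanishing_peaks(sequence, unlabeled_types):
    """Count backbone NH peaks expected to vanish for each unlabeled residue type.

    Assumes the N-terminal residue and prolines give no backbone amide peak.
    """
    observable = sequence[1:]  # skip the N-terminal residue
    counts = Counter(res for res in observable if res != "P")
    return {aa: counts.get(aa, 0) for aa in unlabeled_types}

per_type = expected_vanishing_peaks(ABETA_1_40, "ILV")
print(per_type, sum(per_type.values()))  # {'I': 2, 'L': 2, 'V': 6} 10
```

Comparing this expected count with the number of peaks that actually disappear gives a quick check on isotope scrambling or peak overlap.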
Results and discussion
To obtain the information necessary for backbone signal assignments when using data sets derived from low-quality 3D NMR spectra, we developed an assignment strategy using a “combinatorial inverse-labeling” method. According to this method, we prepared 15N uniformly labeled and 14N selectively and inversely labeled IL1βs. We expressed them in E. coli BL21(DE3) grown in M9 minimal media that contained 15NH4Cl as the sole nitrogen source with or without supplementation of specific 14N-labeled amino acids at concentrations of 100 mg/L each. The 14N amino acids were added to the cultures approximately 30–60 min prior to IPTG-induced protein expression. Because E. coli BL21(DE3) was a competent host strain, no special auxotrophic host strains were needed. The simplified fermentation protocol is summarized in Table 1.
[Table: Cross peak identification in the 1H–15N HSQC of IL1β inversely labeled with the selected 14N amino acid(s). Columns: number of expected cross peaks; number of eliminated cross peaks; number of overlapping cross peaks (footnote a).]
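The process of elimination behind such a cross-peak count can be sketched as a comparison of two peak lists: a reference peak from the uniformly 15N-labeled spectrum counts as eliminated when no peak in the inversely labeled spectrum lies within a chemical shift tolerance. The peak positions and the tolerances below are made-up values for illustration, and `vanished_peaks` is a hypothetical helper, not part of any NMR software.

```python
def vanished_peaks(reference, inverse_labeled, tol_h=0.02, tol_n=0.2):
    """Return reference peaks with no partner in the inverse-labeled spectrum.

    Peaks are (1H ppm, 15N ppm) tuples; the tolerances are illustrative.
    """
    def has_partner(peak):
        return any(abs(peak[0] - h) <= tol_h and abs(peak[1] - n) <= tol_n
                   for h, n in inverse_labeled)
    return [p for p in reference if not has_partner(p)]

# Made-up peak lists: uniformly labeled reference and one inverse-labeled sample.
reference = [(8.25, 121.4), (7.95, 118.0), (8.60, 125.3)]
inverse_sample = [(8.26, 121.5), (8.60, 125.2)]

print(vanished_peaks(reference, inverse_sample))  # [(7.95, 118.0)]
```

A peak that vanishes underneath an overlapping peak still finds a partner in this comparison, which is why overlapping cross peaks have to be counted separately and checked by intensity.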
The merit of our simplified method compared with existing labeling methods (e.g., residue-specific 15N-labeling methods [30, 31] and a side chain/residue-specific 12C-inverse-labeling method, where other carbons are 13C-labeled) is in the preparation of proteins with specific, metabolically related amino acids that are 14N inversely labeled in combination. Although our method requires two to seven additional protein samples with different 14N inversely labeled amino acids, the required concentration of each sample is less than 0.1 mM, which suffices for a single 1H–15N HSQC experiment. Such samples are routinely prepared economically with a small-scale fermentation (i.e., 0.2 L). In addition, inverse-labeling is routinely achieved by expressing the target protein in E. coli strain BL21(DE3) with the normal glucose-based M9 minimal medium; auxotrophic host strains, cell-free protein expression systems, or specialized glycerol-based MOPS medium are not required. This implies that one can skip the step of optimizing the conditions of individual experiments to prepare the additional inversely labeled samples. Thus, we found our method experimentally advantageous because it can be used additively after the standard NMR assignment experiments using the fully labeled sample, even when the quality of some of the 3D NMR spectra was found to be insufficient to complete the full assignment. Finally, this approach is not necessarily limited by sample viscosity (suitable for highly viscous samples) or protein locality (compatible with an in-cell NMR approach with E. coli).
Our simplified combinatorial 14N selective- and inverse-labeling method has enormous potential for NMR-based structural biology studies of proteins, because the strategy should be widely applicable for proteins with non-ideal properties at any stage of the research project. The information from the spectra of 14N selectively and inversely labeled proteins could be applied to compensate for an otherwise incomplete 3D NMR data set in a stepwise manner. The method reduces the ambiguity often present during the initial sequential assignment trials and can confirm the tentative assignments derived from 3D-NMR spectra of marginal quality because the specific residue type of a cross peak can be readily identified as a result of the elimination of its cross peak in the 1H–15N HSQC spectrum.
We thank the people who helped to confirm the versatility of the method by applying it to each of the proteins: Dr. M. Shimizu (gcm-DBD), Mr. M. Itoh (LOV2phot), Dr. H. Tochio and Dr. H. Ohnishi (TIR-MyD88). We also thank Dr. D. Hamada for providing an Aβ expression system.