Introduction

RNA molecules have taken center stage in effecting a broad range of important biological functions, partly because they can adopt complex three-dimensional (3D) architectures that are pliable and adaptable to their environment (Al-Hashimi and Walter 2008; Dayie 2008; Hall 2008; Lu et al. 2010). The development of isotopic labeling methods for making uniformly labeled RNA (Batey et al. 1992; Nikonowicz et al. 1992; Michnicka et al. 1993; Hines et al. 1994) spurred the development of multidimensional NMR techniques to explore the structural characteristics of small to medium sized RNA molecules (Batey et al. 1995; Latham et al. 2005). However, incorporation of site specific isotopically labeled nucleotides into RNA is necessary to overcome the drawbacks of uniform labeling, as we and others have demonstrated using RNAs labeled with uniform and selective NTPs (Johnson et al. 2006; Johnson and Hoogstraten 2008; Schultheisz et al. 2008; Dayie and Thakur 2010; Thakur et al. 2010a, b; Thakur and Dayie 2011). Previous work focused on site specific labeling using the gluconeogenic carbon sources glycerol, glycerol and formate, pyruvate, or acetate with either wild type or two mutant E. coli strains (Hoffman and Holland 1995; Johnson et al. 2006; Johnson and Hoogstraten 2008; Dayie and Thakur 2010; Thakur et al. 2010a, b; Thakur and Dayie 2011), and glucose with wild type E. coli (Kishore et al. 2005). However, because E. coli grows readily on glucose with very good yield compared to the gluconeogenic carbon precursors (Thakur et al. 2010a), it is attractive to explore glucose as an alternative carbon source. Unfortunately, all the E. coli strains used so far show undesirable scrambling in the ribose ring when grown on glucose.

Most of the scrambling in the ribose region is caused by the admixture of the metabolic flux from both the oxidative pentose phosphate pathway (oPPP) and the non-oxidative pentose phosphate pathway (noPPP) (Josephson and Fraenkel 1969; Josephson and Fraenkel 1974). Thus, an E. coli variant with the noPPP genes disabled would provide new labels without the undesired scrambling when grown on glucose (Fig. 1). The transketolase (tktA) enzyme shunts metabolites in both directions of the noPPP. Deletion of this enzyme, therefore, would redirect most of the metabolic flux through the oPPP (Fig. 1). To test this hypothesis, the K12 E. coli strain was grown on [2-13C]-glucose, whereas tktA E. coli was grown on [1-13C]-glucose, [2-13C]-glucose, or a mixture of [1-13C]-glucose and [2-13C]-glucose. To compare to previous work, K12 E. coli was also grown on [1-13C]-acetate and [2-13C]-acetate (Thakur and Dayie 2011). The results demonstrate the advantages of the new, site-specifically labeled nucleotides for NMR structural and dynamics studies.

Fig. 1

Metabolic pathway involved in the production of nucleic acid nucleotides from glucose for wild type and tktA E. coli strains, as derived from Covert and Palsson (2002). Atom labels for the terminal (1, 6) carbons (magenta and thin circle) and central (2, 5) carbons (cyan and thick circle) of glucose are highlighted. Positions that are enriched due to the presence of 13CO2 in the growth medium are shown with an X. The pyrimidine base is derived from oxaloacetate (OAA). The labeling pattern of purines from glycine derived from 3-phosphoglycerate (3PG) is shown such that if [1-13C]-glucose is used, only the Cβ position of Gly and therefore the C2 and C8 positions of the purine ring are labeled. Otherwise, if [2-13C]-glucose is used, the Cα of Gly and therefore C5 of the purine ring is labeled. The segments labeled I are the patterns of labeling expected from both oPPP and noPPP, and those labeled II are labeling patterns expected if GA3P is derived from glycolysis. For the TCA cycle, the expected patterns of labeling following the first, second, and fourth passes through the cycle are also shown.

Materials and methods

Bacterial strains

The mutant strain tktA (CGSC # 11606, F-, Δ(araD-araB)567, ΔlacZ4787(::rrnB-3), λ, ΔtktA783::kan, rph-1, Δ(rhaD-rhaB)568, hsdR514) and the wild-type K12 strain (Clowes 1968) (CGSC # 4401:F+) used in this work were obtained from the Yale Coli Genetic Stock Center; Dr. Paliy kindly provided the wild-type K12 NCM3722 (Soupene et al. 2003).

Isotopes

The following isotopically enriched compounds were purchased from Cambridge Isotope Laboratories (Andover, MA) and Isotec-Sigma-Aldrich (Miamisburg, OH): [1-13C]-glucose (99%), [2-13C]-glucose (99%), [1-13C]-acetate (99%), [2-13C]-acetate (99%) and [15N]–(NH4)2SO4 (99%).

Media for bacterial growth

Both Luria–Bertani (LB) and LeMaster–Richards (LMR) minimal media were prepared as described (Sambrook and Russell 2001; Dayie and Thakur 2010; Thakur et al. 2010a, b; Thakur and Dayie 2011). The LMR medium contains 176 mM KH2PO4, 25 mM NaOH, 10 μl H2SO4, 12.6 mM (NH4)2SO4, 2 mM MgSO4, 10 μM FeSO4 and 0.2% trace metals, supplemented with the appropriate 13C-enriched carbon (glucose or acetate) and 15N-enriched nitrogen ((15NH4)2SO4) sources.

Method for growth optimization

The growth of the E. coli mutant strain tktA was optimized for biomass production as described previously for other strains (Thakur et al. 2010a, b), except that kanamycin was added to both the growth media and the LB plates. Briefly, a single colony of tktA E. coli grown on kanamycin LB plates was inoculated into a 5 ml starter culture in unlabeled LMR medium and incubated at 37°C overnight. The overnight culture was pelleted, and the pellet was washed twice in 1× phosphate-buffered saline (PBS) and resuspended in 5 ml of fresh LMR medium with no carbon source; 1 ml of this suspension was diluted into a 50 ml culture in LMR medium and grown overnight at 37°C. The 50 ml culture was pelleted before the cells reached complete saturation, and the pellet was washed twice in 1× PBS and then resuspended in 50 ml of fresh medium with no carbon source. A 5 ml aliquot of this resuspension was added to 500 ml LMR medium supplemented with the labeled carbon source (glucose) and incubated at 37°C for 6–8 h.

P1 nuclease digestion and boronate affinity column nucleotide purification

The labeled ribonucleotides were isolated from E. coli cultures as described previously (Batey et al. 1992; Dayie and Thakur 2010). Briefly, the mixture containing the digested nucleic acids was separated into individual ribo- or deoxyribo-nucleotides using a cis-diol boronate affinity chromatography column as described (Batey et al. 1992; Dayie and Thakur 2010). The purity of these rNMPs was confirmed by NMR prior to use in phosphorylation reactions. For relaxation experiments, the uniformly or site specifically labeled cytidine 5′-monophosphate (CMP) was first dissolved in 10 mM phosphate buffer pH 6.4 (made in 100% D2O), lyophilized and resuspended in 250 μl of 100% D8 glycerol. The site specifically labeled CMP (extracted from the tktA E. coli strain grown on [2-13C]-glucose) was first separated by FPLC, resuspended in 10 mM phosphate buffer pH 6.4 (this buffer was again made in 100% D2O), lyophilized, and redissolved in 100% D8 glycerol.

Enzymatic phosphorylation

The enzymatic phosphorylation procedure was adapted from the method developed by Nyholm et al. (1995). The individual rNMPs were chromatographically separated and purified, and the purified rNMPs were phosphorylated to the corresponding rNTPs using kinases specific for each of the rNMPs in the presence of a coupled thioredoxin-dithiothreitol redox system (Nyholm et al. 1995). Labeled rAMP was converted to the labeled rATP enzymatically in less than 1 h, and the progression of the rATP phosphorylation was monitored on an analytical TARGA C18 column. This reaction mixture was purified further on a TARGA C18 column (250 × 20 mm) by reverse phase FPLC. A linear gradient of Buffer A (0.1 M KH2PO4) and Buffer B (20% MeCN in 0.1 M KH2PO4) was run at a flow rate of 10 ml/min over five column volumes for the separation. Collected ATP fractions were pooled and desalted prior to use for NMR and RNA transcription reactions.

RNA labeling and transcription

RNA samples were synthesized by in vitro transcription with T7 RNA polymerase using unlabeled rCTP, rGTP, and rUTP combined with either uniformly 13C/15N-labeled rATP or site specifically 13C/15N-labeled rATP using established protocols (Milligan et al. 1987; Milligan and Uhlenbeck 1989; Puglisi and Wyatt 1995). A second set of samples was also made using unlabeled rATP, rCTP, and rGTP, combined with either uniformly 13C/15N-labeled rUTP or site specifically 13C/15N-labeled rUTP. A mutant His-tagged T7 RNA polymerase (Guillerez et al. 2005) was overexpressed in E. coli BL21 (DE3) and purified on a Ni-chelating Sepharose column (Pharmacia) using established methods. The DNA promoter sequence, or top strand, has a C nucleotide at position −18 of the T7 promoter region (Baklanov et al. 1996) (CTOP) with the following sequence: 5′-CTA ATA CGA CTC ACT ATA G-3′. The corresponding templates used for NMR analysis for D5 RNA and A-Site RNA were 5′-g AAC CGT ACG TGC GAC TTT CAT CGC ATA CGG CTC c TAT AGT GAG TCG TAT TAG-3′ (lower case letters represent nucleotides introduced to improve transcription yield; they do not affect the catalytic ability of the D123 ribozyme (Gumbs et al. 2006)) and 5′-mGmGC GAC TTC ACC CGA AGG TGT GAC GCC TAT AGT GAG TCG TAT TAG-3′ (the lower case letter m represents 2′-methoxyl modification of the first two nucleotides of the template strand). Two terminal 2′-O-methyl modifications in the template strand indicated by “m” were introduced to substantially reduce the amount of transcripts with extra nucleotides at the 3′-end (Kao et al. 1999). The optimal transcription conditions were found by a systematic variation of Mg2+ and rNTP to be as follows: for D5 RNA these were 15 mM total NTPs and 13.7 mM Mg2+, and for A-Site RNA these were 10 mM total NTPs and 15 mM Mg2+. The reactions were carried out in transcription buffer C (40 mM Tris–HCl, pH 8.1, 1 mM spermidine, 5 mM dithiothreitol (DTT), 0.01% Triton X-100, 80 mg/ml PEG 8000) with 300 nM of each DNA strand and 1.5 μl T7 polymerase (optimized amount) per 40 μl of transcription volume. After 3 h of incubation at 37°C, each RNA from the transcription reaction was purified and dialyzed extensively as described (Dayie 2005). After dialysis, the RNA was lyophilized and resuspended in NMR buffer (100 mM KCl, 10 mM potassium phosphate pH 6.2, 8% (or 100%) D2O, and a trace of sodium azide) with or without 3 mM MgCl2. A small sample was hydrolyzed in 1.0 M sodium hydroxide and neutralized with HCl, and its UV absorbance was used to calculate the sample concentrations. NMR sample volumes were 250 μl in Shigemi tubes, with concentrations of 0.1 mM and 0.4 mM for D5 and A-Site RNAs respectively.

NMR experiments

All NMR experiments were run on a four channel Bruker Avance III 600 MHz spectrometer equipped with a triple resonance probe with actively shielded z-axis gradient. All the experiments were conducted at temperatures from 15 to 45°C. The NMR data sets were processed and the peak positions and intensities were analyzed with Bruker’s TOPSPIN 2.1 as described previously (Dayie and Thakur 2010). One dimensional (1D) 13C spectra and two-dimensional non-constant-time 1H, 13C-heteronuclear single quantum correlation (HSQC) spectra (Bodenhausen and Ruben 1980; Bax et al. 1990) were acquired to analyze the rNMP fractions extracted from each bacterial strain. Three methods were used to ascertain the labeling pattern of carbon in both the ribose and the base rings. First, the fractional 13C enrichment at each carbon site was quantified using the 13C satellite of the well resolved H1′ resonance in a 1D 1H experiment without 13C decoupling during acquisition. The ratio of the sum of the intensities of satellite peaks to the sum of the intensities of the satellite and center peaks was used to calculate the absolute enrichment level. Second, for cases of overlap in the 1D experiments, a 2-bond (2JHN) HSQC (Dayie and Thakur 2010; Thakur et al. 2010b) was used. Here again the ratio of the sum of the intensities of satellite peaks to the sum of the intensities of the satellite and center peaks was used to calculate the absolute enrichment level. Third, for those carbon sites where no quantitative information could be obtained from either the 1D or the 2-bond (2JHN) HSQC experiments, the absolute enrichment of proton resonances obtained from the 1D 1H experiments and relative enrichment values obtained from a carbon HSQC spectrum were used to compute the absolute enrichment levels (Walker et al. 1982; Kishore et al. 2005).

To ascertain the effect of constant time and band selective decoupling on spectral sensitivity, three HSQC variants were used. Non-constant time HSQC with and without band-selective decoupling during the t1 carbon evolution period and constant-time HSQC experiments were run on uniform and site-specific 13C labeled RNA samples. For the band-selective decoupled HSQC, WURST-2 decoupling of C2′ resonances was used during the carbon evolution period, as described previously for 13C-edited NOESY (Brutscher et al. 2001; Dayie 2005). For constant time experiments, total constant-time delays of 26.6 and 53.2 ms were used to examine the effects of resolution and sensitivity.

To explore the effect of uniform 13C labeling on relaxation measurements, longitudinal (R1) relaxation rates were measured for ribose C1′ and cytosine nucleobase C6 carbons using TROSY detected experiments (Hansen and Al-Hashimi 2007) for uniformly and site-specifically labeled samples without the need for selective pulses. The following R1 mixing times were used for both uniformly labeled and site specifically labeled CMPs dissolved in perdeuterated glycerol (prepared as described above): 21.1, 63.4, 147.8, 316.8, 401.3 and 823.7 ms. The experiments were carried out at 45°C using TROSY detected R1 experiments with the carbon carrier at 142.4 ppm. For R1 measurements on uniformly ATP-labeled D5 RNA and D5 RNA made using site specific labeled ATP, the following mixing times were used: 21.1 (2×), 63.4, 147.8, 232.3, 401.3, 528.0, 612.5, and 823.7 ms. For both samples, the carbon carrier was placed at 89.9 ppm. R1 rates were fitted assuming monoexponential decay as described (Eldho and Dayie 2007). The uncertainty, ΔR1, was calculated based on random noise of the spectra, using the expression, ΔR1 = δe/I, where δe is the r.m.s. noise measured in the reference spectrum, and I is the intensity of each peak (Ishima and Torchia 2005).
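
The monoexponential model behind these R1 fits, I(t) = I0 exp(−R1 t), can be illustrated with a short Python sketch. This is not the fitting procedure of Eldho and Dayie (2007), which is not reproduced here; it is a simplified log-linear least-squares fit using only the standard library. The function name `fit_r1_loglinear`, the rate of 3.0 s−1 and the starting intensity of 100 are assumptions chosen for illustration, and the intensities are synthetic and noise-free. Only the mixing times are taken from the text.

```python
import math


def fit_r1_loglinear(delays_s, intensities):
    """Estimate (R1, I0) for I(t) = I0 * exp(-R1 * t) by least squares on ln(I) vs t."""
    if len(delays_s) != len(intensities) or len(delays_s) < 2:
        raise ValueError("need at least two matched (delay, intensity) pairs")
    logs = [math.log(i) for i in intensities]  # requires strictly positive intensities
    n = len(delays_s)
    t_mean = sum(delays_s) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in delays_s)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(delays_s, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return -slope, math.exp(intercept)


# Mixing times listed for the CMP samples (ms in the text, converted to seconds)
delays = [t / 1000.0 for t in (21.1, 63.4, 147.8, 316.8, 401.3, 823.7)]

# Synthetic, noise-free intensities generated from assumed parameters (not measured data)
ASSUMED_R1, ASSUMED_I0 = 3.0, 100.0
synthetic = [ASSUMED_I0 * math.exp(-ASSUMED_R1 * t) for t in delays]

r1, i0 = fit_r1_loglinear(delays, synthetic)
print(f"fitted R1 = {r1:.3f} s^-1, I0 = {i0:.1f}")
```

With real, noisy data a nonlinear fit weighted by the spectral noise would normally be preferred, because the logarithm distorts the error on weak peaks at long mixing times.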

The magnesium ion binding surface of D5 RNA was mapped by the chemical shift perturbation method using a non-constant time two-dimensional 1H–13C HSQC experiment. Spectra were collected on ~0.1 mM uniformly 13C/15N-ATP labeled or site selectively ATP labeled D5 RNA samples in either 0 or 4 mM MgCl2. Each complex was heated to 363 K for 1 min and rapidly cooled on ice for 10 min before taking the NMR spectra. All the experiments were carried out at 298 K on a D5 sample in MOPS buffer pH 6.5, and 100 mM KCl. The non-constant time HSQC utilized pulsed field gradients for coherence order selection and sensitivity enhancement (Palmer et al. 1991; Kay et al. 1992). Each dataset consisted of 128 complex points in t1, 2 K real points in t2, 128 scans per t1 increment, and a recycle delay of 1 s. The carbon carrier was placed at 89 ppm and a spectral width of 8 ppm was used. All the data sets were processed identically without applying any window functions. Similarly, binding of paromomycin to ~0.4 mM uniformly 13C/15N-ATP labeled or site selectively ATP labeled A-Site RNA fragment was monitored through changes in the peak positions of each nucleotide. Experiments were conducted at 315 K in 10 mM potassium phosphate buffer (pH 6.5), and 100 mM KCl. For each experiment a spectral width of 8 ppm, a 1.3 s recycle delay and 256 scans were used.

Results

Alternate site-specifically labeled rNMP from biomass production using mutant tktA E. coli

To test the hypothesis that tktA E. coli will shunt most of the metabolic flux through the oPPP and suppress scrambling of labels in the ribose ring, the growth of wild type E. coli strain K12 and of mutant tktA E. coli on [2-13C]-glucose was compared (Fig. 1). The level of 13C labeling was estimated using three methods: 1D 1H or natural abundance 13C spectra; long range proton-nitrogen correlations in 15N-HSQC spectra, as described previously (Dayie and Thakur 2010; Thakur et al. 2010b); or the method of fractional enrichment (Walker et al. 1982; Kishore et al. 2005).

For wild type K12, the 1D 1H spectrum of AMP showed that the C1′ ribose carbon was significantly enriched, as its attached proton displayed satellite peaks due to the 13C–1H coupling (1J = 168.9 Hz). The intensity ratio of the center peak to the sum of its two satellite peaks was 5.71:20.06, indicating that C1′ was approximately 77% enriched. A similar consideration for tktA E. coli indicates that the label at C1′ was ~93% enriched (Fig. 2). Obtaining the enrichment level for ribose carbons other than C1′ was problematic using the 1D method because of extensive overlap. To obtain the enrichment levels for these ribose carbon atoms, a 2D 13C-HSQC experiment was conducted. The results are summarized in Table 1, where the relative intensity of each cross-peak is expressed as a percentage of the intensity of the C1′ cross-peak.
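
The satellite-based estimate described in this paragraph is a simple ratio: the summed satellite intensity divided by the total (satellite plus center) intensity. The following Python sketch shows that arithmetic for the K12 intensities quoted above. The function name `fractional_enrichment` is illustrative only and does not refer to any published software.

```python
def fractional_enrichment(center, satellites):
    """Return the 13C fraction at a site from a 1H multiplet acquired without 13C decoupling.

    center     -- intensity of the central peak (protons bound to 12C)
    satellites -- summed intensity of the two 13C satellite peaks
    """
    total = center + satellites
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return satellites / total


# Center : satellite intensities quoted in the text for wild type K12 (5.71 : 20.06)
k12_c1 = fractional_enrichment(5.71, 20.06)
print(f"C1' enrichment: {100 * k12_c1:.1f}%")
```

With these intensities the ratio evaluates to roughly 0.78, in line with the approximately 77% enrichment reported for C1′.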

Fig. 2

1D 1H traces extracted from 2D 1H-13C non-constant time HSQC spectra for each carbon position within the ribose ring (C1′, C2′, C3′, C4′, and C5′) of rAMP showing the 13C isotopic enrichment levels in two E. coli variants grown on [2-13C]-glucose. The percentage labels were calculated as described in the text. A Labeling pattern found in the ribose region of rAMP isolated from tktA E. coli strain grown with [2-13C]-glucose. B Labeling pattern found in the ribose region of rAMP isolated from K12 E. coli strain grown with [2-13C]-glucose.

Table 1 13C enrichment levels at various carbon positions within ribonucleotides harvested from tktA E. coli grown on [1-13C]-glucose, [2-13C]-glucose, and an equimolar mixture of [1-13C]-glucose and [2-13C]-glucose

Using the absolute enrichment of the C1′ carbon, the relative enrichment obtained from either 1D direct carbon or 2D HSQC experiments was converted into an absolute enrichment level. For instance, for wild type K12, the 13C contents of C2′ and C4′ were found to be approximately 38.4 and 46.9%, respectively, of that of C1′. Given that C1′ was 77% enriched, we deduced that C2′ and C4′ were enriched 29 and 36% respectively, as summarized in Fig. 2 and Table 1. In contrast, for tktA E. coli the label at C2′ was ~7% and the label at C4′ was ~19% (Fig. 2).
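
The conversion from relative to absolute enrichment is a single multiplication: each relative cross-peak intensity is scaled by the absolute enrichment of the C1′ reference. The sketch below shows this for the K12 values quoted above. The helper name `absolute_enrichment` is an illustrative assumption.

```python
def absolute_enrichment(reference_abs_pct, relative_pct):
    """Scale a cross-peak intensity (given as % of the reference peak) to absolute % enrichment."""
    return reference_abs_pct * relative_pct / 100.0


C1_ABSOLUTE = 77.0  # % enrichment of C1' in wild type K12, from the 1D 1H satellites
relative_to_c1 = {"C2'": 38.4, "C4'": 46.9}  # % of the C1' cross-peak intensity

absolute = {
    site: absolute_enrichment(C1_ABSOLUTE, rel) for site, rel in relative_to_c1.items()
}
for site, value in absolute.items():
    print(f"{site}: {value:.1f}% absolute enrichment")
```

The two values come out near 30% and 36%, close to the 29 and 36% quoted in the text; the small difference for C2′ presumably reflects rounding of the inputs.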

Growth of mutant tktA E. coli on [1-13C]-glucose led to maximum labeling at the C5′ position (~87%), ~20% labeling at the C1′ position and essentially no labeling (<2%) elsewhere in the ribose ring (Fig. 3A, B). Furthermore, growth of mutant tktA E. coli on a mixture of [1-13C]-glucose and [2-13C]-glucose led to maximum labeling at the C1′ position (~54%), high labeling at the C5′ position (~35%), some labeling at C2′ and C4′, and very little label (<1%) at C3′ in the ribose ring (Fig. 3E, F).

Fig. 3

The 2D 1H-13C non-constant time HSQC spectra and the corresponding 1D 1H traces extracted from the 2D spectra for each carbon position within the ribose ring (C1′, C2′, C3′, C4′, and C5′) of rCMP showing the distribution of 13C isotopic enrichment levels of rCMP derived from tktA E. coli grown on three different carbon sources. A, B [1-13C]-glucose, C, D [2-13C]-glucose, and E, F mixture of [1-13C]-glucose and [2-13C]-glucose.

Selective labeling removes unwanted couplings without compromising sensitivity

As shown previously, when the ribose ring is uniformly labeled, the ribose C2′ and C4′ positions form triplets and the C1′ and C5′ positions form doublets in non-constant time HSQC spectra (Dayie and Thakur 2010). These unwanted splittings can be removed using constant time experiments (Bax et al. 1979; Bax and Freeman 1981; Vuister and Bax 1992; Van de Ven and Philippens 1992), adiabatic band selective decoupling schemes (Kupce and Wagner 1996; Brutscher et al. 2001; Dayie 2005), or maximum entropy reconstruction-deconstruction (Shimba et al. 2003).

Unfortunately, the length of the constant time period (T) limits the acquisition times (t1max) to multiples of the inverse of the homonuclear coupling constant (JCC), i.e. t1max = n/JCC, where n is an integer, JC5C6 = 67 Hz and JC1′C2′ = 43 Hz. To obtain reasonable digital resolution, large values of T are needed. Also, during the constant-time period (T), the decay of the transverse magnetization is proportional to exp(−R2T), where R2 is the homogeneous transverse rate constant. The long constant-time delays needed to improve resolution lead to significant signal attenuation for RNA molecules larger than 30 nucleotides (Dayie 2005), and thus a compromise must be found between sensitivity (small values of T) and resolution (large values of T) when using uniformly labeled samples. With the selective labels, one is not forced to compromise resolution for improved sensitivity or vice versa (Fig. 4). To obtain the spectral resolution achievable using selective labels, a constant time delay of T = 53.2 ms is needed. Yet under these conditions, most of the useful signals are lost due to rapid signal decay (Fig. 4). When resolution is not an issue, a constant time delay of T = 26.6 ms is adequate to obtain a good signal-to-noise ratio.
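
The trade-off described in this paragraph can be made concrete with a short Python sketch that evaluates t1max = n/JCC for the two couplings quoted above and the attenuation factor exp(−R2T) for the two constant-time delays used in this work. The transverse rate `ASSUMED_R2` is a hypothetical value chosen only for illustration; it is not a measured rate for either RNA, and the function names are likewise illustrative.

```python
import math


def max_t1_acquisition(n, j_cc_hz):
    """Return t1max = n / J_CC in seconds for integer n and a coupling constant in Hz."""
    if n < 1 or j_cc_hz <= 0:
        raise ValueError("n must be >= 1 and J_CC must be positive")
    return n / j_cc_hz


def remaining_signal(t_s, r2_per_s):
    """Fraction of transverse magnetization left after a constant-time period T: exp(-R2*T)."""
    return math.exp(-r2_per_s * t_s)


COUPLINGS_HZ = {"C5-C6": 67.0, "C1'-C2'": 43.0}  # coupling constants quoted in the text
ASSUMED_R2 = 40.0  # s^-1; hypothetical rate, for illustration only

for pair, j in COUPLINGS_HZ.items():
    for n in (1, 2):
        print(f"{pair}: n = {n}, t1max = {1000 * max_t1_acquisition(n, j):.1f} ms")

# Constant-time delays used in this work
for t_ms in (26.6, 53.2):
    frac = remaining_signal(t_ms / 1000.0, ASSUMED_R2)
    print(f"T = {t_ms} ms: {100 * frac:.0f}% of the signal remains")
```

Because the decay is exponential, doubling T squares the surviving fraction whatever the value of R2, which is why the longer delay costs so much sensitivity for larger RNAs.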

Fig. 4

NMR spectra of paromomycin bound A-Site RNA showing enhanced sensitivity and resolution afforded by site selective labeling. The experiments were performed on the RNA A site-selectively 13C-labeled using 2D non-constant time HSQC spectra, B uniformly 13C-labeled using 2D constant time HSQC spectra with total delay of 53.2 ms, C uniformly 13C-labeled using 2D non-constant time HSQC spectra

Selective labeling allows facile monitoring of the response of the catalytic lynchpin D5 RNA to magnesium ion binding

To demonstrate the usefulness of the site selectively labeled nucleotides for mapping binding interactions, the 36-nt D5 RNA element (Dayie and Padgett 2008; Toor et al. 2008) was transcribed using either uniformly 13C/15N-labeled ATP or site selectively-labeled ATP derived from tktA E. coli cells grown on [2-13C]-glucose as described above. Three structural elements implicated in the catalytic activity of the ribozyme (Boulanger et al. 1995; Peebles et al. 1995; Schmidt et al. 1996; Dayie and Padgett 2008; Toor et al. 2008) are critical Mg2+ ion binding sites: the highly conserved trinucleotide AGC sequence, a variable two nucleotide AA asymmetrical bulge and a capping GAAA tetraloop (Chanfreau and Jacquier 1994; Abramovitz et al. 1996; Zhang and Doudna 2002; Seetharaman et al. 2006; Toor et al. 2008). These crucial bulge and internal loop nucleotides (A17, A24, A28) have much lower intensity in the uniformly labeled sample than in the site-selectively labeled sample (Fig. 5A, B), and as expected these nucleotides are responsive to the presence of magnesium ions (Seetharaman et al. 2006). The selective labels reduce the overlap within the ribose C1′ region, which can be very crowded, and enable the facile monitoring of key nucleotides involved in magnesium ion binding without the need for band selective decoupling or constant-time evolution NMR experiments (Brutscher et al. 2001; Dayie 2005).

Fig. 5

NMR spectra showing enhanced resolution afforded by site selective labeling that enables facile monitoring of chemical shift perturbations of D5 RNA on binding MgCl2. Arrows depict residues with the most chemical shift perturbation with increased magnesium ion concentration (0 and 5 mM), indicative of likely ion binding site. A Selective-ATP labeled D5 RNA, B uniform-ATP labeled D5 RNA, with both RNAs at 0.1 mM and 5 mM Mg2+

Site selective labeling affords more accurate relaxation rate measurements using non-constant time non-selective pulse experiments

As a further demonstration of the usefulness of the site selectively versus the uniformly labeled nucleotides for quantifying dynamics in RNA, TROSY detected R1 experiments were carried out on a number of samples. The highly conserved 36-nt D5 RNA was transcribed using either uniformly 13C/15N-labeled ATP or site selectively-labeled ATP derived from tktA E. coli cells grown on [2-13C]-glucose.

For R1 measurements, adjacent carbons can contribute significantly to the relaxation of macromolecules, and so we expect the selective labels to make a significant difference in the accuracy of the measured rates. As is readily apparent from Fig. 6, there is a discrepancy between R1 measured for uniformly and site-selectively labeled RNAs. For the 36-nt D5 RNA, not only is the discrepancy for the C1′ carbon 0.54 s−1 (2.2 ± 0.07 s−1 for selective labeling versus 1.6 ± 0.31 s−1 for uniform labeling), but the fit to a monoexponential decay function is also applicable only to the site selectively labeled RNA sample (Fig. 6A, B). Similarly, for the CMP dissolved in glycerol to mimic high molecular weight RNA, the discrepancy for the C6 carbon is ~0.2 s−1 (2.9 ± 0.1 s−1 for selective labeling versus 3.1 ± 0.3 s−1 for uniform labeling) and the monoexponential fit for the uniformly labeled CMP is poorer (Fig. 6C, D).

Fig. 6

Representative longitudinal R1 relaxation decay curves for RNA samples showing marked deviation from monoexponential decay for uniformly labeled samples. A Ribose C1′ R1 relaxation measurements at 25°C for the D5 RNA labeled with site selectively-labeled ATP. B Ribose C1′ R1 relaxation measurements at 25°C for D5 RNA labeled with uniformly 13C/15N-labeled ATP. C Base C6 R1 measurements at 45°C for site specifically labeled CMP dissolved in perdeuterated glycerol to mimic ~40–50-nt RNA. D Base C6 R1 measurements at 45°C for uniformly labeled CMP dissolved in perdeuterated glycerol to mimic ~40–50-nt RNA

Discussion

The development of uniform isotopic labeling methods for transcribing labeled RNA (Batey et al. 1992; Nikonowicz et al. 1992; Michnicka et al. 1993; Hines et al. 1994) spurred the development of new solution NMR tools that enabled the structural characterization of small to medium sized RNA molecules (Batey et al. 1995; Puglisi and Wyatt 1995; Latham et al. 2005). Nonetheless, uniform labeling does not alleviate the degeneracy problem for large RNAs and it introduces direct one-bond and residual dipolar couplings that negatively impact accurate measurement of 13C relaxation rates (Dayie and Thakur 2010; Thakur et al. 2010b). We set out to test the hypothesis that tktA E. coli strain will redirect most of the metabolic flux through the oxidative pentose phosphate pathway, limiting dilution of label within the ribose ring. This redirection of the metabolic flux would thereby afford site specific labeling within the ribose ring for useful NMR applications that overcome some of the drawbacks of uniform labeling.

Incorporation of 13C into ribose ring and nucleobase of nucleotides using glucose

Within the context of metabolic pathways in E. coli for nucleotide metabolism (Nelson and Cox 2008; Voet et al. 2008), the ribose moiety, derived exclusively from ribose-5-phosphate (R5P), can be produced directly from glucose or indirectly from gluconeogenic carbon sources. In the case of glucose, R5P can be produced from glucose-6-phosphate (G6P) by the elimination of the C-1 carbon from glucose as CO2 (Fig. 1). The isotopic enrichment pattern that stems from the input glucose carbons 2–6 can be readily predicted because of the one-to-one correspondence with the label incorporation at the five ribose carbon atoms numbered 1′–5′ using the oPPP. The noPPP leads to scrambling of labels relative to the input G6P (Fig. 1). This analysis accords with the results obtained for wild type K12 E. coli, wherein the flux through both the oxidative and non-oxidative pathways is operative (Fig. 2). Under noPPP conditions, the five-carbon sugars (including R5P) can be recycled into three- and six-carbon glycolytic intermediates through the action of tktA and transaldolase (TAL) (Fig. 2). Thus, knocking out tktA appears to block the noPPP scrambling and retain labeling via oPPP. Our result with ~93% label at C1′ for the tktA mutant compared to ~77% for the wild type K12, both grown on [2-13C]-glucose, suggests that most of the flux is indeed routed through the oPPP; however, the residual labeling of ~7% at the C2′ and 19% at the C4′ positions suggests that the blockage is not complete. Rather, some of this labeling likely originates from residual tktA activity or reverse glycolysis or both (Josephson and Fraenkel 1974).
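
The one-to-one correspondence described above (glucose C1 lost as CO2, glucose C2–C6 becoming ribose C1′–C5′) can be written as a small lookup. The Python sketch below encodes only this idealized oPPP-only route; the function name `oppp_ribose_labels` is an illustrative assumption. It deliberately ignores the noPPP, glycolytic and CO2 reincorporation routes, so it does not reproduce the residual C2′ and C4′ labeling or the C5′ labeling reported in the Results.

```python
def oppp_ribose_labels(labeled_glucose_carbons):
    """Map labeled glucose carbons to ribose carbons for an idealized oPPP-only route.

    Glucose C1 is released as CO2; glucose C2..C6 become ribose C1'..C5'.
    """
    labels = set()
    for carbon in labeled_glucose_carbons:
        if not 1 <= carbon <= 6:
            raise ValueError(f"glucose has carbons 1-6, got {carbon}")
        if carbon == 1:
            continue  # lost as CO2 in the oxidative branch
        labels.add(f"C{carbon - 1}'")
    return sorted(labels)


for source, carbons in {"[1-13C]-glucose": [1], "[2-13C]-glucose": [2]}.items():
    predicted = oppp_ribose_labels(carbons)
    print(f"{source}: {predicted if predicted else 'no ribose label via oPPP'}")
```

Under this idealization [2-13C]-glucose labels only C1′, which matches the dominant enrichment observed for the tktA strain.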

Labeling in the nucleobases can be broadly divided into two categories based on metabolic precursors originating from glycolysis products, just as amino acid side chains can be grouped into three classes (Lundström et al. 2007). For example, G6P, isomerized to fructose-6-phosphate (F6P) and phosphorylated to fructose-1,6-bisphosphate (FBP), can be converted into dihydroxyacetone phosphate (DHAP) and glyceraldehyde-3-phosphate (GA3P). The rapid equilibration of DHAP and GA3P by triosephosphate isomerase (TIM) ensures that the carbonyl carbon (C-1) of GA3P is derived from glucose carbons 3 or 4, the C-2 of GA3P is derived from glucose C2 or C5, and the C-3 is derived from glucose C1 or C6 (Fig. 1). The net effect of this equilibration is that the purine nucleobase intermediates derived from serine and glycine can be enriched to ~50%. We classify these purine nucleobases into group I (adenine and guanine) and the pyrimidine nucleotides into group II (derived from OAA and therefore aspartic acid). Before entry into the TCA cycle, these group II nucleobases can also be enriched up to ~50%, but subsequent passage through the TCA cycle will reduce this further. In agreement with this analysis, we find that these group II nucleobases derived from [1-13C]-glucose or a mixture of [1-13C]-glucose and [2-13C]-glucose gave rise to ~26–28% enrichment at the pyrimidine C6 position (Table 1).
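
The ~50% ceiling follows from the carbon bookkeeping described above: each GA3P carbon draws equally on two glucose carbons, only one of which is labeled in a singly labeled glucose. A minimal Python sketch of this averaging is given below. It assumes complete equilibration of the two triose halves and ignores natural abundance 13C; the names `GA3P_SOURCES` and `ga3p_enrichment` are illustrative.

```python
# Glucose carbons feeding each GA3P carbon after DHAP/GA3P equilibration (from the text)
GA3P_SOURCES = {1: (3, 4), 2: (2, 5), 3: (1, 6)}


def ga3p_enrichment(glucose_enrichment):
    """Expected 13C fraction at each GA3P carbon, assuming the two triose halves mix equally.

    glucose_enrichment -- dict mapping glucose carbon number (1-6) to its 13C fraction (0-1);
                          carbons not listed are treated as unlabeled.
    """
    return {
        ga3p_carbon: sum(glucose_enrichment.get(g, 0.0) for g in sources) / 2.0
        for ga3p_carbon, sources in GA3P_SOURCES.items()
    }


one_label = ga3p_enrichment({1: 0.99})  # [1-13C]-glucose at 99% enrichment
two_label = ga3p_enrichment({2: 0.99})  # [2-13C]-glucose at 99% enrichment

for name, result in (("[1-13C]-glucose", one_label), ("[2-13C]-glucose", two_label)):
    summary = ", ".join(f"C-{c}: {100 * f:.1f}%" for c, f in sorted(result.items()))
    print(f"{name} -> GA3P {summary}")
```

In this picture [1-13C]-glucose places the label at GA3P C-3 and [2-13C]-glucose at GA3P C-2, each at about half the input enrichment, consistent with the ~50% limit stated in the text.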

The main advantage of using the tktA E. coli strain grown on glucose is that it affords the highest level of enrichment at the C1′ position using [2-13C]-glucose without introducing unwanted 13C1′-13C2′ couplings. It also affords a high level of enrichment at the C5′ position using [1-13C]-glucose without any unwanted 13C4′-13C5′ coupling. Three different E. coli strains have been used in the past for alternate site specific 13C isotopic labeling (Johnson et al. 2006; Johnson and Hoogstraten 2008; Dayie and Thakur 2010; Thakur et al. 2010b) using gluconeogenic carbon precursors such as glycerol that attain at most 55% enrichment in the C1′ position. Growth of wild type E. coli on glucose afforded at most 73% labeling, whereas growth on acetate afforded close to 90% uniform labeling but with multiplet splitting from C2′ and C4′ (Hoffman and Holland 1995; Kishore et al. 2005). In the nucleobases, however, use of [2-13C]-glucose leads to a maximum of 50% enrichment compared to ~90% attainable using [2-13C]-glycerol (Johnson et al. 2006; Johnson and Hoogstraten 2008; Dayie and Thakur 2010; Thakur et al. 2010b). Use of [1-13C]-glucose likewise leads to a maximum of 50% enrichment in the nucleobases compared to ~90% attainable using [1,3-13C]-glycerol; however, since the price of [1,3-13C]-glycerol is ~4× that of [1-13C]-glucose, glucose is an excellent alternative in this case.

Site specific labeling of the ribose ring and purine and pyrimidine bases will advance the quantitative analysis of NMR relaxation parameters for RNA. For example, previous 13C relaxation studies using uniformly labeled samples precluded the clean extraction of relaxation rates of base C5 and C6 positions in pyrimidine ring and ribose ring atoms.

Conclusion

Here we have shown that an E. coli strain deficient in the transketolase gene (tktA) shunts most of the metabolic flux via the oxidative pentose phosphate pathway, and as a result growth of this E. coli strain on a number of glucose sources affords very high levels of enrichment of nucleotides specifically labeled at C1′ and C5′ positions without the attendant deleterious one bond 13C-13C scalar and dipolar couplings that normally interfere with measuring accurate spin-relaxation parameters and that normally exacerbate spectral crowding. We demonstrated the utility of this labeling approach on a 36-nt D5 RNA, wherein these specifically labeled rNTPs eliminated the extensive 13C–13C coupling within the nitrogenous base and ribose ring, leading to simpler and higher resolution NMR spectra. Having uncluttered spectra enabled the facile identification of nucleotides within the binding interface of the D5-Mg2+ complex. Finally, we showed that these selective labels enabled more accurate TROSY detected R1 relaxation rate measurements. These selective labels should prove valuable in applying high resolution NMR analysis to RNAs that require conformational switching for their biological functions.