Introduction

Nuclear Magnetic Resonance (NMR) Spectroscopy has matured into a powerful tool to characterize interactions between biological molecules and their ligands at atomic resolution. Although originally described as a versatile technique to screen for potential protein binders (Gossert and Jahnke 2016; Harner et al. 2017; Luchinat et al. 2020; Gronenborn 2022), more sophisticated detection schemes have now emerged. Ligand-based detection schemes, such as waterLOGSY, Saturation Transfer Difference (STD) spectroscopy and 15N–13C-filtered 1H-1D spectra, can reveal valuable information about site specific interaction patterns and their mechanistic details (Dalvit et al. 2000; Mayer and Meyer 2001). Together with protein NMR-based mapping of ligand binding, this unveils unique information about the characteristics of protein-ligand complexes. The potential of protein NMR for drug discovery and design campaigns was realized by the discovery of Structure-Affinity-Relationship (SAR) by NMR by the Fesik group, in which 1H–15N-HSQCs on uniformly labeled proteins are recorded in order to investigate ligand binding interactions (Shuker et al. 1996). A particularly powerful application of SAR is Fragment-based drug discovery (FBDD) (Rees et al. 2004). FBDD starts with small fragments (< 300 g/mol) of weak binders (mM-100 µM), which are then structurally optimized throughout the process in order to maximize non-covalent interactions, most importantly the Van der Waals, electrostatic, and hydrogen bond interactions. Despite many past applications, backbone (1H–15N) detection schemes do not fully harness the potential of NMR spectroscopy in drug design campaigns due to limited experimental sensitivity, spectral overlap, and the ambiguous relationship between ligand binding modes and 1HN15N chemical shift perturbations (Williamson 2013). These problems can be largely overcome by resorting to 1H–13C side-chain detection exploiting the exquisite sensitivity and spectral dispersion obtained with Methyl-TROSY approaches and multiple quantum coherence experiments, which allow applications to slowly tumbling, high molecular weight proteins (Tugarinov et al. 2003, 2004). Selective labeling techniques for aliphatic and/or aromatic residues have been described and are by now well established (Lichtenecker et al. 2013a, b). Labeling of the aliphatic amino acids Isoleucine, Valine and Leucine was one of the first methods of selective 13C incorporation reported in the literature (Cardillo et al. 1977). These amino acids are highly abundant in the hydrophobic cores of proteins and thus represent valuable targets to introduce 13C- and 2H-nuclei with the intention to reduce signal overlap, increase signal-to-noise ratio and optimize magnetization transfer pathways. Moreover, metabolic α-ketoacid precursors can be added to the minimal growth media of E.coli-based overexpression systems and are effectively metabolized into the corresponding target aliphatic amino acids in-vivo within defined metabolic pathways (Gardner and Kay 1997; Goto et al. 1999). So far, 13C incorporation in aliphatic residues has mainly focused on methyl groups to benefit from their advantageous relaxation properties due to fast C–C bond rotation and the reduction of signal overlap in the otherwise crowded spectral region of methyl signals. Methyl CH3 groups and aromatic CHs are sensitive reporters of the ligand binding mode and changes of side-chain structural dynamics upon ligand binding. A special and particularly interesting case of non-covalent interactions are CH-π interactions, in which the CH group of the aliphatic or aromatic residue acts as the hydrogen-bond donor, whereas the π-system is the hydrogen-bond acceptor (Hunter 2004). We have recently shown that CH-π interactions are determinants for the affinity of a drug molecule to its target protein and can efficiently be probed by selective amino acid labeling (Platzer et al. 2020).

Encouraged by our previously obtained results, we suggest extending the selective labeling methodology to the methylene group (CH2) of Isoleucine.

Herein, we show that γ1-13C Isoleucine patterns are effectively induced by the corresponding precursor [3-13C, 4,4,4-2H3] 2-ketobutyrate (Scheme 1) and can be applied as sensitive tools to probe interactions between aryl-ligand CH-π acceptors and protein Isoleucine residue CH-π donors.

Scheme 1
scheme 1

Conversion of [3-13C; 4,4,4-2H3] 2-ketobutyrate in E. coli minimal media to Isoleucine

Scheme 2
scheme 2

Synthesis of [3-13C; 4,4,4-2H3] sodium α-ketobutyrate 9. Reagents and conditions: a PPh3, toluene, rt, overnight; quant.; b NaOHaq (1 M), rt, 15 min, quant.; c 13CH3I, DCM, rt, overnight; d O3, −78 °C, DCM, 5 min, then PPh3, 3 h, rt, 70% over two steps; e H2NNMe2, Et2O, 18 h, rt, 76%; f NaHMDS, CD3I, THF, –78 °C, 5 min, 80%; g HClaq (1 M), Et2O, 95%; h HClg, DCM/Et2O 1:1, 18 h, 0 °C-rt, 90%; i NaOHaq (1 M), lyophilisation, quant. Experimental details, as well as NMR characterization of products and intermediates are given in the Supplementary Information (SI)

We additionally describe the possibility to use [3-13C, 4,4,4-2H3] 2-ketobutyrate in D2O E.coli overexpression medium containing U-2H glucose to increase the deuteration grade in the β and γ2 positions of the Ile-side chains. This additional deuteration attenuates line-broadening due to 1H–1H dipolar interaction and reduces the number of signals resulting from natural abundance methyl correlations.

Materials and methods

Precursor synthesis

Isotopologues of α-ketobutyric acid are common additives in minimal media of bacterial protein overexpression to achieve highly selective Isoleucine labeling. While most of these methods aim for 13C methyl groups, we established a novel synthetic route to access an Ile γ1-13CH2/δ-CD3 pattern (Scheme 2). Our approach combines a modified literature-known protocol to synthesize 3-13C pyruvate (Werkhoven et al. 1999) with an optimized procedure of dimethylhydrazone monomethylation. Starting from tert-butyl bromoacetate 1, the phosphorous ylide 2 was formed and subsequently methylated using 13C-iodomethane. Ozonolysis with careful bulb-to-bulb distillation of the product from the viscous reaction residue yielded [3-13C] tert-butyl pyruvate 4. This compound was further converted to the dimethylhydrazone 5, which was then deprotonated at the alpha-position using sodium hexamethyldisilazane (NaHMDS) and subsequently transformed to the hydrazone 6 upon [2H3] iodomethane addition. Short reaction times in this methylation step effectively minimized the formation of side-products (see SI for experimental details). Alternative protocols using lithium diisopropylamide (LDA) as a base reported in literature could not be reproduced in comparable yields (Hajduk et al. 2000). Final hydrazone hydrolysis and ester cleavage under acidic conditions resulted in the formation of the target compound [3-13C; 4,4,4-2H3] α-ketobutyric acid 8. Compound 8 can be directly applied to protein overexpression as a free acid or lyophilized to a white powder of the corresponding sodium salt 9, respectively.

Ligand synthesis

Ligands A (CAS: 1610045-04-9) and B (CAS: 1772600-27-7) were synthesized as reported in earlier work (Platzer et al. 2020). The corresponding structural details of these two compounds are shown in the supporting information.

Expression and purification of Brd4-BD1

Protein overexpression was performed as described previously (Schörghuber et al. 2017). Briefly, recombinant human Brd4-BD1 (bromodomain 1 of Bromodomain containing protein 4) was expressed in E.coli BL21 (DE3), which contains an N-terminal TEV-cleavable His6-tag (plasmid was kindly provided by Boehringer Ingelheim GmbH & Co. KG). Uniformly 15N and 13C-labeled Brd4-BD1 was expressed following the expression protocol for efficient isotopic labeling of recombinant proteins using a fourfold cell concentration in M9 minimal medium, supplemented with 1 g/L 15NH4Cl, 3 g/L 13C6-D-glucose. Uniformly 15N and selective γ1-13C Isoleucine labeled Brd4-BD1 was expressed following the same protocol, M9 minimal medium was supplemented with 1 g/L 15NH4Cl, 3 g/L 12C6-D-glucose and 130 mg/L [3-13C, 4,4,4-2H3] 2-ketobutyrate (Marley et al. 2001). Perdeuterated, uniformly 15N, D-Glucose-1,2,3,4,5,6,6-d7 and selective γ1-13C Isoleucine Brd4-BD1 expression (concentrations in [g/l] used as above) was initialized by taking several colonies and inoculating them in a 10 mL M9-H2O minimal medium for 8 h at 37 °C shaking. From that culture, 250 µL were taken and inoculated in 10 mL fresh M9-D2O minimal medium, which was shaken overnight at 37 °C. At a cell density of ~ 2.5, the culture was taken and transferred to the final expression culture in a total 300 mL M9-D2O medium. The cells were grown until an OD600nm of 0.7 and induced with 0.4 mM IPTG. Cells were harvested by centrifugation, lysed by sonication and the lysates were subsequently centrifuged. Proteins were purified from the supernatant by Ni2+ affinity chromatography. The purified protein was treated with TEV protease and again loaded onto a Ni2+ column to bind the cleaved His6-tag and the His6-tagged TEV protease. The flow-through containing Brd4-BD1 was concentrated and stored in 10 mM sodium phosphate buffer pH 7.5, 100 mM sodium chloride and 1 mM dithiothreitol (DTT). In the case of perdeuterated Brd4-BD1, after concentrating, the buffer was exchanged to D2O buffer (10 mM sodium phosphate buffer pH 7.5, 100 mM sodium chloride and 1 mM DTT). Purity was analyzed by SDS-Page. NMR samples of Brd4-BD1 were prepared in 10 mM sodium phosphate buffer containing 0.1–0.5 mM protein, 100 mM sodium chloride, 10% D2O, and 1 mM DTT. NMR samples of perdeuterated Brd4-BD1 were prepared in 10 mM sodium phosphate D2O buffer containing 0.2 mM protein, 100 mM sodium chloride, and 1 mM deuterated Tris(2-Carboxyethyl)phosphine:DCl-D16 (TCEP).

NMR measurements

Carbon relaxation studies. All protein NMR measurements were conducted at 298 K on a Bruker Neo 600 MHz spectrometer equipped with a TXI RT probe head with perdeuterated and selective γ1-13C Isoleucine Brd4-BD1 sample concentration of 200 µM. For 13C relaxation studies, pulse sequence “hsqcctetgpsp” for the constant time 1H–13C HSQC with the constant time delays of 0.0133, 0.0266, 0.04 and 0.0532 s were used (Vuister and Bax 1992). For the relaxation measurement of 13C–1H2 heteronuclear triple/single quantum coherence, the pulse sequence “hmqcctetgp.2” (slightly modified) for the constant time 1H–13C HTQC was used (Marino et al. 1997). The constant time delays were set to 0.028, 0.042, 0.056, 0.07, 0.084, 0.098, 0.112, 0.128, 0.14, 0.154 s. To allow direct comparison of intensities, all spectra in a series were acquired with the same spectral parameters and the same settings for the receiver gain. Specifically, spectra were recorded using 106 (f1) × 1024 (f2) real points (CT-HSQC) and 144 (f1) × 1024 (f2) real points (CT-HTQC) with acquisition times of 0.06 × 0.01 s. 128 scans per fid were recorded with a recycle delay of 1 s. The pulse scheme of the CH2-TROSY experiment was applied as described by Miclet et al. (2004) and the corresponding spectrum recorded with a time delay of 0.028 s.

Fig. 1
figure 1

Overlay of 1H–13C-HSQC spectra of selectively Ile 13CH2 labeled Brd4-BD1 (red) onto the CT-1H–13C-HSQC of 13C-uniformly labeled Brd4-BD1 (13CH and 13CH3 signals in black and 13CH2 signals in grey), zoomed into the CH2 region

Protein-Ligand interaction studies. Protein-Ligand interaction measurements were conducted at 298 K on a Bruker Avance HD3 + 800 MHz spectrometer equipped with a TXI RT probe head with Brd4-BD1 sample concentration of 200 µM and ligand concentration of 1 mM. 2D 1H–13C HSQC spectra were acquired using the pulse sequence “hsqcetfpgpsi2” of the Bruker library (Palmer et al. 1992; Kay et al. 1992; Grzesiek and Bax 1993; Schleucher et al. 1994). Spectra were recorded using 128 (f1) × 1024 (f2) real points and acquisition times of 0.05 × 0.01 s. 32 scans were recorded per t1 increment with a recycle delay of 1 s.

Analysis

NMR spectra were processed and analyzed with NMRPipe (Delaglio et al. 1995) and SPARKY (Goddard and Kneller 2006) and CCPNmr (Vranken et al. 2005). Unambiguous sequential assignment of the Isoleucine Cγ1 signals was performed with a series of three-dimensional NMR experiments (HNCACB, HN(CO)CACB, hCC-TOCSY-coNH and Hcc-TOCSY-coNH). 13C relaxation studies were performed by fitting the logarithmic intensities of the peaks to the linear regression model in RStudio (RStudio Team 2020) by taking the linearized logarithmic function as log[I(t)] = log(A) − t/T2.

Results

To demonstrate the potential of our selectively labeled isoleucine precursor in protein dynamics studies, we incorporated [3-13C; 4,4,4-2H3] α-ketobutyric acid 8 into the bromodomain 1 of Bromodomain-containing protein 4 (Brd4-BD1).

Brd4-BD1 belongs to the family of bromodomain and extra-terminal domain (BET) proteins, which act as chromatin readers by binding to acetylated histones and therefore regulating gene expression (Hu et al. 2022). Brd4-BD1 contains seven Isoleucine residues, theoretically yielding 14 peaks. The incorporation of the 13C label into Brd4-BD1 was confirmed by a 1H–13C-HSQC spectrum (Fig. 1, in red). As expected, two cross peaks are obtained for the diastereotopic methylene protons of each Ile Cγ1 differing in the hydrogen dimension, but with the same carbon chemical shift. Note that the peaks of two of the Isoleucine Cγ1 groups are overlapping (presumably I100 and I101). However, spurious cross peaks are found in the methyl group region of the 1H–13C HSQC spectra (Fig. 1). In order to validate that these peaks are indeed natural abundance correlations of methyl CH3 obtained from the expression with D-Glucose (12C) in H2O medium (SI Fig. 1, in black), we incorporated the [3-13C; 4,4,4-2H3] α-ketobutyric acid also in D2O minimal media (SI Fig. 1, in red). As shown in SI Fig. 1 in red, the natural abundance peaks are suppressed by deuteration, and additionally, no metabolic scrambling of the precursor can be observed.

Fig. 2
figure 2

Comparison between CT-HTQC (ct delay 0.028 s) (a) and CT-HSQC (ct delay 0.013 s) (b) spectra of perdeuterated 15N-Brd4-BD1-Ile-13Cγ1

The availability of 13CH2 labeled Ile-residues offers the possibility to create heteronuclear multiple-quantum (triple/single-quantum, T(S)QC) coherences in a straightforward manner. It is well-known that triple/single quantum 13CH2 coherences relax more slowly than the corresponding single quantum coherences of the individual 13C and 1H spins (Grzesiek and Bax 1995; Grzesiek et al. 1995; Marino et al. 1997; Ruschak et al. 2010; Tugarinov and Kay 2013). Effective relaxation rates were measured by recording a series of 1H–13C-CT-HSQCs and 1H–13C-CT-HTQCs spectra (Fig. 2) with different relaxation delays (see experimental section). Figure 3 shows a comparison of the rates extracted from the 1H–13C-CT-HSQC and 1H–13C-CT-HTQC, respectively. It can clearly be seen that the 13C relaxation times for heteronuclear T/SQC extracted from the 1H–13C-CT-HTQC were increased on average by a factor of three compared to the 13C-T2 from the 1H–13C-CT-HSQC, as expected from theory (Grzesiek and Bax 1995; Grzesiek et al. 1995; Marino et al. 1997; Ruschak et al. 2010; Tugarinov and Kay 2013). It should be noted that data for residues I110 and I126 could not be analyzed as their signals are barely above the noise level. Interestingly, these two residues are also buried inside the hydrophobic core of Brd4-BD1, whereas the other five Isoleucine residues are found rather on the surface or at the flexible parts of the protein, thus suggesting the existence of considerable exchange broadening due to conformational averaging.

Fig. 3
figure 3

Extracted 13C relaxation times for specific Ile residues in perdeuterated Brd4-BD1. Dots represent relaxation times extracted from CT-HSQC, whereas pyramids represent rates extracted from CT-HTQCs

Selectively Ile-labeled and otherwise highly deuterated samples are moreover particularly interesting for applying specific transverse-relaxation-optimized NMR spectroscopy (TROSY) methods, which have been introduced to improve the sensitivity and resolution of methylene NMR studies (Miclet et al. 2004). We recorded a corresponding CH2-TROSY spectrum of the Ile 13CH2 Brd-BD1 sample expressed in D2O/deuterated glucose, which shows additional gain in spectral resolution – even relative to the HTQC spectrum (especially in the 1H dimension by virtually eliminating the large geminal 2JHH) as a result of the exclusive TROSY selection of the slowest relaxing multiplet component. Upon comparing the signal-to-noise ratios of the different experiments, it becomes evident that HTQC performs better in this regard compared to both CH2-TROSY and HSQC (SI Fig. 2).

Fig. 4
figure 4

a 1H-13C-HSQC overlay of Brd4-BD1-Ile-13Cγ1 (black) with Ligand A (red) and Ligand B (blue). b Zoom into the Brd4-BD1 binding pocket with an overlay of Ligand A (red, 6XV3) and B (blue, 6XUZ). Dashed lines indicate the distances between the ligand ring center and the two diastereotopic protons of Ile146. The prochiral designation of the methylene group is indicated in both figures

Finally, in order to test whether the methylene group of Isoleucine can be used as a reporter of CH-π interactions, we decided to measure 1H–13C-HSQCs with ligands A and B, which are nanomolar Brd4-BD1 binders (Platzer et al. 2020). Figure 4a shows the chemical shift changes of 15N-Brd4-BD1-Ile-13Cγ1 with 1 mM ligand A (red) and B (blue). The ligand induced proton chemical shift perturbation (CSP) of the Ile146 methylene resonances result in 1.71 ppm for ligand A and 1.78 ppm for ligand B for one hydrogen atom and 0.59 ppm and 0.66 ppm for the other hydrogen, respectively. These significant CSPs point towards an ideally oriented CH-π interaction between the Ile146-Cγ1H2 and the ligand aromatic system, as favorable CH-π stacking orientations are correlated with high upfield shifts. Taking a closer look at the crystal structure of Brd4-BD1 in complex with both ligands (Fig. 4b), the hydrogen of the methylene group (pro-S) is directly positioned over the centroid of the triazole ring, whereas the pro-R hydrogen atom is further away from the ligand. This information enables us therefore to also stereo-specifically assign the hydrogens of the isoleucine methylene groups. In order to correlate the observed CSPs of the Ile146 methylene resonance with the relative orientation of the ligand in the binding pocket of Brd4-BD1, we used the Pople equation (Pople 1956; Platzer et al. 2020). The geometric parameters were taken from the published X-ray structures (see SI Table 2), resulting in a calculated CSP for the pro-S hydrogen of Ile146 of 1.1 ppm and 1.5 ppm for ligand A and B. For the hydrogen atom in pro-R configuration, the calculated CSPs are 0.5 ppm and 0.6 ppm, respectively. This finding shows that the experimentally derived CSPs match well with the calculated, again supporting the presence of a beneficial CH-π stacking orientation.

Discussion and conclusion

The novel α-ketobutyric isotopologue reported here provides an economic tool to implement methylene labeling in isoleucine side chains. This precursor is directly applicable to E.coli-based expression systems, as shown by incorporation into the human protein Brd4-BD1, allowing observation of seven distinct Isoleucine Cγ1–H2 pairs in an otherwise crowded 1H–13C-HSQC spectra (Fig. 1). The incorporation of the precursor into Brd4-BD1 by expression in D2O minimal media results in simplified protein spectra against a deuterated background with favourable 13C relaxation properties and enhanced signal-to-noise ratio. Our experiments further exemplify the benefit of the deuterated Isoleucine Cγ1 labeled Brd4-BD1 by studying its carbon transverse relaxation properties by measuring 13C single and 13C/1H heteronuclear multiple quantum coherences. Comparison of the extracted effective 13C relaxation times shows an increase by a factor of ~ 3.5 going from 13C SQC to heteronuclear triple quantum coherences (TQC). The larger 13C relaxation times in the TQC experiment might allow for interesting applications such as paramagnetic relaxation enhancement (PRE), residual dipolar couplings (RDC) and conformational exchange via CPMG-type measurements. In practice, this has to be balanced against other experimental factors such as potentially higher overall intrinsic signal intensity in the HSQC spectra (due to signal improvement by Rance/Kay type gradient schemes) but holds the promise of application to large molecular weight systems. Most importantly, however, the large upfield chemical shift observed for the pro-S methylene proton in the BRD4-ligand complex convincingly demonstrates the usefulness of the Isoleucine methylene group as an excellent reporter for CH-π interactions.

It can thus be anticipated that particularly protein-ligand binding studies will considerably benefit from the availability of the new Isoleucine precursor. The facile and efficient introduction of this novel precursor to realize new Isoleucine isotopologues in a target protein represents a valuable and general tool to fine-tune NMR studies and decipher protein dynamics, allosteric mechanisms, and binding interactions.