Biological context

Macro domains are highly conserved across viruses, bacteria, archaea and eukaryotes. Their name derives from the mammalian core histone macro-H2A (Pehrson and Fried 1992; Chadwick and Willard 2001; Allen et al. 2003; Ladurner 2003) and they appear to act as binding modules for NAD+ metabolites, including ADP-ribose and poly(ADP-ribose) (PAR), and for negatively charged molecules such as RNA (Malet et al. 2009). Furthermore, ADP-ribose 1″-phosphate phosphatase activity (Egloff et al. 2006) and involvement in apoptosis (Ahel et al. 2009; Chen et al. 2009) and chromatin repair (Timinszky et al. 2009) are among the functions attributed to macro domains.

Here we report the sequence-specific assignment and secondary structure of the Mayaro virus (MAYV) macro domain. MAYV belongs to the family Togaviridae, genus Alphavirus, and possesses a single-stranded, positive-sense RNA genome (ssRNA(+)) (Powers et al. 2006; Fauquet et al. 2005). Haemagogus mosquitoes and Aedes aegypti (Long et al. 2011; Receveur et al. 2010) act as vectors in the transmission of MAYV to humans, in whom the virus causes a Dengue-like illness (Forshey et al. 2010). Infections are favored in environments with high humidity, such as the tropical forests of South America. There are no vaccines or drugs to prevent or treat this type of viral infection.

The genomic and sub-genomic mRNAs serve as templates for the translation of the viral non-structural and structural proteins, respectively. The non-structural polyprotein translated from ORF1 is auto-cleaved into four non-structural proteins (nsPs), which constitute the viral replication machinery necessary for both transcription and replication. The macro domain is located in the N-terminal region of nsP3 (Lavergne et al. 2006; Mourão et al. 2012; Muñoz and Navarro 2012).

MAYV macro domain consists of 159 amino acids and exhibits 66 % sequence identity to Chikungunya virus macro domain, whose crystal structure has already been determined (PDB id: 3GPG). Although a preliminary attempt has been reported (Papageorgiou et al. 2010), there is no available high-resolution structural information for MAYV macro domain. Therefore, the present study provides significant insight into the conformational properties of this domain in solution and contributes to understanding its functions.

Methods and experiments

Cloning

The coding sequence of the macro domain (residues 1–159 of nsP3) was amplified from cDNA of MAYV strain TRVL4675 using primers designed by ExEnSo for optimal expression yields (Care et al. 2008), in order to produce the protein with a C-terminal hexa-histidine tag. The MAYV macro domain coding sequence was cloned by Gateway recombination into the pDEST14 expression vector (Life Technologies). Rosetta 2 (DE3) (pLysS) E. coli cells (Novagen) were transformed with the plasmid carrying the macro domain coding sequence. Colonies were selected on LB plates containing ampicillin and chloramphenicol.

Protein expression and purification

An LB preculture was grown overnight at 37 °C and 180 rpm with 1 mg/ml ampicillin. A culture of 0.5 l M9 medium (6.8 g/l Na2HPO4, 3 g/l KH2PO4, 0.5 g/l NaCl) containing 0.5 g (NH4)2SO4, 2 g d-glucose, 0.5 ml BioExpress™ growth media (CIL), 1 mg/l biotin, 1 mg/l thiamin, 0.5 ml 1 M MgSO4, 0.15 ml 1 M CaCl2, 1 ml solution Q (40 mM HCl, 50 mg/l FeCl2·4H2O, 184 mg/l CaCl2·2H2O, 64 mg/l H3BO3, 18 mg/l CoCl2·6H2O, 4 mg/l CuCl2·2H2O, 340 mg/l ZnCl2, 605 mg/l Na2MoO4·2H2O, 40 mg/l MnCl2·4H2O), and 1 mg/l ampicillin was inoculated with the preculture. When the OD reached 0.6–0.8, 0.5 ml 1 M IPTG was added. Five hours after induction the cells were harvested by centrifugation and the cell pellet was stored at −20 °C.
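The final concentrations implied by the stock additions above can be checked with a short calculation. The following Python sketch is only an illustration: the function name `final_conc_mM` is hypothetical, and it assumes that the added stock volumes are negligible compared with the 0.5 l culture volume.

```python
def final_conc_mM(stock_M, added_ml, culture_l=0.5):
    """Final concentration (mM) after adding a small stock volume to the culture.

    Assumption: the added volume is negligible relative to the culture volume,
    so the total volume is taken as culture_l.
    """
    stock_mM = stock_M * 1000.0          # mol/l -> mmol/l
    culture_ml = culture_l * 1000.0      # l -> ml
    return stock_mM * added_ml / culture_ml


# (stock concentration in M, added volume in ml), as stated in the protocol
additions = {
    "IPTG": (1.0, 0.5),
    "magnesium sulfate": (1.0, 0.5),
    "CaCl2": (1.0, 0.15),
}

final = {name: final_conc_mM(*spec) for name, spec in additions.items()}

for name, conc in final.items():
    print(f"{name}: {conc:.2f} mM")
```

Under these assumptions the 1 M stocks correspond to roughly 1 mM IPTG, 1 mM magnesium sulfate and 0.3 mM CaCl2 in the culture.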

After thawing and resuspending the cell pellet in 7.5 ml lysis buffer containing 0.1 M NaCl, 50 mM Tris–HCl, 1 mM dithiothreitol (DTT), 10 % glycerol, pH 7.9, the suspension was frozen in liquid N2 followed by immersion in a water bath at 42 °C. This procedure was repeated 3 times. The cells were then sonicated (PMisonix®, Sonicator 4000), protease inhibitor cocktail (Sigma Aldrich®) was added and the suspension was centrifuged at 4 °C and 21,000 g (Beckman 60 Ti rotor) for 40 min. The soluble fraction containing the His-tagged MAYV macro domain was loaded onto a HisTrap™ HP affinity column (GE Healthcare) that had been previously equilibrated with 0.1 M ZnSO4 and binding buffer (10 mM imidazole, 20 mM Na2HPO4, 0.5 M NaCl, pH 8). The column was washed with a step gradient of imidazole in binding buffer (10, 20, 40, 100, 200, 400 mM) and finally with 0.1 M EDTA. MAYV macro domain eluted as a pure protein at 200 mM imidazole, as checked on a 17 % SDS-PAGE gel. The protein was concentrated with an Amicon® Ultra 15 ml Centrifugal Filter membrane (nominal molecular weight cutoff 10 kDa). Ultrafiltration was also used to exchange the elution buffer for the NMR buffer (10 mM Hepes, 20 mM NaCl, pH 7) and to concentrate the NMR sample to a final volume of 500 µl.

Uniform 15N, 15N/13C and partial 2H labeling using the prototrophic E. coli strain Rosetta 2

Isotope enrichment for NMR spectroscopy was achieved by expression of the macro domain in 0.5 l M9 medium supplemented with 0.5 ml (U-15N and/or U-13C) 10x concentrated BioExpress™ growth media (CIL), and with 0.5 g 15NH4Cl and/or 2 g uniformly 13C-labelled glucose as sole nitrogen and carbon sources, respectively. For the deuterated sample, 0.6 l M9 medium (80 % D2O, 20 % H2O), 2.4 g deuterated 13C-glucose, 0.6 g 15NH4Cl and 0.6 ml (U-13C, U-15N, U-D) 10x concentrated BioExpress growth media (CIL) were used.
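The 0.5 l and 0.6 l recipes can be compared on a per-litre basis. The Python sketch below is a minimal illustration using only the amounts quoted above; the dictionary keys and the helper `per_litre` are hypothetical names introduced for this example.

```python
import math

# Amounts as quoted in the text for the two labelling schemes
RECIPES = {
    "protonated (0.5 l)": {
        "volume_l": 0.5, "glucose_g": 2.0, "nh4cl_g": 0.5, "bioexpress_ml": 0.5,
    },
    "deuterated (0.6 l)": {
        "volume_l": 0.6, "glucose_g": 2.4, "nh4cl_g": 0.6, "bioexpress_ml": 0.6,
    },
}


def per_litre(recipe):
    """Normalise every component of a recipe to one litre of medium."""
    volume = recipe["volume_l"]
    return {k: amount / volume for k, amount in recipe.items() if k != "volume_l"}


normalised = {name: per_litre(recipe) for name, recipe in RECIPES.items()}

# Compare with a tolerance because of floating-point rounding
first, second = normalised.values()
same_per_litre = all(math.isclose(first[k], second[k]) for k in first)

for name, values in normalised.items():
    print(name, {k: round(v, 3) for k, v in values.items()})
print("identical per-litre composition:", same_per_litre)
```

Both recipes correspond to 4 g/l glucose, 1 g/l 15NH4Cl and 1 ml/l BioExpress supplement, so the deuterated culture is a volume-scaled version of the protonated one with respect to these components.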

Multiple selective 15N-labeling with 15N-Leu/15N-Val/15N-Ala using the auxotrophic E. coli strain DL39

A 0.5 l culture of M9 medium was prepared containing 0.5 g NH4Cl, 2.5 g d-glucose, 0.5 ml unlabelled BioExpress™ growth media (CIL), 75 mg Phe, 45 mg Tyr, 200 mg Asp, 100 mg Ile, 100 mg Asn, 250 mg Gly, 50 mg Met, 105 mg Lys, 100 mg 15N-Leu, 200 mg 15N-Ala and 100 mg 15N-Val.

Arg selective unlabeling of MAYV macro domain

A 0.5 l culture of M9 medium was prepared containing 0.5 g 15NH4Cl, 2 g d-glucose, 0.5 ml (U-15N) 10x concentrated BioExpress growth media (CIL) and 75 mg 14N-Arg.

Simultaneous selective Cys and Asn unlabeling of MAYV macro domain

A 0.5 l culture of M9 medium was prepared containing 0.5 g 15NH4Cl, 2 g d-glucose, 0.5 ml (U-15N) 10x concentrated BioExpress growth media (CIL), 100 mg 14N-Asn and 100 mg 14N-Cys.

Data acquisition, processing and assignment

Protein samples for NMR experiments were prepared in a mixed solvent of 90 % H2O (10 mM Hepes, 20 mM NaCl, pH 7) and 10 % 2H2O, containing 2 mM DTT, 2 mM NaN3, and bacterial protease inhibitor cocktail (Sigma Aldrich®). The protein concentration in the NMR samples was 0.37 mM. NMR experiments were recorded at 298 K on a Bruker Avance 600 MHz NMR spectrometer, equipped with a cryogenically cooled pulsed-field gradient triple-resonance probe (TXI), and on a Bruker Avance III High-Definition four-channel 700 MHz NMR spectrometer equipped with a cryogenically cooled 5 mm 1H/13C/15N/D Z-gradient probe (TCI).

Sequence-specific assignments were obtained from the following experiments: 2D [1H–15N]-HSQC and TROSY, 3D TROSY HNCA, 3D TROSY HN(CO)CA, 3D TROSY CBCA(CO)NH, 3D TROSY CBCANH, 3D TROSY HNCO, 3D TROSY HN(CA)CO, 3D TROSY HBHA(CO)NH, 3D HNHA, 3D 15N-edited NOESY and modified versions of the 3D CBCA(CO)NH experiment for the effective correlation of [NH](i) and [CBCA](i-1) when the (i-1) residue lacks an aliphatic Cγ atom (Ala, Asn, Asp, Cys, Gly, Ser and aromatic residues) or a γCO (Ala, Cys, Ser and aromatic residues); both experiments are included in the Bruker pulse sequence library. Internal 2,2-dimethyl-2-silapentane-5-sulfonate (DSS) was used as a chemical shift reference for 1H. All NMR data were processed with the TOPSPIN 3.3 software and analyzed with XEASY (Bartels et al. 1995) and CARA (Keller 2004).

Results

Assignments and data deposition

MAYV macro domain was expressed at a yield of 25 mg/l in M9 minimal growth medium and was mainly detected in the soluble fraction of the bacterial lysate. The NMR spectra showed large chemical shift dispersion and narrow line widths, indicative of a monomeric folded domain. The backbone 1H–15N resonances of more than 89 % of the residues could be assigned (Fig. 1). Overall, 93.51, 95.31 and 94.2 % of the anticipated 15N, 13C and 1H backbone and side-chain chemical shifts, respectively, were identified. Data have been deposited in the BioMagResBank (http://www.bmrb.wisc.edu) under accession no. 19927.

Fig. 1

a Right: 700 MHz 2D [1H–15N] TROSY spectrum at 298 K of uniformly [15N]-labelled MAYV macro domain. Residue numbering is according to the sequence of MAYV nsP3. Left: magnification of the central region of the TROSY spectrum as indicated. b 2D [1H–15N] TROSY spectrum of selectively labelled 15N-Ala/Val/Leu macro domain. c Secondary structure of MAYV macro domain as predicted by TALOS+

Identification of the cis or trans conformation of the assigned prolines (5 out of 8) was based on the analysis of the 3D HBHA(CO)NH, 3D CBCA(CO)NH, 3D CBCANH and 3D (H)CCH-TOCSY spectra. Comparison of the proline 13Cβ and 13Cγ chemical shifts (in ppm) with reference values in the literature (trans-Pro: 13Cβ 31.75, 13Cγ 27.26; cis-Pro: 13Cβ 34.16, 13Cγ 24.52; Schubert et al. 2002) clearly showed that all assigned prolines of the protein adopt the trans conformation: Pro2 (13Cβ 32.775, 13Cγ 26.716), Pro42 (13Cβ 31.049, 13Cγ 27.132), Pro51 (13Cβ 32.733, 13Cγ 27.373), Pro71 (13Cβ 31.989, 13Cγ 26.000), and Pro107 (13Cβ 31.984, 13Cγ 25.423).
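The comparison with the reference shifts can be illustrated with a simple nearest-reference rule. The Python sketch below is not the procedure used in this work: it assumes, purely for illustration, that each proline is assigned to the isomer whose reference (13Cβ, 13Cγ) pair lies closest in the two-dimensional shift plane, and the function name `classify_proline` is hypothetical.

```python
import math

# Reference 13C shifts in ppm as quoted in the text (Schubert et al. 2002)
REFERENCE = {
    "trans": (31.75, 27.26),   # (13Cb, 13Cg)
    "cis": (34.16, 24.52),
}

# Assigned prolines of MAYV macro domain: (13Cb, 13Cg) in ppm, from the text
PROLINES = {
    "Pro2": (32.775, 26.716),
    "Pro42": (31.049, 27.132),
    "Pro51": (32.733, 27.373),
    "Pro71": (31.989, 26.000),
    "Pro107": (31.984, 25.423),
}


def classify_proline(cb, cg):
    """Return the isomer whose reference (Cb, Cg) pair is nearest (Euclidean).

    Assumption: an unweighted distance in ppm is a reasonable illustrative
    criterion; it is not claimed to be the authors' method.
    """
    return min(REFERENCE, key=lambda iso: math.dist((cb, cg), REFERENCE[iso]))


calls = {name: classify_proline(*shifts) for name, shifts in PROLINES.items()}

for name, isomer in calls.items():
    print(f"{name}: {isomer}")
```

With the shifts listed above, this rule places all five assigned prolines nearer to the trans reference, in line with the conclusion stated in the text. Pro107 has the smallest margin because of its comparatively low 13Cγ shift.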

To confirm and extend the assignment based on the standard 3D spectra, we selectively labeled certain amino acids with 15N, or unlabeled them by supplying their 14N forms in a 15N background. Using the auxotrophic strain E. coli DL39, we successfully incorporated 15N-Leu, 15N-Val and 15N-Ala in parallel into the macro construct. The yield was comparable to that obtained with the prototrophic strain E. coli BL21(DE3) Rosetta 2. In the 1H-15N TROSY spectrum of this sample, 46 out of 50 expected cross peaks were observed (Fig. 1). The triple selective labeling scheme led to 4 new assignments of backbone amides (2 Leu and 2 Val residues), thus completing the assignment of Leu and Val residues. Furthermore, unlabeling of Arg residues was performed. Although the polypeptide contains 12 arginines, only 9 cross peaks disappeared from the 1H-15N TROSY spectrum, leading to 2 new assignments of arginine amide groups (a total of 9 out of 12 arginine residues assigned). Lastly, selective unlabeling of both cysteine and asparagine residues was attempted. Although MAYV macro domain contains 11 cysteine and asparagine residues, 16 cross peaks disappeared from the 1H-15N TROSY spectrum of this sample. Some of these peaks corresponded to already assigned lysine residues. However, two additional asparagine residues could be safely assigned, leaving only one of the eight asparagines in MAYV macro domain unassigned. In contrast, only one of the three cysteines was assigned. No spin pattern has been identified for the following residues: Ala1, Asp31, Cys34, Ile113, Asp119, Arg120, Cys143, Arg144.

Overall, ~87 % of the 1HN, 15N, 13Cα, 13Cβ and 13C′ resonances of MAYV macro domain were assigned. These chemical shifts were analyzed with TALOS+ (Shen et al. 2009) to identify the elements of secondary structure. As shown in Fig. 1c, the polypeptide contains 4 α-helices and 6 β-strands in a ββαββαβαβα topology. This distribution of secondary structure elements is very similar to that in the crystal structure of Chikungunya virus macro domain, confirming that the 66 % sequence identity between the two macro domains corresponds to a high similarity of the 3D structures.
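The element counts can be read directly from the topology string. The short Python sketch below is only a consistency check on the numbers quoted above; it does not reproduce the TALOS+ analysis, and the variable names are introduced for this example.

```python
from collections import Counter

# Topology as stated in the text, one symbol per secondary structure element
TOPOLOGY = "ββαββαβαβα"

counts = Counter(TOPOLOGY)
n_helices = counts["α"]
n_strands = counts["β"]

print(f"alpha-helices: {n_helices}, beta-strands: {n_strands}")
```

The string contains four α and six β symbols, matching the 4 α-helices and 6 β-strands reported for the domain.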

In summary, this work presents an efficient method for the recombinant expression in E. coli and purification of MAYV macro domain. Solution NMR analysis showed that the monomeric protein construct is well folded, and an almost complete sequence-specific assignment was obtained from 3D triple-resonance spectra combined with amino acid-selective labeling and unlabeling. The secondary structure of MAYV macro domain is similar to that of other macro domains. Since macro domains have been suggested to be ADP-ribose-binding modules, the solution structure and dynamical properties of MAYV macro domain, which can now be determined, should provide essential insights into the mechanism of this kind of interaction.