Biological context

During the past 17 years, three coronavirus diseases, severe acute respiratory syndrome (SARS) in November 2002, Middle East respiratory syndrome (MERS) in April 2012, and more recently coronavirus disease 2019 (COVID-19) in December 2019, have emerged as global public health emergencies (Jiang et al. 2020; Singhal 2020).

MERS-CoV was first identified in Saudi Arabia, when it was isolated from the lung of an adult patient who had been diagnosed with severe pneumonia and died of multiorgan failure (Nguyen et al. 2019). MERS-CoV, like SARS-CoV and SARS-CoV-2, is a member of the Coronaviridae family of the order Nidovirales. Its genome is a large single-stranded positive-sense RNA of about 30 kb, which encodes four structural proteins: spike (S), envelope (E), membrane (M), and nucleocapsid (N). The N protein is the most abundant of these in infected cells (Carlson et al. 2020). It is composed of two domains: the C-terminal dimerization domain (N-CTD) and the N-terminal RNA-binding domain (N-NTD). The N-CTD and N-NTD are linked by an intrinsically disordered region, which contains the Arg-Ser-rich region (SR region) and the phosphorylation site (Taskin Tok et al. 2017; Nguyen et al. 2019). This multifunctional phosphoprotein is involved in capsid formation and in the modulation and regulation of the viral life cycle. The N protein is also directly involved in the discontinuous transcription process, acting as an RNA chaperone (Huang et al. 2004).

Coronaviruses are among the largest RNA viruses, and they undergo a unique discontinuous transcription of the viral RNA into subgenomic mRNAs (sgmRNAs). The leader transcriptional regulatory sequence (TRS-L) is found at the 5′ end of the genome, and a body transcriptional regulatory sequence (TRS-B) at the 5′ end of each subgenomic RNA. When the TRS-B is copied during transcription, the nascent negative-strand RNA is transferred to the TRS-L through a template switch, which completes the transcription process. The N-NTD of the N protein has been reported to interact specifically with the TRS and to catalyse the template switch through its melting activity on the double-stranded TRS (dsTRS) (Grossoehme et al. 2009).

Our group studies the mechanism of specific recognition and melting activity of the N-terminal domain of the N protein of human betacoronaviruses (Caruso et al. 2020; de Luna Marques et al. 2021). This work is part of an international effort to combat the COVID-19 pandemic (https://covid19-nmr.de/; Altincekic et al. 2021). Here we report the 1H, 15N, and 13C backbone and side-chain resonance assignments of the N-NTD domain of MERS-CoV without the SR region (N-NTD) and with the SR region (N-NTD-SR). These assignments are fundamental for obtaining structural information on the N-NTD and its Ser-Arg-rich region, which in turn will contribute to a better understanding of coronavirus diseases.

Methods and experiments

Protein expression and purification

Two distinct constructs of the MERS-CoV N protein were synthesized. The first contained only the N-terminal domain, comprising residues 35 to 169 (N-NTD domain); the second included, in addition to the N-NTD domain, the Arg-Ser-rich sequence from residue 170 to 202 (N-NTD-SR domain). Both constructs were subcloned between the NdeI and BamHI restriction sites of plasmid pET28a by GenScript.

Escherichia coli BL21 (DE3) was transformed with the pET28a constructs. One colony was picked and transferred to Luria-Bertani (LB) medium. The bacteria were then grown in minimal medium (M9) containing 15NH4Cl (1 g/L) and 13C-glucose (3 g/L) for isotopic labeling and kanamycin (30 µg/mL) for bacterial selection. Protein expression was induced with 0.2 mM IPTG (isopropyl β-D-thiogalactoside) overnight at 18 °C. Cells were harvested by centrifugation, and the pellet was disrupted by ultrasonication in lysis buffer (50 mM Tris-HCl pH 8.0 containing 500 mM NaCl, 20 mM imidazole, 5% glycerol, 0.01 mg/mL DNAse, and 5 mL of SigmaFast protease inhibitor cocktail (tablet, diluted to 1×)). The lysate was centrifuged, and the supernatant was applied to a HisTrap FF column (GE Healthcare Life Sciences). The N-terminal domains of the N protein were purified by nickel affinity chromatography, using washing buffer A (50 mM Tris-HCl, 500 mM NaCl, 20 mM imidazole, pH 8.0) and buffer B (50 mM Tris-HCl, 500 mM NaCl, 500 mM imidazole, pH 8.0). For His-tag removal, the protein was cleaved overnight with TEV protease (TEV:protein 1:30 molar ratio), and the mixture was dialyzed against dialysis buffer (50 mM Tris-HCl pH 7.5, 0.5 mM EDTA, and 1 mM DTT). After dialysis, a second round of nickel affinity chromatography was performed to improve the purity of the N-terminal domain of the MERS-CoV N protein and to remove the tag cleaved by TEV. The sample containing the protein was concentrated by centrifugation at 5000 × g for 10 min in an Amicon Ultra-15 concentrator (10,000 MWCO) in the presence of 0.5 mM PMSF. The buffer of the fractions containing the purified protein was exchanged by gel filtration chromatography (Superdex 75 column) into 20 mM sodium phosphate, 50 mM NaCl, 500 µM PMSF, 3 mM sodium azide, and 3 mM EDTA, pH 5.5. After gel filtration, the sample was concentrated in an Amicon concentrator, and 0.5 mM PMSF, 3 mM EDTA, and 3 mM azide were added. The samples for NMR were in 20 mM sodium phosphate, 50 mM NaCl, 500 µM PMSF, and 3 mM sodium azide.
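The 1:30 TEV:protein molar ratio used for His-tag removal translates into a protease mass only once the molecular weights of both proteins are fixed. The following is a minimal sketch of that arithmetic. The function name and the example molecular weights are hypothetical placeholders and are not values taken from this work.

```python
def protease_mass_mg(protein_mass_mg, protein_mw_da, protease_mw_da,
                     protein_per_protease=30.0):
    """Mass of protease (mg) needed for a 1:N protease:protein molar ratio.

    All arguments are supplied by the user; nothing here is specific to
    TEV protease or to the constructs described in the text.
    """
    inputs = (protein_mass_mg, protein_mw_da, protease_mw_da, protein_per_protease)
    if min(inputs) <= 0:
        raise ValueError("all inputs must be positive")
    # mg divided by g/mol gives mmol; multiply by 1000 to get micromoles.
    protein_umol = protein_mass_mg / protein_mw_da * 1000.0
    protease_umol = protein_umol / protein_per_protease
    # micromoles times g/mol gives micrograms; divide by 1000 to get mg.
    return protease_umol * protease_mw_da / 1000.0


# Hypothetical example: 30 mg of a 15 kDa target and a 30 kDa protease.
example_mg = protease_mass_mg(30.0, 15_000.0, 30_000.0)
```

Because the ratio is molar, the required protease mass scales with the ratio of the two molecular weights, so the same 1:30 ratio corresponds to different mass ratios for N-NTD and N-NTD-SR.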

NMR experiments

For all NMR experiments, we added 5% (v/v) D2O to the sample. The triple-resonance NMR spectra were acquired at 298 K on a Bruker 800 MHz AVANCE III spectrometer equipped with a pulsed-field Z-axis gradient triple-resonance probe. We assigned the backbone resonances of the 15N–1H HSQC spectrum (Fig. 1) using the triple-resonance experiments HNCO, HNCA, CBCA(CO)NH, HNCACB, and HBHA(CO)NH (Whitehead et al. 1997). We assigned the side-chain resonances using 13C-HSQC, (H)CCH-TOCSY, HCCH-TOCSY (Kay et al. 1993), and 15N- and 13C-NOESY-HSQC (for both aliphatic and aromatic regions) experiments. The NOESY spectra were acquired at 298 K on a Bruker 900 MHz AVANCE IIIHD spectrometer equipped with a pulsed-field Z-axis gradient triple-resonance probe. For all experiments, we used the chemical shift of the water proton as an internal reference for 1H, while 13C and 15N chemical shifts were referenced indirectly to water (Wishart et al. 1995). For the triple-resonance measurements, we used non-uniform sampling (NUS) of the NMR data based on a 13% Poisson gap sampling schedule (Hyberts et al. 2012). The iterative soft threshold method was used for the spectral reconstruction (Hyberts et al. 2012). We processed the data using the NMRPipe software (Delaglio et al. 1995) and analysed it with CCPNMR Analysis (Vranken et al. 2005), both available on NMRbox (Maciejewski et al. 2017).
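To illustrate what a Poisson-gap NUS schedule looks like, the sketch below draws sinusoidally weighted, Poisson-distributed gaps between sampled increments, in the spirit of Hyberts et al. (2012). It is not the schedule generator used for this work. The grid size, the seed, the 2% adjustment step, and all function names are assumptions chosen for illustration.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson-distributed integer (Knuth's multiplication method)."""
    limit = math.exp(-lam)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def poisson_gap_schedule(grid_size, n_points, seed=0, max_tries=100_000):
    """Return n_points sorted increment indices on a grid of grid_size points."""
    if not 0 < n_points < grid_size:
        raise ValueError("need 0 < n_points < grid_size")
    rng = random.Random(seed)
    # Initial scale for the mean gap; tuned below until the count matches.
    weight = 2.0 * (grid_size / n_points - 1.0)
    for _ in range(max_tries):
        picked = []
        i = 0
        while i < grid_size:
            picked.append(i)
            i += 1
            # Sine weighting: small gaps early in the FID, larger gaps late.
            lam = weight * math.sin((i + 0.5) / (grid_size + 1) * math.pi / 2)
            i += poisson(lam, rng)
        if len(picked) == n_points:
            return picked
        # Too many points: widen the gaps. Too few: narrow them.
        weight *= 1.02 if len(picked) > n_points else 1 / 1.02
    raise RuntimeError("no schedule with the requested number of points found")


# Hypothetical example: 17 of 128 increments, roughly a 13% sampling density.
schedule = poisson_gap_schedule(grid_size=128, n_points=17)
```

The first increment is always sampled, and the gaps grow toward the end of the indirect dimension, where the signal has decayed and dense sampling contributes less.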

Fig. 1

Two-dimensional 1H, 15N-HSQC spectra of uniformly 15N/13C-labelled MERS-CoV N-NTD domains at 298 K in 20 mM NaPi pH 5.5, 50 mM NaCl and 5% (v/v) D2O. The labels show the assigned backbone amino acid residues. The MERS-CoV N-NTD spectrum is shown in red and that of the N-NTD-SR domain in green.

Assignment and data deposition

The 1H, 15N, and 13C chemical shift assignments have been deposited in the Biological Magnetic Resonance Bank (BMRB) under IDs 50772 and 50771 for MERS-CoV N-NTD and N-NTD-SR, respectively. Figure 1 shows the assigned 2D 1H–15N HSQC spectra of the MERS-CoV N-NTD and N-NTD-SR domains.

For the MERS-CoV N-NTD domain, we assigned 93.8% of the backbone nuclei (13Cα, 13CO, Hα, amide HN, and 15N), including 96.3% of 13Cα and 92.8% of Hα. For the 13CHn aliphatic side-chain moieties of the protein, 67.1% of 13C and 70% of 1H were assigned. For the 13CHn aromatic side-chain moieties of the protein, 40% of 13C and 80.9% of 1H were assigned. We assigned 98.3% of Cβ and 94.5% of Hβ. We assigned 123 amide 1HN (95.1%), 136 15N (86%) and 136 13CO (86%).

For the MERS-CoV N-NTD-SR domain, we assigned 87.5% of the backbone nuclei (13Cα, 13CO, Hα, amide HN, and 15N), including 95.3% of 13Cα and 86.9% of Hα. For the 13CHn aliphatic side-chain moieties of the protein, 57.7% of 13C and 60.8% of 1H were assigned. For the 13CHn aromatic side-chain moieties of the protein, 27.5% of 13C and 57.1% of 1H were assigned. We assigned 96% of Cβ and 84.9% of Hβ. We assigned 156 amide 1HN (89.7%), 171 15N (81.9%) and 171 13CO (80.1%).
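Completeness figures such as those above are ratios of assigned to expected resonances per nucleus type. The sketch below shows one way such a table could be tallied. The function names are hypothetical, the expected counts must be supplied by the user (they depend on conventions such as excluding proline amides), and the toy numbers are not taken from the deposited data.

```python
def completeness(assigned, expected):
    """Percentage of expected resonances that were assigned."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    if not 0 <= assigned <= expected:
        raise ValueError("assigned must lie between 0 and expected")
    return 100.0 * assigned / expected


def completeness_table(counts):
    """Map {nucleus: (assigned, expected)} to {nucleus: percent, one decimal}."""
    return {
        nucleus: round(completeness(assigned, expected), 1)
        for nucleus, (assigned, expected) in counts.items()
    }


# Toy counts for illustration only (not values from this work).
toy_counts = {"HN": (9, 10), "CA": (3, 4)}
toy_table = completeness_table(toy_counts)
```

Keeping the assigned and expected counts explicit, rather than only the percentages, makes it easier to compare constructs of different length such as N-NTD and N-NTD-SR.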

From the resonance assignments, we compared the chemical-shift-derived order parameter (S2), obtained from the random coil index (Berjanskii and Wishart 2005), and the secondary structure predicted with TalosN (Shen and Bax 2013) for the two constructs. Subtle differences in backbone flexibility are apparent when N-NTD and N-NTD-SR are compared. The Ser-Arg-rich region is flexible but contains a more ordered region around residue 183 (Fig. 2a). We observed secondary structure elements compatible with the crystal structure (Papageorgiou et al. 2016) and a one-residue shift of β-strand 1 between N-NTD and N-NTD-SR (Fig. 2b, c). Further studies are necessary to understand these structural and dynamical features.
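A shift in strand boundaries between two constructs can be located by turning per-residue β-strand probabilities into contiguous segments and comparing them. The sketch below does this with a simple threshold. The 0.5 cutoff and the minimum length of three residues are arbitrary assumptions, not the criteria used by TalosN, and the toy probabilities are invented for illustration rather than taken from the deposited data.

```python
def extended_segments(prob_by_residue, threshold=0.5, min_length=3):
    """Return (first, last) residue numbers of runs with P(extended) >= threshold."""
    segments = []
    start = prev = None
    for res in sorted(prob_by_residue):
        in_strand = prob_by_residue[res] >= threshold
        contiguous = prev is not None and res == prev + 1
        if in_strand and start is not None and contiguous:
            prev = res  # extend the open run
            continue
        # Close any open run (ended by a low probability or a numbering gap).
        if start is not None:
            if prev - start + 1 >= min_length:
                segments.append((start, prev))
            start = prev = None
        if in_strand:
            start = prev = res  # open a new run
    if start is not None and prev - start + 1 >= min_length:
        segments.append((start, prev))
    return segments


# Toy per-residue probabilities for two hypothetical constructs.
toy_a = {40: 0.2, 41: 0.8, 42: 0.9, 43: 0.7, 44: 0.1, 45: 0.1}
toy_b = {40: 0.1, 41: 0.3, 42: 0.8, 43: 0.9, 44: 0.6, 45: 0.2}
segments_a = extended_segments(toy_a)
segments_b = extended_segments(toy_b)
```

In this toy case the strand spans residues 41 to 43 in one construct and 42 to 44 in the other, which is the kind of one-residue register shift described above. Whether such a shift is meaningful depends on the prediction confidence at the boundary residues.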

Fig. 2

Protein dynamics and secondary structure prediction. a Random-coil index order parameter as a function of residue number for MERS-CoV N-NTD (red) and N-NTD-SR (black). b TalosN secondary structure prediction of MERS-CoV N-NTD-SR as a function of residue number. c TalosN secondary structure prediction of MERS-CoV N-NTD as a function of residue number. In b and c, the predicted probabilities for helix are shown in blue and those for extended structure (β-strand) in green. At the top, the green rectangles represent the β-strands and the blue rectangles the helices of the crystal structure [PDB 6KL2 (Lin et al. 2020)].