Introduction

SARS-CoV-2 belongs to a family of single-stranded, positive-sense RNA viruses (Anand et al. 2005; Chang et al. 2014). This virus family has a large genome that encodes four structural proteins, namely small envelope (E), matrix (M), nucleocapsid phosphoprotein (N), and spike (S), and 16 nonstructural proteins (nsp1–16) that together ensure replication of the virus in the host cell. The nonstructural proteins, mostly associated with RNA replication, carry out the enzymatic functions required for viral replication (Anand et al. 2005; Zhang et al. 2005; Ahmed et al. 2020). Among them, nsp7, nsp8, and nsp12 together form the RNA-dependent RNA polymerase complex; nsp10, nsp13, nsp14, and nsp16 form the RNA capping machinery; and nsp3 (PLpro) and nsp5 (3CLpro) are proteases that impede innate immunity and are also essential for cleaving the viral polyproteins (Snijder et al. 2016).

Among these, the nucleocapsid phosphoprotein (N) is an essential component that links the viral genome to the viral membrane. The N-terminal RNA-binding domain (N-NTD) captures the RNA genome, while the C-terminal domain (CTD) anchors the ribonucleoprotein complex to the viral membrane via its interaction with the M protein. The four structural proteins, together with the viral +RNA genome and the envelope, constitute the complete virion. Both the NTD and the CTD have RNA-binding affinity, while the CTD also binds the M protein, establishing the physical linkage between the envelope and the +RNA. The SARS N proteins also play regulatory roles in the viral life cycle through the host intracellular machinery (Chang et al. 2014). A more recent study describes the structure of the N protein as a right-hand-like fold composed of a β-sheet core with an extended central loop. The core region adopts a five-stranded, U-shaped, right-handed antiparallel β-sheet platform with the topology β4–β2–β3–β1–β5, flanked by two short α-helices. A prominent feature of the structure is a large extended loop between β2 and β3 that forms a long basic β-hairpin (β2′ and β3′) (Dinesh et al. 2020).

We docked the crystal structure of RNA (Sheng et al. 2014) with the nucleocapsid protein using HADDOCK (Dominguez et al. 2003). The RNA duplex was found to bind between the basic finger and the palm of the N-NTD, where positively charged arginine residues (R92, R107, R149) directly contact the RNA. The model predicts several hydrophobic interactions involving the side chains of residues I94 and L104, which contribute to RNA binding. Residues A50, T57, H59, R92, I94, S105, R107, R149, and Y172 in the surrounding region may also interact with the RNA. However, no experimental validation or long-timescale molecular dynamics (MD) simulation was performed to confirm the nature of these interactions.

The nucleocapsid N-terminal domain (N-NTD) interacts with the genomic RNA of SARS-CoV-2 and incorporates it into progeny particles (Tang et al. 2005; Cong et al. 2020; Kang et al. 2020). CoV RNA is synthesized at the replication–transcription complexes (RTCs). The N proteins are also present at RTC sites and support viral replication by interacting with nonstructural protein 3 (Nsp3), a component of the RTCs. These two aspects of the CoV life cycle, however, have not been linked. The N protein was found to bind exclusively to Nsp3 and to no other RTC component. This raises two questions: what kinds of interactions are involved, and how would these interactions be affected by mutations? To address these questions, we docked the N protein with Nsp3 using the HADDOCK server to identify the residues involved in formation of the Nsp3–N protein complex, with a view to future inhibitor design. This study may support rapid drug design to control the global epidemic of SARS-CoV-2.

Methods

Protein structure retrieval

The Protein Data Bank (PDB) is a database of biomolecular structures that provides crystal structures for proteomics and in silico studies. Only structures relevant to the current study were chosen. The crystal structure of the nucleocapsid N-terminal domain (N-NTD) was retrieved from the PDB (PDB ID: 6YI3). However, for a better understanding of the interactions, the full structure of the N protein (QHD43423) was downloaded from I-TASSER (Zheng et al. 2019), and that of Nsp3 (PDB ID: 6W6Y) was downloaded from the PDB (Berman 2008).
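The retrieval step can be scripted for reproducibility. The following is a minimal sketch, assuming the RCSB file-download URL pattern `https://files.rcsb.org/download/<ID>.pdb`; the function names `pdb_url` and `fetch_pdb` are illustrative and not part of the original workflow. The I-TASSER model of the full N protein is not a PDB entry and is not covered here.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Assumed RCSB download URL pattern for legacy PDB-format files.
RCSB_DOWNLOAD = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_url(pdb_id: str) -> str:
    """Return the download URL for a four-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return RCSB_DOWNLOAD.format(pdb_id=pdb_id)


def fetch_pdb(pdb_id: str, out_dir: str = ".") -> Path:
    """Download one entry to out_dir and return the local path (needs network)."""
    target = Path(out_dir) / f"{pdb_id.strip().upper()}.pdb"
    urlretrieve(pdb_url(pdb_id), str(target))
    return target


if __name__ == "__main__":
    # Entries named in this section: N-NTD (6YI3) and Nsp3 (6W6Y).
    for entry in ("6yi3", "6w6y"):
        print(pdb_url(entry))
```

Only `fetch_pdb` touches the network; `pdb_url` can be used on its own to record exactly which files were requested.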

Protein preparation

The structures were prepared using the Protein Preparation Wizard in the Molecular Operating Environment (MOE) (Vilar et al. 2008) and also in Chimera (Pettersen et al. 2004). Missing hydrogens were added, and partial charges were assigned. The protein structures were energy minimized using default values with the Amber10 force field in MOE. Selenomethionines were changed into methionines. The prepared structures were saved in PDB format for docking.
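The selenomethionine-to-methionine step can be illustrated outside MOE. The sketch below is not the MOE or Chimera procedure; it is a plain-text edit of PDB records, assuming standard fixed-column PDB formatting (record name in columns 1–6, atom name in 13–16, residue name in 18–20, element in 77–78). The function name `mse_to_met` is hypothetical.

```python
def mse_to_met(pdb_lines):
    """Rewrite selenomethionine (MSE) records as methionine (MET).

    pdb_lines: iterable of PDB lines without trailing newlines.
    The selenium atom (SE) is renamed to the methionine sulfur (SD);
    coordinates are left untouched, so re-minimization is still needed.
    """
    converted = []
    for line in pdb_lines:
        is_atom_record = line[:6] in ("HETATM", "ATOM  ")
        if is_atom_record and line[17:20] == "MSE":
            padded = line.ljust(80)
            atom_name = padded[12:16]
            element = padded[76:78]
            if atom_name.strip() == "SE":
                atom_name = " SD "   # methionine sulfur atom name
                element = " S"       # right-justified element symbol
            line = (
                "ATOM  "             # MSE is usually stored as HETATM
                + padded[6:12]
                + atom_name
                + padded[16:17]
                + "MET"
                + padded[20:76]
                + element
                + padded[78:]
            ).rstrip()
        converted.append(line)
    return converted
```

A typical use would read a file with `open(path).read().splitlines()`, pass the lines through `mse_to_met`, and write the result back before energy minimization.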

The N protein and Nsp3 docking

To gain insight into the interactions, the N protein and Nsp3 were docked using the HADDOCK server (de Vries et al. 2010; van Zundert et al. 2016). The server supports nucleic acids and small molecules, and it can incorporate experimental data. HADDOCK is widely used for protein–protein and protein–DNA/RNA complexes. The server allows conformational changes of both the side chains and the backbone of the biomolecules during complex formation. In addition, the server directly supports docking of PDB files, including NMR data that contain multiple structural models. Default parameters were used for docking, except that correlation was set to shape and electrostatics. Intermolecular bonds were analyzed using UCSF Chimera (Pettersen et al. 2004), MOE, and PyMOL (Lill and Danielson 2011).

The N-NTD and RNA docking

The crystal structure of N-NTD (PDB ID: 6M3M) and a 21-base-pair dsRNA (PDB ID: 2LK2) (Liu et al. 2008) were retrieved from the PDB and docked with the fully optimized 6YI3 structure using the HEX docking server (Macindoe et al. 2010). The HEX docking method considers the electrostatic potential and is based on a rigid-body docking algorithm (Macindoe et al. 2010). Default parameters were used for docking, except that correlation was set to shape and electrostatics. Intermolecular bonds were analyzed using UCSF Chimera (Pettersen et al. 2004) and PyMOL (Lill and Danielson 2011).

Results and discussion

Docking results

The top clusters based on HADDOCK score are shown in Table 1. Among the seven clusters, cluster 1, containing 39 structures, was selected. Within it, the top structure based on HADDOCK score, van der Waals energy, and electrostatic energy was selected for the current investigation. In most protein–protein docking studies, the combination of van der Waals and electrostatic energies performs well as a scoring function (Mandell et al. 2001). Van der Waals (dispersion) forces contribute to protein interactions with surfaces or other biomolecules.

Table 1 HADDOCK Nsp3-N protein docking results

HADDOCK clustered a total of 116 structures into 8 groups, representing 57% of the water-refined models. The statistics of the most reliable clusters are given in Table 1. The Z-score of a cluster (the more negative the better) specifies how many standard deviations its score lies from the average over all clusters. Van der Waals and electrostatic interactions are also important when assessing protein–protein docking: more negative van der Waals (vdW) and electrostatic energies indicate well-docked molecules. The strength of binding between the two proteins depends on the number of residues at the interface and on the corresponding interface area. Large interfaces indicate high binding energy (the sum of vdW, H-bond, and electrostatic contributions) (Roth et al. 1996; Nilofer et al. 2017). Docking results were examined with PyMOL, MOE, and Chimera (DeLano 2002; Pettersen et al. 2004; Vilar et al. 2008), and the closest interacting residues between the N protein and Nsp3 were labeled for better understanding.
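The Z-score described above can be made concrete with a short calculation. This is a minimal sketch, assuming the Z-score is taken over the mean score of each cluster and uses the population standard deviation; the exact convention used by the HADDOCK server is not stated in this section. The cluster names and scores are hypothetical placeholders and are not the values in Table 1.

```python
from statistics import mean, pstdev

# Hypothetical cluster scores (arbitrary units), for illustration only.
EXAMPLE_SCORES = {
    "cluster_1": -120.0,
    "cluster_2": -100.0,
    "cluster_3": -80.0,
}


def cluster_z_scores(scores):
    """Return {cluster: z} where z = (score - mean) / standard deviation."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("all clusters have the same score; Z-score undefined")
    return {name: (score - mu) / sigma for name, score in scores.items()}


if __name__ == "__main__":
    z = cluster_z_scores(EXAMPLE_SCORES)
    best = min(z, key=z.get)  # most negative Z-score ranks first
    for name, value in sorted(z.items(), key=lambda item: item[1]):
        print(f"{name}: {value:+.2f}")
    print("best:", best)
```

Because the Z-score is relative to the other clusters in the same run, it ranks clusters within one docking job and should not be compared across different jobs.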

Nsp3 and N protein interactions

In the current investigation, we found that residues of the N-CTD might be involved in interacting with Nsp3 (Fig. 1a), a member of the RTCs for SARS-CoV RNA processing. The interacting residues of Nsp3 (V207, N208, S209, F210, S211, G212, Y213, L214, K215, L216, T217, and D218) might be important targets for inhibitors that prevent viral RNA synthesis mediated by N protein–Nsp3 interactions (Fig. 1a–c). The surface representation of the Nsp3–N complex indicated that the contacts are stable enough to tether the genome at a very early stage of infection.

Fig. 1

Residue interactions between the N protein and Nsp3. (a) Surface representation of the N protein and Nsp3 (PDB ID: 6W6Y, residues 207–379). Yellow represents N protein residues involved in interactions. (b) The number of clusters predicted by the HADDOCK server and their van der Waals and electrostatic energies. (c) Ribbon representation of loop residues of N. The N-NTD is involved in viral RNA binding during replication

Protein–protein interactions are important in every life activity and underlie numerous elementary functions inside the cell (Koegl and Uetz 2007; Wang et al. 2019). The CoV N protein plays a vital role in virions through interactions with the positive-strand RNA, the M protein, and other nonstructural proteins. Previous studies report that the N-terminal end of Nsp3 (amino acids 1 to 233) may be central for interactions with the N protein (Hurst et al. 2010, 2013), which induce RTC-facilitated CoV RNA replication. However, the specific residues that interact with the RTCs for viral RNA synthesis have not been identified.

Residues S188, S190, R191, N192, R195, S197, T198, P199, G200, S201, K237, G238, Q239, Q241, G243, Q244, T245, V246, T247, K248, F314, P309, S310, A311, S312, and A313 of Nsp3 were found to interact with the N protein (S183, S184, R185, S186, S187, S188, R189, S190, R191, S193, S194, R195, and N196). Previous studies also confirmed that the N-terminal part of Nsp3, from residues 1 to 233, is essential for interaction with the N protein (Hurst et al. 2010, 2013). A more recent study (Cong et al. 2020) showed that interactions with Nsp3 are mediated by N1a–N1b (amino acids 1 to 194) and, to a minor extent, N2a (amino acids 195 to 257) (Fig. 2). Although there are other important targets (Khan et al. 2020b), designing potential inhibitors against these interacting sites of both N and Nsp3 might be useful for blocking SARS-CoV-2 replication in host cells.
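Interface residues such as those listed above are typically identified by a distance criterion between atoms of the two docked partners. The following is a minimal sketch of that idea, not the procedure implemented in PyMOL, MOE, or Chimera. The 4.0 Å cutoff, the function name `interface_pairs`, and the residue labels and coordinates in the example are all assumptions made for illustration and do not come from the docked model.

```python
from math import dist


def interface_pairs(atoms_a, atoms_b, cutoff=4.0):
    """Return sorted residue pairs with any atom-atom distance <= cutoff.

    atoms_a, atoms_b: lists of (residue_label, (x, y, z)) tuples,
    one entry per atom, with coordinates in angstroms.
    """
    pairs = set()
    for res_a, xyz_a in atoms_a:
        for res_b, xyz_b in atoms_b:
            if dist(xyz_a, xyz_b) <= cutoff:
                pairs.add((res_a, res_b))
    return sorted(pairs)


# Made-up atoms for two partners; labels are placeholders, not real residues.
PARTNER_1 = [
    ("P1:ARG10", (0.0, 0.0, 0.0)),
    ("P1:SER11", (3.0, 0.0, 0.0)),
    ("P1:ASN12", (20.0, 0.0, 0.0)),
]
PARTNER_2 = [
    ("P2:SER50", (0.0, 3.0, 0.0)),
    ("P2:LYS51", (30.0, 0.0, 0.0)),
]

if __name__ == "__main__":
    for pair in interface_pairs(PARTNER_1, PARTNER_2):
        print(pair)
```

The all-against-all loop is quadratic in the number of atoms, which is acceptable for a single docked complex; a spatial grid or k-d tree would be the usual choice for larger sets of models.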

Fig. 2

Schematic structural organization of the SARS-CoV-2 genome and N protein. The region between the NTD and CTD (light blue) is the SR-rich region (amino acids 194–225), located inside N2a. The N1b and N2a domains mediate binding to Nsp3. N1b is required for RNA binding during replication, whereas N2b interacts with the M protein

The N protein plays a key role in linking the viral +RNA to the membrane. It has two domains: the N-terminal RNA-binding domain (N-NTD), which binds the RNA, and the C-terminal domain (CTD), which anchors the ribonucleoprotein to the viral membrane through its interaction with the M protein. A more recent study also reported that amino acid residues A50, T57, H59, R92, I94, S105, R107, R149, and Y172 are important for establishing interactions with SARS-CoV-2 RNA (Dinesh et al. 2020) (Fig. 3). To obtain more effective inhibitors, conserved residues may be identified for better management of CoV infections. Residues S105 and R107 are conserved among the N-NTDs of the CoVs compared (SARS-CoV-2, SARS-CoV, MERS-CoV, and HCoV-OC43) (Kang et al. 2020). Inhibitors may be designed to block the ssRNA-binding site of the N-NTD. Binding of drugs at protein interfaces is mostly controlled by a few specific residues that contribute disproportionately to the Gibbs free energy of binding and to protein dynamics (Massova and Kollman 1999; Weiss et al. 2000; Arkin and Wells 2004; Zhao and Chmielewski 2005; Moreira et al. 2007; Wells and McClendon 2007; Boukharta et al. 2014; Ibarra et al. 2019; Khan et al. 2020a).

Fig. 3

Crystal structure of the nucleocapsid (PDB ID: 6YI3) and suggested residues of the RNA interaction site. The interaction site was predicted in a more recent study. (a) Residues predicted to interact with RNA. A positively charged cleft between the basic finger and the palm creates a putative RNA-binding site in the hinge/junction region between the palm and basic finger. (b) RNA-binding site

A previous study reports the importance of the interaction between Nsp3 and the N protein for SARS-CoV-2 replication. A wide range of supporting evidence confirmed the interaction between the N protein and Nsp3 of the replicase–transcriptase complex (Hurst et al. 2013). These findings suggest that peptide inhibitors may be designed to prevent the interaction between Nsp3 and the N protein, which is required for viral replication and propagation.

In conclusion, the N protein appears to be the most promising drug target in SARS-CoV-2. Its N-NTD interacts with the viral RNA, while the N-CTD interacts with Nsp3 of the RTCs during RNA synthesis. The N1b, N2a, and N2b domains are all potential targets. Residues involved in these interactions may be potential targets for drugs and peptide inhibitors. Drug development and screening against the interacting residues may be useful for better management of SARS-CoV-2 infections.