Revelation of RdRp mutation in SARS-CoV-2
A total of 488 sequences submitted from India between July 2020 and May 2021 were retrieved from the NCBI Virus database. The Wuhan SARS-CoV-2 sequence and the first Indian SARS-CoV-2 isolate sequence were also retrieved for use as references in this study; the two sequences were found to be similar. The sequences were aligned pairwise using Clustal Omega to detect mutations, and the alignment was visualized in Jalview to examine similarities and differences in the protein sequences. Only mutations occurring in the RdRp (RNA-dependent RNA polymerase, nsp12) region were identified and carried forward, because this protein plays a vital role in viral replication. A total of 384 recurrent mutations in 318 Indian isolates were detected in the RdRp (nsp12) region by comparing the sequences from India with that from Wuhan (Table S1). Of these 384 mutations, only R118C, T148I, Y149C, E802A, Q822H, V880I and D893Y were used in this study (Table 1). These seven mutations were further characterized to assess their effects on overall protein structure, conformation and dynamics. Among them, four were predicted to be neutral (T148I, V880I, Q822H and D893Y) and the remaining three (R118C, Y149C and E802A) deleterious for the RdRp protein at a PROVEAN score cut-off of −2.5 (Table 2).
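The comparison described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study, which relied on Clustal Omega, Jalview and PROVEAN. The function names `call_substitutions` and `classify_provean` are hypothetical, the toy sequences are placeholders rather than real nsp12 data, and the rule that a score at or below the cut-off counts as deleterious follows the usual PROVEAN convention.

```python
PROVEAN_CUTOFF = -2.5  # cut-off value stated in the text


def call_substitutions(ref_aln: str, query_aln: str) -> list[str]:
    """List substitutions such as 'R118C' from two aligned sequences.

    Positions are 1-based and counted on the reference, so alignment
    columns where the reference has a gap do not advance the counter.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    substitutions = []
    ref_pos = 0
    for ref_res, query_res in zip(ref_aln, query_aln):
        if ref_res == "-":
            continue  # insertion relative to the reference
        ref_pos += 1
        if query_res in ("-", "X"):
            continue  # deletion or ambiguous residue, not a substitution
        if ref_res != query_res:
            substitutions.append(f"{ref_res}{ref_pos}{query_res}")
    return substitutions


def classify_provean(score: float, cutoff: float = PROVEAN_CUTOFF) -> str:
    """Label a PROVEAN score as 'deleterious' (score <= cutoff) or 'neutral'."""
    return "deleterious" if score <= cutoff else "neutral"


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, for illustration only).
    reference = "MKRT-AY"
    isolate = "MKCTGAC"
    print(call_substitutions(reference, isolate))
```

Counting positions on the reference keeps the mutation labels comparable across isolates even when an isolate carries insertions or deletions.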
Table 1 Details of the mutated RdRp sequences used for further characterization in this study
Table 2 List of nonsynonymous as well as frequent amino acid substitutions in RdRp protein (cutoff = −2.5)
Phylogenetic analysis
To obtain information on the phylogenetic relationships among SARS-CoV-2 isolates from India and Wuhan, we constructed a phylogenetic tree by the maximum-likelihood (ML) method in MEGA X from an alignment of the full-length ORF1ab polyprotein (7096 amino acids). This analysis (Fig. 1) showed that the ORF1ab polyprotein variants from India and Wuhan formed different clusters, revealing the multi-spiked nature of the SARS-CoV-2 virus.
Mutation-induced alterations in secondary structure of RdRp protein
Secondary structure analysis was carried out to detect the gain or loss of alpha helices, beta sheets and turns. Two mutations, Y149C and V880I, did not change the secondary structure, whereas the remaining five showed significant changes (Fig. 2a). At position 118, substitution of arginine by cysteine resulted in loss of a turn at position 117. Arginine is a polar, positively charged, hydrophilic amino acid with a guanidino group in its side chain and therefore prefers to form turns. Cysteine, an uncharged polar amino acid, tends to form disulfide bonds and hence favors a compact protein structure, resulting in loss of turns. At position 148, threonine is substituted by isoleucine, a change that favors helix formation. Isoleucine is a non-polar amino acid and therefore a stabilizing residue for helix formation, whereas threonine, a polar uncharged amino acid, prefers to lie within beta sheets. At position 802, glutamic acid is substituted by alanine, and this mutation was accompanied by loss of a turn at position 802 (Fig. 2a). Alanine is a non-polar amino acid with a greater propensity for alpha helices than for turns and therefore causes loss of turns, whereas glutamic acid, with its negatively charged side chain, favors turn formation and thus proper folding of the protein.
A further point mutation was observed at position 822, where glutamine is replaced by histidine. Histidine is a positively charged amino acid with an imidazole ring that favors neither helices nor sheets, and its introduction into the protein therefore causes loss of these secondary structures. Glutamine, an uncharged amino acid, has a higher propensity to form beta sheets and turns. The replacement of glutamine by histidine therefore resulted in loss of both sheets and turns in the RdRp protein. Similarly, the D893Y mutant showed loss of a turn at position 893. Aspartic acid, a negatively charged residue, favors turns over helices, whereas tyrosine, a polar aromatic amino acid, does not form sheets or turns; substitution at this site therefore results in loss of turns. Overall, our secondary structure predictions showed considerable changes in helices, sheets and turns that could have a large impact on the function of the RdRp protein of SARS-CoV-2, leading to its multiplicity.
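A comparison of this kind can be expressed as a position-by-position difference between the predicted secondary structure of the wild type and that of a mutant. The sketch below assumes that each prediction is available as a string with one state letter per residue (for example H for helix, E for sheet, T for turn and C for coil). The prediction tool, the letter code and the function name `secondary_structure_changes` are assumptions for illustration, and the example strings are hypothetical rather than taken from Fig. 2a.

```python
def secondary_structure_changes(wt_ss: str, mut_ss: str) -> list[tuple[int, str, str]]:
    """Return (position, wild_type_state, mutant_state) for each residue
    whose predicted secondary structure state differs.

    Positions are 1-based. Both strings must describe the same residues,
    which holds for single amino acid substitutions.
    """
    if len(wt_ss) != len(mut_ss):
        raise ValueError("state strings must have equal length")
    return [
        (pos, wt_state, mut_state)
        for pos, (wt_state, mut_state) in enumerate(zip(wt_ss, mut_ss), start=1)
        if wt_state != mut_state
    ]


if __name__ == "__main__":
    # Hypothetical per-residue states: a turn at position 4 becomes coil.
    wild_type = "HHHTTEEC"
    mutant = "HHHCTEEC"
    print(secondary_structure_changes(wild_type, mutant))
```

Because the changed positions need not coincide with the mutated residue (as with R118C and the turn at position 117), the sketch reports every differing position rather than only the substitution site.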
Alteration in protein dynamics upon mutation in RdRp
To further characterize the impact of the mutations on RdRp protein dynamics, tertiary structures were modeled and then analyzed with the Dynamut software (Rodrigues et al. 2018). First, protein modeling was performed with SWISS-MODEL, which predicts a structure from the sequence and a template protein (Fig. 2b). The templates used in modeling were 6XEZ and 6M71. Ramachandran plots of the template and the mutated proteins were prepared to identify the residues in the favored region (Fig. 2c). On average, more than 90% of the amino acid residues of both the wild type and the mutated proteins lay in the favored region of the Ramachandran plot (Table 3).
Table 3 Ramachandran plot analysis of wild type and mutant RdRp proteins showing favored and outlier amino acid residues
Dynamut predicts the change in protein stability and dynamics upon mutation of the native structure, as determined by ENCoM, mCSM, DUET and other methods. Our analysis showed that the free energy difference between the wild type and mutated proteins, ΔΔG, was stabilizing in all the RdRp mutants. The free energy change was highest in E802A (1.725 kcal/mol), followed by T148I, Y149C and V880I (Table 4). The predicted free energy change is related to the accessible surface area of the protein, cavity volume and packing density, and hence it estimates the stability of the mutated protein relative to the wild type protein. In this study, all the mutations showed positive ΔΔG values, predicting a stabilized mutant protein structure.
Furthermore, the vibrational entropy energy (ΔΔSVib ENCoM) was computed, which gives the configurational entropy of the protein according to the energy landscape. These values provide insight into protein movements and hence conformational changes (Rodrigues et al. 2018). The ΔΔSVib ENCoM calculated for all the mutations was negative, representing rigidification of the protein structure upon mutation. The most rigid structure was that of the Q822H mutant (−5.021 kcal/mol/K), followed by D893Y, R118C and V880I; the Y149C mutant (−3.621 kcal/mol/K), however, showed less rigidification and had a nearly flexible structure, as indicated in Fig. 2d. The visual representation of the flexibility analysis by Dynamut showed a rigid structure for all RdRp mutants except Y149C, which gained flexibility upon mutation (shown as the red region in Fig. 2d).
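The sign conventions used in this interpretation can be summarized in a short Python sketch. The function names `interpret_ddg` and `interpret_dds_vib` are hypothetical, and the treatment of a value of exactly zero is an assumption. The example values are the ones quoted in the text for E802A, Q822H and Y149C.

```python
def interpret_ddg(ddg: float) -> str:
    """Interpret a predicted ddG (kcal/mol): non-negative values are
    read as stabilizing, negative values as destabilizing."""
    return "stabilizing" if ddg >= 0 else "destabilizing"


def interpret_dds_vib(dds_vib: float) -> str:
    """Interpret a vibrational entropy difference (ddSVib ENCoM):
    negative values are read as rigidification, positive values as a
    gain in flexibility. Zero is treated as no change (an assumption)."""
    if dds_vib < 0:
        return "rigidification"
    if dds_vib > 0:
        return "gain in flexibility"
    return "no change"


if __name__ == "__main__":
    # Values quoted in the text.
    print("E802A ddG 1.725:", interpret_ddg(1.725))
    print("Q822H ddSVib -5.021:", interpret_dds_vib(-5.021))
    print("Y149C ddSVib -3.621:", interpret_dds_vib(-3.621))
```

Under these conventions a more negative ΔΔSVib ENCoM value indicates stronger rigidification, which is why Q822H is ranked as the most rigid mutant.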
Table 4 Effect of mutation variability on the structural dynamics of RdRp protein as shown by ΔΔSVib ENCoM and ΔΔG values
Our analysis was extended with the calculation of atomic fluctuations and deformation energies. Atomic fluctuation gives the amplitude of atomic motion in the protein, whereas deformation energy measures local flexibility in the protein molecule. Atomic fluctuation was calculated over the first 10 non-trivial modes of the molecule. Its magnitude is shown by thin to thick tubes, in which blue indicates low, white moderate and red high fluctuation. Similarly, deformation energy was calculated over the first 10 non-trivial modes of the molecule, and its magnitude is represented by thin to thick tubes, in which blue indicates low, white moderate and red high deformation. In this study, visible changes were observed in the atomic fluctuations and deformation energies of the mutant RdRp proteins compared with the wild type protein; these are marked with arrows of different colors (Fig. 2e, f). Upon mutation, blue tubes appeared in place of white tubes, indicating lower atomic fluctuation and deformation in the mutant RdRp proteins than in the wild type.
The protein dynamics study was extended further to the shifts in intramolecular interactions caused by mutation in the RdRp protein. The Dynamut server detects covalent and non-covalent interactions and hence predicts the intramolecular interactions. Mutation in the RdRp protein resulted in disruption of hydrophobic interactions, aromatic contacts, ionic interactions, hydrogen bonds and metal complex interactions. The results of the present study revealed that mutation of the arginine, threonine, tyrosine, glutamic acid, glutamine, valine and aspartic acid residues affected the interactions of amino acid residues in close proximity (Fig. 3). The side chain contacts of the wild type protein are changed by incorporation of the mutant residues. From the above analysis it can be concluded that mutation in the RdRp protein changes not only the stability and flexibility of the protein but also its intramolecular interactions with neighboring residues.