Abstract
SARS-CoV-2 is evolving rapidly, and new mutations are being reported from different parts of the world. In this study, we investigated the variations occurring in the nucleocapsid phosphoprotein (N-protein) of SARS-CoV-2 from India. We used several in silico prediction tools to characterise the N-protein: the IEDB webserver for B cell epitope prediction, Vaxijen 2.0 and AllergenFP v.1.0 for antigenicity and allergenicity prediction of epitopes, Clustal Omega for mutation identification, the PONDR webserver for disorder prediction, PROVEAN for predicting the effect of mutations on protein function and I-Mutant Suite for protein stability prediction. Our results show that 81 mutations have occurred in this protein among Indian SARS-CoV-2 isolates. We then characterized the N-protein epitopes and identified the seven most promising peptides. Mapping the mutations onto these seven epitopes revealed a loss of antigenicity in two of them, suggesting that the mutations occurring in the SARS-CoV-2 genome contribute to the alteration in the properties of epitopes. Altogether, our data strongly indicates that the N-protein is gaining several mutations in its B cell epitope regions that might alter protein function.
INTRODUCTION
In Wuhan, China, several cases of pneumonia were reported in late 2019, and the causative agent was identified as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1–3]. Studies have revealed that the SARS-CoV-2 genome shows approximately 96% similarity to the bat SARS-like coronavirus strain Bat-CoV RaTG13 [4]. SARS-CoV-2 infection leads to coronavirus disease 2019 (COVID-19), whose symptoms range from mild illness to severe respiratory distress. SARS-CoV-2 spreads rapidly from infected to healthy individuals by direct contact or via respiratory droplets. The virus has infected the human population worldwide and caused tremendous loss of life and economic damage. As of 11 November 2021, approximately 252 million cases of COVID-19 had been reported worldwide and the death toll had reached 5.08 million. To overcome the COVID-19 pandemic, treatment strategies, better diagnostic methods and vaccines are being developed through the collaborative efforts of industry and academia worldwide.
The single-stranded RNA genome of SARS-CoV-2 is approximately 29 kb in length [5]. Its genome encodes 29 proteins categorized into structural (S, M, E and N), non-structural (Nsp1-16) and accessory proteins (Orf3a, 3b, 6, 7a, 7b, 8, 9b, 9c and 10) [6]. SARS-CoV-2 enters host cells through the interaction of the receptor binding domain (RBD) of the Spike protein with the angiotensin-converting enzyme 2 (ACE2) receptor present on host cells [7]. After entry into the host cell, the SARS-CoV-2 genomic RNA is translated by the host translational machinery to produce viral proteins. As a result, the viral proteins are exposed to the host immune system, which triggers an immune response. The B cell epitopes generated from viral proteins bind to the host B cell antigen receptors and induce a cascade of reactions that generates protective antibodies that neutralize the virus [8]. Therefore, identifying the epitopes of SARS-CoV-2 plays a central role in vaccine development and in understanding SARS-CoV-2 pathogenesis. More importantly, SARS-CoV-2 is mutating rapidly and several new variants are emerging; continued analysis of variants is therefore warranted to understand viral evolution and its implications for the host immune response.
In this study, we used a bioinformatic approach to characterize the N-protein epitopes of SARS-CoV-2. Our data revealed that a few epitopes of the N-protein might lose their antigenic property as a result of mutations observed among Indian SARS-CoV-2 isolates.
METHODS
Sequence Retrieval for Analysis
We retrieved the SARS-CoV-2 protein sequences used in this study from the publicly accessible NCBI Virus database. The NCBI Virus database has annotated 26 proteins of SARS-CoV-2 from the reference genome (NC_045512.2). We performed a detailed analysis of the variations present in the N-protein among Indian isolates of SARS-CoV-2. For this analysis, 831 N-protein sequences reported from India (till July 2021) were also downloaded from the NCBI Virus database. The accession ID of the N-protein reference sequence used in this study was YP_009724397. The accession IDs of all N-protein sequences used in this study are listed in Supplementary Table S1.
The Prediction of B Cell Epitopes
Linear B cell epitopes are small continuous peptides. To predict them, we used the IEDB (Immune Epitope Database) webserver, which applies an algorithm based on the “Bepipred linear prediction method” [9]. The analysis was run with Bepipred 2.0 at the default threshold value of 0.350. The antigenicity and allergenicity of the epitopes were predicted with the Vaxijen 2.0 [10] and AllergenFP v.1.0 [11] webservers, respectively. We used the reference sequence of the N-protein (accession ID: YP_009724397) for this prediction.
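The thresholding step can be illustrated with a minimal sketch. It assumes per-residue scores have already been exported from the prediction server as a plain list; the function name, the strict "greater than" comparison and the optional minimum-length filter are illustrative assumptions, not the IEDB implementation.

```python
from typing import List, Tuple


def epitope_segments(
    sequence: str,
    scores: List[float],
    threshold: float = 0.350,
    min_len: int = 1,
) -> List[Tuple[int, int, str]]:
    """Group consecutive residues scoring above `threshold` into peptides.

    Returns (start, end, peptide) tuples with 1-based inclusive positions.
    Assumption: a residue counts as epitope when score > threshold.
    """
    if len(sequence) != len(scores):
        raise ValueError("sequence and scores must have the same length")

    segments = []
    start = None  # 0-based index where the current run began
    for i, score in enumerate(scores):
        if score > threshold:
            if start is None:
                start = i
        elif start is not None:
            # The run ended at residue i - 1; keep it if long enough.
            if i - start >= min_len:
                segments.append((start + 1, i, sequence[start:i]))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(sequence) - start >= min_len:
        segments.append((start + 1, len(sequence), sequence[start:]))
    return segments


# Toy input (hypothetical scores, not values from this study).
toy_sequence = "ABCDEFGH"
toy_scores = [0.1, 0.4, 0.5, 0.2, 0.36, 0.35, 0.9, 0.9]
toy_segments = epitope_segments(toy_sequence, toy_scores)
```

Raising `min_len` discards very short runs, which is one simple way to restrict the output to peptides long enough to be useful candidates.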
Multiple Sequence Alignments (MSAs)
MSAs were performed to identify variations present among N-proteins. The Clustal Omega tool was used to conduct MSAs [12] as described earlier [13]. In this analysis, the first reported sequence of N-protein from Wuhan, China was used (protein Accession Number: YP_009724397) as the reference sequence. The 831 N-protein sequences reported from India until 1st June 2021 were compared with the reference sequence to identify variations present among Indian isolates.
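As an illustration of how substitutions can be read off an alignment, the sketch below compares one aligned query row against the aligned reference row and reports changes in reference coordinates. It assumes the alignment rows are already available as equal-length strings with "-" for gaps; the function name and the toy sequences are hypothetical, and insertions, deletions and ambiguous residues ("X") are simply skipped.

```python
from typing import List


def call_substitutions(ref_aligned: str, query_aligned: str) -> List[str]:
    """List amino acid substitutions (e.g. 'G5A') of a query versus the reference.

    Positions are numbered along the ungapped reference sequence.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned rows must have the same length")

    substitutions = []
    ref_pos = 0  # 1-based position in the ungapped reference
    for ref_res, query_res in zip(ref_aligned, query_aligned):
        if ref_res != "-":
            ref_pos += 1
        # Skip insertions, deletions and ambiguous residues in the query.
        if ref_res == "-" or query_res in ("-", "X"):
            continue
        if ref_res != query_res:
            substitutions.append(f"{ref_res}{ref_pos}{query_res}")
    return substitutions


# Toy alignment rows (hypothetical, for illustration only).
subs_with_insertion = call_substitutions("MSD-NGP", "MSDANAP")
subs_with_deletion = call_substitutions("MSDNGP", "M-DKGP")
```

Pooling the per-sequence results across all isolates in a set would then give the list of distinct mutations relative to the reference.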
Secondary Structure and Protein Disorder Prediction
The secondary structure of the polypeptide sequence was predicted using the CFSSP webserver [14] as described earlier [15]. The per-residue disorder was predicted using the PONDR webserver [16]. A PONDR-VSL2 value above 0.5 represents disorder, while a value below 0.5 indicates order in the polypeptide structure.
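The 0.5 cutoff can be applied to exported per-residue scores as in the following sketch. The function names are hypothetical, the scores are invented for illustration, and comparing mean scores is only one possible way to say that one peptide is "more ordered" than another.

```python
from typing import Dict, List


def disorder_summary(scores: List[float], cutoff: float = 0.5) -> Dict[str, float]:
    """Summarise per-residue disorder scores against a cutoff."""
    if not scores:
        raise ValueError("scores must not be empty")
    disordered = sum(1 for s in scores if s > cutoff)
    return {
        "fraction_disordered": disordered / len(scores),
        "mean_score": sum(scores) / len(scores),
    }


def is_more_ordered(wild_type: List[float], mutant: List[float]) -> bool:
    """Assumption: a lower mean disorder score means a more ordered peptide."""
    return (
        disorder_summary(mutant)["mean_score"]
        < disorder_summary(wild_type)["mean_score"]
    )


# Hypothetical scores for a wild-type peptide and one mutant.
wild_type_scores = [0.7, 0.8, 0.6, 0.9]
mutant_scores = [0.4, 0.6, 0.3, 0.45]
wild_type_summary = disorder_summary(wild_type_scores)
mutant_summary = disorder_summary(mutant_scores)
```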
Analysis of Effect of Mutation on Protein Function and Stability
PROVEAN (Protein Variation Effect Analyzer) predicts the probable effect of a mutation on protein function [17]. We used the default threshold score of –2.5: a PROVEAN score of ≤–2.5 represents a “deleterious” mutation, while a score above –2.5 indicates a “neutral” mutation. We predicted protein stability with the I-Mutant Suite webserver [18] based on the difference in free energy (ΔΔG), as described earlier [19]. A positive ΔΔG indicates that the mutation increases protein stability, while a negative ΔΔG indicates decreased stability.
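The two decision rules above can be written down as a short sketch. The class and function names and the example values are hypothetical, and the label for a ΔΔG of exactly zero is an assumption, since the text only defines the positive and negative cases.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MutationPrediction:
    name: str       # substitution label, e.g. "A1B" (hypothetical)
    provean: float  # PROVEAN score
    ddg: float      # predicted free energy change (ddG)


def functional_effect(provean: float, cutoff: float = -2.5) -> str:
    """Scores at or below the cutoff are 'deleterious', otherwise 'neutral'."""
    return "deleterious" if provean <= cutoff else "neutral"


def stability_effect(ddg: float) -> str:
    """Positive ddG: increased stability; negative ddG: decreased stability."""
    if ddg > 0:
        return "increased"
    if ddg < 0:
        return "decreased"
    return "unchanged"  # assumption: not defined in the text


def count_by_effect(predictions: List[MutationPrediction]) -> dict:
    """Tally mutations by predicted stability and functional effect."""
    counts = {"increased": 0, "decreased": 0, "unchanged": 0,
              "deleterious": 0, "neutral": 0}
    for p in predictions:
        counts[stability_effect(p.ddg)] += 1
        counts[functional_effect(p.provean)] += 1
    return counts


# Invented values for illustration only, not results from this study.
example_predictions = [
    MutationPrediction("A1B", provean=-3.1, ddg=-0.8),
    MutationPrediction("C2D", provean=-2.5, ddg=0.4),
    MutationPrediction("E3F", provean=0.2, ddg=-1.2),
]
example_counts = count_by_effect(example_predictions)
```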
RESULTS
Identification and Characterization of Linear B Cell Epitopes of N-Protein
The linear B cell epitopes were predicted with the IEDB webserver, which is based on the “Bepipred linear prediction method”, using 0.350 as the threshold value. The analysis of linear B cell epitopes of the N-protein identified 13 potential peptides (Fig. 1, yellow shaded area). Among those 13 peptides, seven fulfilled the criteria of being antigenic, non-allergenic and non-toxic (Table 1). The complete list of 13 peptides is shown in Supplementary Table S2. Peptide 6 (KLDDKDPNFK) shows the highest antigenicity value (Vaxijen score) of 2.129, followed by peptide 4 (AFGRRGPEQTQGNFG) with a score of 1.172.
Analysis of Variations in N-Protein of SARS-CoV-2 among Indian Isolates
To understand the N-protein variations in India, we compared the SARS-CoV-2 N-protein sequences reported from India with the first sequence reported from Wuhan, China (YP_009724397). The MSA analysis revealed that the N-protein has gained 81 mutations in India (Table 2). We characterized those mutants by analyzing the change in polarity and charge of the N-protein (Table 2). Subsequently, we performed a stability prediction using I-Mutant Suite to understand the effect of these mutations on the N-protein. For stability prediction, we used the change in free energy (ΔΔG) as the measure of protein stability. Our data revealed that 17 mutations increased stability, since their ΔΔG was positive, while 64 mutations decreased the stability of the N-protein (Table 2). Further, we calculated the PROVEAN score of each mutant, which predicts the impact of the mutation on protein function. Our data shows that most mutations have no effect on protein function (neutral), while 12 mutations are deleterious (Table 2). Altogether, our data suggests that the variations in the N-protein contribute to the alteration of its properties and function.
Mutations in B Cell Epitope Cause Alteration in Their Antigenicity
Subsequently, we mapped the identified N-protein mutations onto the seven B cell epitopes described above. Our data revealed that 21 mutations reside within the B cell epitopes (Fig. 2a). The peptides 2, 5, and 6 possess mutations at one site (Fig. 2a). Similarly, peptide 1 has two mutations while peptide 3 and 5 possess seven and nine mutations, respectively (Fig. 2a). We did not observe any mutation in peptide 4, suggesting that this epitope is the most conserved of the seven (Fig. 2a). The details of the wild-type and mutant epitopes are given in Supplementary Table S3. We then analyzed the antigenicity, allergenicity and toxicity of the mutant epitopes. Interestingly, our data revealed that all mutant epitopes are non-allergenic and non-toxic (Supplementary Table S3); however, peptide 3 mutant 2 (P3M2) and peptide 5 mutant 1 (P5M1) lost their antigenic property and became non-antigenic (Figs. 2d and 2e). The mutant peptides of epitopes 1, 2, 6 and 7 do not show any significant alteration in antigenicity (Figs. 2b, 2c, 2f, and 2g). The loss of the epitope in P3M2 and P5M1 can also be visualized in the IEDB epitope predictions. The “bepipred score graph” revealed that, compared to the wild type, the mutants P3M2 (compare Figs. 2h and 2i) and P5M1 (compare Figs. 2h and 2j) have lost epitopes. Altogether, our data suggests that the emerging mutations in the N-protein can contribute to alterations in the properties of its epitopes.
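The mapping step can be sketched as follows: locate an epitope within the reference sequence, keep the substitutions that fall inside its coordinates, and build the corresponding mutant peptides for re-scoring. The function names, the short toy reference and the substitution labels are hypothetical; only the peptide KLDDKDPNFK is taken from the text.

```python
import re
from typing import List, Tuple


def parse_substitution(label: str) -> Tuple[str, int, str]:
    """Split a label such as 'D6N' into (wild type, 1-based position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a simple substitution: {label}")
    return match.group(1), int(match.group(2)), match.group(3)


def mutant_peptides(
    reference: str, epitope: str, substitutions: List[str]
) -> List[Tuple[str, str]]:
    """Return (label, mutant peptide) for substitutions inside the epitope."""
    start0 = reference.find(epitope)
    if start0 < 0:
        raise ValueError("epitope not found in reference")
    start, end = start0 + 1, start0 + len(epitope)  # 1-based, inclusive

    results = []
    for label in substitutions:
        wild_type, pos, mutant = parse_substitution(label)
        if not 1 <= pos <= len(reference) or reference[pos - 1] != wild_type:
            raise ValueError(f"{label} does not match the reference")
        if start <= pos <= end:
            offset = pos - start
            results.append(
                (label, epitope[:offset] + mutant + epitope[offset + 1:])
            )
    return results


# Toy reference: the peptide 6 sequence with invented flanking residues.
toy_reference = "MAAKLDDKDPNFKGG"
toy_mutants = mutant_peptides(
    toy_reference, "KLDDKDPNFK", ["D6N", "A2T", "K13R"]
)
```

Each mutant peptide produced this way could then be submitted to the same antigenicity, allergenicity and toxicity predictors as its wild-type counterpart.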
Mutant Epitopes Alter Protein Disorder Parameters
We further characterized the effect of mutations on the two epitopes that lost their antigenicity by analyzing the per-residue disorder of the mutant peptides. Our data shows that both mutant peptides are more ordered than their wild-type counterparts (Supplementary Figs. S1A and S1B). Altogether, our data suggests that the mutations alter the protein disorder score of these epitopes.
DISCUSSION
SARS-CoV-2 has undergone rapid evolution since its emergence in Wuhan, China, which is adversely impacting the development of vaccines and treatment strategies. Therefore, there is an urgent need to understand the biology of SARS-CoV-2 in order to design effective therapies. Although several in silico predictions have been conducted on SARS-CoV-2 to identify putative epitopes, those studies were aimed at vaccine development [20–23]. In this study, we focused on the mutations that have occurred in the epitopes and predicted their effects. Our data revealed that N-protein sequences reported from India have 81 mutations. To assess how these mutations might affect N-protein epitopes, we mapped the 81 mutations onto the putative epitopes and found that 21 of them reside within those N-protein epitopes. Furthermore, we observed that two epitopes lost their antigenicity (Fig. 2) due to mutation, suggesting that the epitopes are changing as SARS-CoV-2 evolves. Our data strongly suggests that, as a consequence of the mutations occurring in epitopes, the specificity and/or sensitivity of immune-based assays that use the N-protein might change, which could adversely affect their results. Several studies have established that new SARS-CoV-2 variants have altered interactions with host antibodies, and in some rare cases they might not be recognized by the host antibody [24, 25]. A recent study revealed that several SARS-CoV-2 variants have decreased sensitivity to neutralizing monoclonal antibodies [26]. Such variants are likely to become resistant to the host immune system. This study gives a comprehensive view of the B cell epitopes of the SARS-CoV-2 N-protein and discusses their evolution. Our study indicates that future work should link newly emerging SARS-CoV-2 mutations to their impact on protein structure, antigenicity and antibody interactions, and to the consequences of these changes.
REFERENCES
Lai, C.C., Shih, T.P., Ko, W.C., Tang, H.J., Hsueh, P.R., Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and coronavirus disease-2019 (COVID-19): The epidemic and the challenges, Int. J. Antimicrob. Agents, 2020, vol. 55, no. 3, p. 105924.
Rothan, H.A. and Byrareddy, S.N., The epidemiology and pathogenesis of coronavirus disease (COVID-19) outbreak, J. Autoimmun., 2020, vol. 109, p. 102433.
Rabi, F.A., Al Zoubi, M.S., Al-Nasser, A.D., Kasasbeh, G.A., and Salameh, D.M., SARS-CoV-2 and coronavirus disease 2019: What we know so far, Pathogens, 2020, vol. 9, no. 3, p. 231.
Zhou, P., Yang, X.L., Wang, X.G., et al., A pneumonia outbreak associated with a new coronavirus of probable bat origin, Nature, 2020, vol. 579, no. 7798, pp. 270–273.
Wu, F., Zhao, S., Yu, B., et al., A new coronavirus associated with human respiratory disease in China, Nature, 2020, vol. 579, no. 7798, pp. 265–269.
Khailany, R.A., Safdar, M., and Ozaslan, M., Genomic characterization of a novel SARS-CoV-2, Gene Rep., 2020, vol. 19, p. 100682.
Yang, J., Petitjean, S.J.L., Koehler, M., Zhang, Q., Dumitru, A.C., Chen, W., Derclaye, S., Vincent, S.P., Soumillion, P., and Alsteens, D., Molecular interaction and inhibition of SARS-CoV-2 binding to the ACE2 receptor, Nat. Commun., 2020, vol. 11, no. 1, p. 4541.
Chia, W.N., Zhu, F., Ong, S.W.X., et al., Dynamics of SARS-CoV-2 neutralising antibody responses and duration of immunity: a longitudinal study, Lancet Microbe, 2021, vol. 2, no. 6, pp. e240–e249.
Vita, R., Mahajan, S., Overton, J.A., Dhanda, S.K., Martini, S., Cantrell, J.R., Wheeler, D.K., Sette, A., and Peters, B., The Immune Epitope Database (IEDB): 2018 update, Nucleic Acids Res., 2019, vol. 47, no. D1, pp. D339–D343.
Doytchinova, I.A. and Flower, D.R., VaxiJen: a server for prediction of protective antigens, tumour antigens and subunit vaccines, BMC Bioinform., 2007, vol. 8, p. 4.
Dimitrov, I., Naneva, L., Doytchinova, I., Bangov, I., AllergenFP: allergenicity prediction by descriptor fingerprints, Bioinformatics, 2014, vol. 30, no. 6, pp. 846–851.
Madeira, F., Park, Y.M., Lee, J., Buso, N., Gur, T., Madhusoodanan, N., Basutkar, P., Tivey, A.R.N., Potter, S.C., Finn, R.D., and Lopez, R., The EMBL-EBI search and sequence analysis tools APIs in 2019, Nucleic Acids Res., 2019, vol. 47, no. W1, pp. W636–W641.
Azad, G.K., Identification of novel mutations in the methyltransferase complex (Nsp10-Nsp16) of SARS-CoV-2, Biochem. Biophys. Rep., 2020, vol. 24, p. 100833.
Ashok, K.T., CFSSP: Chou and Fasman Secondary Structure Prediction server, Wide Spectrum, 2013, vol. 1, no. 9, pp. 15–19.
Azad, G.K., The molecular assessment of SARS-CoV-2 nucleocapsid phosphoprotein variants among Indian isolates, Heliyon, 2021, vol. 7, no. 2, p. e06167.
Obradovic, Z., Peng, K., Vucetic, S., Radivojac, P., and Dunker, A.K., Exploiting heterogeneous sequence properties improves prediction of protein disorder, Proteins, 2005, vol. 61, pp. 176–182.
Choi, Y., Sims, G.E., Murphy, S., Miller, J.R., and Chan, A.P., Predicting the functional effect of amino acid substitutions and indels, PLoS One, 2012, vol. 7, no. 10, p. e46688.
Capriotti, E., Fariselli, P., and Casadio, R., I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure, Nucleic Acids Res., 2005, vol. 33, pp. W306–W310.
Azad, G.K., Identification and molecular characterization of mutations in nucleocapsid phosphoprotein of SARS-CoV-2, PeerJ, 2021, vol. 9, p. e10666.
Ahmed, S.F., Quadeer, A.A., and McKay, M.R., Preliminary identification of potential vaccine targets for the COVID-19 Coronavirus (SARS-CoV-2) based on SARS-CoV immunological studies, Viruses, 2020, vol. 12, no. 3, p. 254.
Grifoni, A., Sidney, J., Zhang, Y., Scheuermann, R.H., Peters, B., and Sette, A., A sequence homology and bioinformatic approach can predict candidate targets for immune responses to SARS-CoV-2, Cell Host Microbe, 2020, vol. 27, no. 4, pp. 671–680.
Chen, H.Z., Tang, L.L., Yu, X.L., Zhou, J., Chang, Y.F., and Wu, X., Bioinformatics analysis of epitope-based vaccine design against the novel SARS-CoV-2, Infect. Dis. Poverty, 2020, vol. 9, p. 88.
Chukwudozie, O.S., Gray, C.M., Fagbayi, T.A., Chukwuanukwu, R.C., Oyebanji, V.O., Bankole, T.T., Adewole, R.A., Daniel, E.M., Immuno-informatics design of a multimeric epitope peptide based vaccine targeting SARS-CoV-2 spike glycoprotein, PLoS One, 2021, vol. 16, no. 3, p. e0248061.
Starr, T.N., Greaney, A.J., Addetia, A., Hannon, W.W., Choudhary, M.C., Dingens, A.S., Li, J.Z., and Bloom, J.D., Prospective mapping of viral mutations that escape antibodies used to treat COVID-19, Science, 2021, vol. 371, no. 6531, pp. 850–854.
Garcia-Beltran, W.F., Lam, E.C., St Denis, K., Nitido, A.D., Garcia, Z.H., Hauser, B.M., Feldman, J., Pavlovic, M.N., Gregory, D.J., Poznansky, M.C., Sigal, A., Schmidt, A.G., Iafrate, A.J., Naranbhai, V., Balazs, A.B., Multiple SARS-CoV-2 variants escape neutralization by vaccine-induced humoral immunity, Cell, 2021, vol. 184, no. 9, pp. 2372–2383.e9.
Li, Q., Wu, J., Nie, J., et al., The impact of mutations in SARS-CoV-2 spike on viral infectivity and antigenicity, Cell, 2020, vol. 182, no. 5, pp. 1284–1294.
ACKNOWLEDGMENTS
We would like to acknowledge the Department of Zoology, Patna University, Patna, Bihar (India) for providing infrastructural support for this study.
Ethics declarations
CONFLICT OF INTEREST
The authors declare that they have no conflicts of interest.
STATEMENTS AND DECLARATIONS
Fund Information. No funding was used to conduct this research.
Ethical Statement. Not applicable
AUTHORSHIP CONTRIBUTION STATEMENT
Sushant Kumar: Methodology, Validation, Visualization, Writing–original draft and editing.
Khushboo Kumari: Methodology, Validation and Visualization.
Gajendra Kumar Azad: Conceptualization, Supervision, Methodology, Validation, Visualization, Writing–original draft and editing.
Cite this article
Kumar, S., Kumari, K. & Azad, G.K. Immunoinformatics Study of SARS-CoV-2 Nucleocapsid Phosphoprotein Identifies Promising Epitopes with Mutational Implications. Moscow Univ. Biol. Sci. Bull. 77, 251–257 (2022). https://doi.org/10.3103/S0096392522040125