1 Introduction

The genus Begomovirus (family Geminiviridae) is the largest and best-known genus of plant viruses, with over 445 species that severely reduce the yield of commercially important agricultural crops [1]. Begomoviruses are encapsidated in twinned particles containing circular ssDNA that is either monopartite (approximately 2.7 kb, with DNA-A alone) or bipartite (2.5–2.6 kb genomic size, with both DNA-A and DNA-B), and they are spread primarily by whiteflies (Bemisia tabaci). Monopartite begomoviruses are also associated with circular ssDNA satellites, which are classified into three groups: alphasatellites, betasatellites, and deltasatellites [2,3,4]. The typical begomovirus genome contains six open reading frames (ORFs): REn, Rep, TrAP, and C4 on the complementary-sense strand, and CP and V2 (pre-CP) on the virion-sense strand [5]. Chilli is grown widely in tropical and temperate regions as a vegetable and spice, and its cultivation worldwide is substantially affected by various infections, mainly begomoviruses [6]. According to the FAOSTAT data report for 2021 [7,8,9], global chilli production is 48.39 lakh tons (about 4.84 million tons) per year on 0.683 million hectares. India is the world’s largest producer, accounting for 43% of worldwide chilli production (about 1.98 million tons) [8]. According to reports, 166 viruses, including begomoviruses, infect chilli crops and cause a variety of diseases [9]. Reported symptoms in chilli plants include mottling, mosaic, vein yellowing of the leaves, stunting, distortion, curling, flower abortion, and unhealthy fruits [10]. Chilli leaf curl virus (ChiLCV) is one of the most severe begomoviruses responsible for chilli leaf curl disease (ChiLCD) in terms of frequency, diversity, and yield loss [11, 12]. In severe cases, 100% of the marketable fruit was lost [13]. Globally, the emergence of begomovirus diseases has been one of the biggest challenges to chilli production [14].

Understanding interaction networks is critical for understanding organisms’ metabolic and physiological responses to growth, cell differentiation, abiotic stress, and pathogen defence. Genome-wide studies on plant immunity and pathogen invasion strategies have revealed a comprehensive picture of pathogen-plant interactions [15]. This network of interactions demonstrates how pathogen effectors and plant defence proteins converge to generate interconnected subsets of host proteins known as immunological ganglia [15, 16]. Technologies for lead detection and optimization have advanced greatly in recent years. Computational tools now allow us to find compounds that bind to a target more effectively and to visualize the ligand-target interaction (molecular docking) [17]. Recently, a protein–protein interaction (PPI) network formed by host proteins interacting with the bipartite begomovirus transport proteins, the movement protein (MP) and the nuclear shuttle protein (NSP), was characterized [16]. The MP-NSP-host PPI network identified multiple nodes enriched in host transport functions and defence proteins, suggesting that MPs and NSPs recruit host transport functions to intracellular and intercellular targets while also having immunosuppressive properties that promote infection. Recent research has also shown that SnRK1 and calmodulin-like protein 11 (Gh-CML11) of Gossypium hirsutum play a role in the CLCuMB-betaC1 interaction [18, 19]. Mishra et al. [20] studied the begomovirus genome using bipartite sequencing and agro-inoculation. They found that when DNA-A was combined with the betasatellite, severe symptoms such as downward curling, vein yellowing, and leaf curling were observed. In contrast, the infectious clone of DNA-A alone produced only mild symptoms in chilli. There is a significant chance that the AC1 protein, alone or in conjunction with the AV2 protein, interacts with the target host protein and causes severe leaf curl disease symptoms.

To explore this idea further, the present study focuses mainly on the in-silico analysis of individual interactions between all viral proteins and four host proteins found in chillies. The six proteins of ChiLCV_GKP/IN/21 (Acc. No. MZ540908) and one from ChLCuB_GKP/IN/21 (Acc. No. MZ540909) were modelled alongside four chilli proteins. The quality of the modelled proteins was evaluated, and the physicochemical properties of each viral protein were calculated. Our findings provide fresh insights into begomovirus-chilli interactions at the molecular level and provide a basis for future research on structure–function correlations. They also suggest that substitution mutations in the host protein could be an effective strategy to create virus resistance while maintaining the protein’s structural and functional stability.

2 Materials and methods

2.1 Sequence retrieval of host and viral proteins

The amino acid sequences of seven proteins from Capsicum annuum (ADK, SGS3, GLO1, RAD54, AKIN11, CLAVATA 1/LRR-EKL-SKs, and SKP1), six proteins from Chilli leaf curl virus DNA-A (ChiLCV-A) (Accession No. MZ540908) (Pre-CP, CP, Rep, TrAP, REn, and C4), and one protein (βC1) from Chilli leaf curl betasatellite (ChLCuB) (Accession No. MZ540909) were retrieved from NCBI GenBank.

2.2 Estimation of physicochemical properties

Physicochemical characteristics were predicted for the retrieved host and virus amino acid sequences. Expasy ProtParam (https://web.expasy.org/protparam/) [21] was used to calculate the molecular weight (kDa), pI, instability index, extinction coefficient (M−1 cm−1), aliphatic index, grand average of hydropathicity (GRAVY), and the numbers of negatively and positively charged residues. We used the Deep Learning Approach for the Prediction of Thermal Protein Stability (DeepSTABp) [22] and the online application ProtPi [23] to predict the melting temperature (Tm) and the net charge at pH 7.4 (z), respectively.
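Two of the parameters listed above have simple closed forms, and a minimal sketch may help show what they measure. The code below assumes the standard Kyte-Doolittle hydropathy scale for GRAVY and the usual aliphatic index formula (mole percent of Ala + 2.9 × Val + 3.9 × (Ile + Leu)). The function names and the toy sequences are illustrative only and are not taken from this study; ProtParam remains the tool actually used.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    seq = seq.upper()

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


if __name__ == "__main__":
    toy = "MAVILKDE"  # hypothetical sequence, not one of the study proteins
    print(round(gravy(toy), 3), round(aliphatic_index(toy), 2))
```

A positive GRAVY value indicates an overall hydrophobic sequence and a negative value a hydrophilic one, which is how the values in Table 1 are read.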

2.3 Homology modelling and domain prediction

The retrieved sequences were submitted to the Phyre2 server (http://www.sbg.bio.ic.ac.uk/~phyre2/html/page.cgi?id=index) [24], SWISS-MODEL (https://swissmodel.expasy.org/) [25], and the UCSF ChimeraX program [26,27,28] for homology modelling. The modelled proteins were visualized and analyzed using SWISS-MODEL and the ChimeraX Studio Visualizer. In addition, we used NCBI’s Conserved Domain Database (CDD) and GenBank to determine the domains in each protein model and their functions.

2.4 Model quality and similarity assessment

Protein Structure Analysis (ProSA-web) (https://prosa.services.came.sbg.ac.at/prosa.php) [29], Qualitative Model Energy Analysis (QMEAN) v4.3.0 (https://swissmodel.expasy.org/qmean/) [30], and SAVES v6.0 (https://saves.mbi.ucla.edu/) [31] servers were used to evaluate the quality of each chilli and begomoviral protein model. SAVES v6.0 includes ERRAT and Ramachandran plot analyses. In addition, a full Ramachandran plot structural analysis of the modelled proteins was performed using the PDBsum online server (https://www.ebi.ac.uk/thornton-srv/databases/pdbsum/Generate.html) [32] (Fig. S1). The PDBsum web server provides interactive viewing of 3D structures through RasMol [33], PyMOL [34], and the JavaScript viewer 3Dmol.js [35].

2.5 Preparation of host protein

The four chilli protein models (ADK, SGS3, GLO1, and SKP1) with more than 90% of residues in the most favoured region of the Ramachandran plot were selected for further analysis. Host protein models were prepared with AutoDock Vina v1.5.7 [36] (Fig. S1). To optimize the docking, we prepared each selected protein by adding polar hydrogens and removing the water molecules contained in the protein structure [37].
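The water-removal step can be illustrated with a short script. This is only a sketch of the idea, assuming a standard fixed-column PDB file in which the residue name occupies columns 18–20. It is not the procedure performed inside the AutoDock software, it does not add hydrogens, and the function name and the set of water residue names are our own assumptions.

```python
# Residue names commonly used for water in PDB files (assumed list).
WATER_NAMES = {"HOH", "WAT", "H2O"}


def strip_waters(pdb_text):
    """Return the PDB text without ATOM/HETATM records that belong to water."""
    kept = []
    for line in pdb_text.splitlines():
        is_atom_record = line.startswith(("ATOM", "HETATM"))
        # PDB columns 18-20 (0-based slice 17:20) hold the residue name.
        if is_atom_record and line[17:20].strip() in WATER_NAMES:
            continue
        kept.append(line)
    return "\n".join(kept)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python strip_waters.py model.pdb > dry.pdb
    with open(sys.argv[1]) as handle:
        print(strip_waters(handle.read()))
```

All other records (headers, protein atoms, non-water heteroatoms) are passed through unchanged, so the output can still be loaded by the usual viewers.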

2.6 Protein–protein interaction and interface prediction of residues

The protein prepared for docking was used as the macromolecule, and the viral proteins acted as ligands. We predicted the protein’s active site using the DoGSiteScorer web server (https://proteins.plus/#dogsite) and, for precise docking, created a grid box around each macromolecule [38]. To evaluate protein–protein interactions, we used HADDOCK v2.4 (High Ambiguity Driven Protein–Protein DOCKing), an information-driven flexible docking approach for modelling biomolecular complexes [39, 40], and the PyRx program [41]. To evaluate the quality of each host-virus protein complex, the binding energy, van der Waals energy, HADDOCK score, electrostatic energy, desolvation energy, restraint violation energy, and Z-score were calculated. We validated the docking results by importing both proteins into the PyRx tool (which contains open-source applications such as AutoDock [42], AutoDock Vina, and Open Babel [43]). Furthermore, two structure- and sequence-based bioinformatics tools, UCSF ChimeraX and PDBsum, were applied to predict binding sites and interfaces in the host proteins and the seven ChiLCV-A–ChLCuB proteins. The whole workflow is illustrated in Fig. S1. We evaluated the top three results (two from ChiLCV-A proteins and one from ChLCuB-βC1) for binding energy and site-directed mutations.
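The HADDOCK score is a weighted sum of the energy terms listed above. The sketch below assumes the default weights that HADDOCK 2.4 documents for its final refinement stage (1.0 for van der Waals, 0.2 for electrostatics, 1.0 for desolvation, and 0.1 for restraint violation energy). The weights applied in this study are not stated here, so they are exposed as parameters, and the numbers in the example are hypothetical rather than values from Table 5.

```python
def haddock_score(e_vdw, e_elec, e_desolv, e_air,
                  w_vdw=1.0, w_elec=0.2, w_desolv=1.0, w_air=0.1):
    """Weighted sum of docking energy terms (all in the same energy units).

    Default weights are an assumption: the documented HADDOCK 2.4 defaults
    for the final refinement stage.
    """
    return (w_vdw * e_vdw
            + w_elec * e_elec
            + w_desolv * e_desolv
            + w_air * e_air)


if __name__ == "__main__":
    # Hypothetical energy terms for one cluster, for illustration only.
    print(haddock_score(e_vdw=-50.0, e_elec=-200.0, e_desolv=-10.0, e_air=20.0))
```

Because the electrostatic term is down-weighted, a complex with a very favourable electrostatic energy does not necessarily obtain the best overall score, which is why the individual terms are also reported separately.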

2.7 In-silico mutagenesis study

We performed site-specific mutagenesis on the host proteins that exhibited the best response to the ChiLCV-A and ChLCuB proteins. Using the int-id and ΔΔG outputs of interface alanine scanning, we predicted the sites suitable for alanine substitution [44]. We confirmed the substitution/deletion mutation type in PROVEAN by introducing alanine at these locations [45, 46]. A protein variant was predicted to have a “deleterious” effect if its PROVEAN score was below the specified threshold of − 2.5, and a “neutral” effect if the score was above the threshold [45, 46]. We used the DDMut tool [47] to evaluate the ΔΔG value for individual and multiple mutations and to produce the mutant protein structures. PRODIGY and PDBsum were used to predict the interface and binding energy of the mutated host proteins (ADK and GLO1) in complex with the viral proteins (CP, Rep, and βC1).
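The PROVEAN decision rule described above can be written in a few lines. This is a minimal sketch: the function name is ours, the scores in the example are hypothetical, and we assume that a score exactly equal to the threshold is treated as deleterious.

```python
PROVEAN_CUTOFF = -2.5  # threshold quoted in the text


def classify_provean(score, cutoff=PROVEAN_CUTOFF):
    """Label a variant from its PROVEAN score.

    Assumption: a score equal to the cutoff counts as deleterious.
    """
    return "deleterious" if score <= cutoff else "neutral"


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    for variant, score in [("X10A", -3.4), ("Y25A", -0.8)]:
        print(variant, score, classify_provean(score))
```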

3 Results

3.1 Prediction of physicochemical parameters of host and viral sequences

The basic physicochemical properties show that pre-CP, Rep, and betaC1 have a substantial number of negatively charged residues. At pH 7.4, these three sequences had net charges of around − 2.1, − 5.8, and − 7.9, with pI values of 6.64, 6.13, and 4.95, respectively, owing to an abundance of acidic residues. The host protein GLO1 has a net charge of ~ + 2.2 and a pI of 8.79, indicating a significant number of positively charged residues (Table 1). The instability index (a value < 40 indicates a stable protein) was favourable for the viral protein REn and the host protein GLO1, at 35.99 and 36.06, respectively [48]. To assess the thermostability of the protein sequences, we predicted the aliphatic index and the melting temperature (Tm). The REn, betaC1, ADK, and GLO1 proteins have aliphatic index values above 90; the aliphatic index is directly proportional to protein thermostability. Furthermore, all the proteins fell within the range of thermophilic proteins (45–75 °C) [49]. Whereas the other proteins have negative GRAVY values, indicating their hydrophilic nature, the positive values of REn and GLO1 indicate a hydrophobic nature and much higher thermostability than the other proteins (Table 1). The proteins SGS3 and Rep have high extinction coefficients, indicating a high content of aromatic amino acids, whose interactions contribute to the conformational stability of the protein structure [50].
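The link between acidic or basic residue content and net charge at pH 7.4 follows from the Henderson-Hasselbalch relation. The sketch below is illustrative only: the pKa values are one generic textbook-style set chosen by us, not the set used by ProtPi, so the numbers it produces will not reproduce Table 1 exactly, and the toy sequences are hypothetical.

```python
# Assumed side-chain and terminal pKa values (illustrative, not from ProtPi).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def net_charge(seq, ph=7.4):
    """Estimate the net charge of a peptide at a given pH."""
    seq = seq.upper()
    # Fraction protonated (charged +1) for basic groups.
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    for aa, pka in PKA_POSITIVE.items():
        positive += seq.count(aa) / (1.0 + 10 ** (ph - pka))
    # Fraction deprotonated (charged -1) for acidic groups.
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for aa, pka in PKA_NEGATIVE.items():
        negative += seq.count(aa) / (1.0 + 10 ** (pka - ph))
    return positive - negative


if __name__ == "__main__":
    # Hypothetical acidic and basic toy peptides.
    print(round(net_charge("DDEEGA"), 2), round(net_charge("KKRRGA"), 2))
```

A sequence rich in Asp and Glu gives a negative value at pH 7.4 and a low pI, as reported for betaC1, whereas a sequence rich in Lys and Arg gives a positive value, as reported for GLO1.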

Table 1 Physicochemical parameters of chilli and ChiLCV–ChLCuB encoded proteins

3.2 Homology modeling and domain prediction

SWISS-MODEL, Phyre2, and ChimeraX were used to build 3D models of the selected host proteins. Although several templates were generated, model_1 of each protein (ADK, SGS3, GLO1, and SKP1), with > 90% coverage, was considered suitable for homology modeling (Fig. 1). Model-template alignment between host and template proteins was also evaluated with AlphaFold v2 [51]: the SGS3 host sequence (GenBank accession number: XP_047268992) showed good coverage with suppressor of gene silencing 3 from Solanum lycopersicum (GenBank accession number: A5YVF1); SKP1 (GenBank accession number: AAX83944) showed high alignment with an SKP1-like protein from Glycine soja (GenBank accession number: A0A0R0LC97); ADK (GenBank accession number: XP_016542462) aligned with adenylate monophosphate kinase 1 from Oryza sativa (GenBank accession number: Q10S93); and GLO1 (GenBank accession number: KAF3683929) aligned with glycolate oxidase protein (GenBank accession number 1a18.1) (Fig. S2). Similarly, for the ChiLCV-A proteins (Pre-CP, CP, Rep, TrAP, REn, and C4) and ChLCuB-βC1, only model_1 of each protein was considered adequate for homology modeling (Fig. 2). The domain analysis shows that the Rep protein contains two domains (residues 7–119 and 126–230) that encode the replication enhancer protein, whereas every other viral protein contains a single domain (Fig. 3a and Table 2). Similarly, the two domains that make up the SKP1 protein of Capsicum annuum (6–121 aa and 107–154 aa) are engaged in the ubiquitination of target proteins, which is followed by their proteasomal degradation. Likewise, the SGS3 host protein has two domains, covering the 228–262 aa and 296–411 aa regions. The XS domain, which is associated with gene X and SGS3, contains a conserved aspartate residue and plays a role in post-transcriptional gene silencing (PTGS); an RRM-like RNA-binding domain is also predicted within the XS domain (Fig. 3b) (Table 2).

Fig. 1

Chilli proteins were modelled in Phyre2 and displayed using the ChimeraX software to show the domain network; colour denotes the secondary structures

Fig. 2

ChiLCV–ChLCuB proteins were modelled in SWISS-MODEL and visualized with the ChimeraX tool to illustrate the domain network; the colour represents the confidence gradient between the model and the template

Fig. 3

Domain prediction within: a host proteins ADK, GLO1, SGS3, and SKP1; b ChiLCV–ChLCuB encoded proteins Pre-CP, CP, Rep, TrAP, REn, C4, and betaC1

Table 2 Domain range and function prediction of ChiLCV–ChLCuB and chilli encoded proteins using conserved domain database

3.3 Model quality assessment

ProSA-web, QMEAN v4.3.0, SAVES v6.0, and PDBsum were used to evaluate the quality of all protein models generated by SWISS-MODEL, Phyre2, and ChimeraX (Tables 3 and 4). The ProSA Z-score was calculated by ProSA-web, whereas the QMEAN4 score was obtained from the QMEAN server. The ERRAT score, overall G-factor, and percentage of residues in allowed regions were determined by the SAVES v6.0 server, the last two from the Ramachandran plot. A model Z-score within the range of values typically observed for native proteins of similar size indicates good model quality, whereas a score outside this range indicates an erroneous structure [29]. All host and viral protein models have Z-scores within the range of experimentally determined protein structures, indicating acceptable model quality. The QMEAN score assesses the “degree of nativeness” by comparing the quality of the model with that of experimentally determined structures. The result is a QMEAN4 value, with higher scores indicating a more reliable model [30]. In the current study, the ADK, SGS3, and SKP1 host protein models created by ChimeraX had the highest QMEAN4 scores, followed by those from SWISS-MODEL and Phyre2. Among the begomovirus protein models, the Phyre2 models had the highest QMEAN scores, followed by ChimeraX. According to the ERRAT parameter, all host and virus protein models from SWISS-MODEL were acceptable, followed by those from ChimeraX. An ERRAT value of 95% or more indicates an acceptable high-resolution model [52]. For the Ramachandran plot evaluation, an overall G-factor greater than − 0.5 indicates good model quality [52]. Except for Phyre2’s ADK host model, the host and virus models from all servers were of good quality (Tables 3 and 4).

Table 3 Model quality assessment of the chilli proteins ADK, SGS3, GLO1, and SKP1 by the ProSA-web, QMEAN, and SAVES v6.0 servers
Table 4 Model quality assessment of the six ChiLCV-A proteins and ChLCuB-βC1 by the ProSA-web, QMEAN, and SAVES v6.0 servers

We generated Ramachandran plots using PDBsum for all host-modelled proteins that scored more than 90% in the most favoured region (Fig. 4). The phi–psi torsion angles of each residue in the structure are shown on the Ramachandran plot. Glycine residues are shown separately as triangles because they are not restricted to the plot regions appropriate to the other side-chain types [53] (Fig. 4). The colour shading of the plot denotes the various regions, and the “core” regions, those with the highest concentration of phi–psi combinations, are shown by the deepest red patches. Ideally, more than 90% of residues lie in the “core” regions. The proportion of residues in the “core” regions is an excellent indicator of a protein structure’s stereochemical quality [54]. Only four proteins (ADK, SGS3, GLO1, and SKP1) were chosen because they affect critical biological processes in plants and received favourable scores of 91.4%, 91.1%, 94.9%, and 91%, respectively (Fig. 4). Ramachandran plots were also generated for the modelled ChiLCV-A encoded proteins (Pre-CP, CP, Rep, TrAP, REn, and C4) and ChLCuB-βC1. We identified disallowed regions of < 1%, showing that the models are good according to an earlier study [55] (Fig. S3).
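The acceptance criteria quoted in this section (ERRAT of 95% or more, overall G-factor greater than − 0.5, and more than 90% of residues in the core Ramachandran regions) can be combined into a simple screening step. The sketch below is only an illustration of that logic: the class and function names are ours, and apart from the 91.4% core value reported for ADK, the example numbers are hypothetical placeholders rather than entries from Tables 3 and 4.

```python
from dataclasses import dataclass


@dataclass
class ModelQuality:
    name: str
    errat: float      # ERRAT overall quality factor, in percent
    g_factor: float   # overall G-factor from the Ramachandran analysis
    core_pct: float   # residues in the most favoured (core) regions, percent


def quality_flags(model):
    """Check one model against the three thresholds quoted in the text."""
    return {
        "errat_ok": model.errat >= 95.0,
        "g_factor_ok": model.g_factor > -0.5,
        "ramachandran_ok": model.core_pct > 90.0,
    }


def passes_all(model):
    return all(quality_flags(model).values())


if __name__ == "__main__":
    # ERRAT and G-factor are hypothetical; 91.4 is the ADK core value above.
    adk = ModelQuality("ADK", errat=96.0, g_factor=-0.2, core_pct=91.4)
    print(adk.name, quality_flags(adk), passes_all(adk))
```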

Fig. 4

Ramachandran plots for the four chilli proteins (ADK, GLO1, SGS3, and SKP1), with statistics showing more than 90% of residues in the most favoured region

3.4 Protein preparation and active site determination

Each protein was prepared for docking in AutoDock Vina v1.5.7 [36] by adding polar hydrogens, to stabilize the protein–ligand complex, and by removing water molecules from the structure, to clear the protein binding site [37]. We used DoGSiteScorer to identify the active site for begomoviral protein binding. Appropriate ligand-binding sites were chosen for docking using the simple score and the drug score. Fig. S4 shows the active-site amino acid residues that were used for grid box formation in docking. Both the amino acid and the element data indicated that the ADK, GLO1, SGS3, and SKP1 proteins had extended active pockets (Fig. 5 and S4).
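Grid box formation around the predicted active-site residues amounts to finding a box that encloses their coordinates. A minimal sketch of that calculation is given below, assuming the atom coordinates of the active-site residues are already available as (x, y, z) tuples in ångströms. The function name and the 5 Å padding are our own illustrative choices, not settings reported for this study.

```python
def grid_box(coords, padding=5.0):
    """Return (center, size) of an axis-aligned box enclosing the coordinates.

    coords  : iterable of (x, y, z) tuples in angstroms
    padding : extra margin added on every side (assumed value)
    """
    xs, ys, zs = zip(*coords)
    center = tuple((max(axis) + min(axis)) / 2.0 for axis in (xs, ys, zs))
    size = tuple((max(axis) - min(axis)) + 2.0 * padding for axis in (xs, ys, zs))
    return center, size


if __name__ == "__main__":
    # Hypothetical coordinates of three active-site atoms.
    pocket = [(12.0, 4.5, -3.0), (18.5, 9.0, 1.5), (15.0, 6.0, -1.0)]
    center, size = grid_box(pocket)
    print("center:", center)
    print("size:", size)
```

The resulting centre and dimensions are the values that docking programs such as AutoDock Vina expect when the search space is defined manually.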

Fig. 5

Active site detection through DoGSiteScorer among modelled plant proteins. a ADK protein with eighty-two active site residues, b GLO1 with fifty-one active site residues, c SGS3 with twenty active site residues, and d SKP1 with twenty-one active site residues (Fig. S4)

3.5 Chilli-begomovirus protein interaction prediction (PPI)

We docked each of the host proteins used in this investigation with each ChiLCV-A and ChLCuB encoded protein to compare their activity, predicting the binding energies of each viral protein with all four host proteins using HADDOCK v2.4 [39, 40] and PyRx [41] to assess the reliability of the docking predictions. The binding energies and HADDOCK scores were calculated, and results that were negative or close to zero were taken to indicate the best binding affinity and interaction efficiency [56]. The HADDOCK score was used to identify clusters, and the Z-score was used to rank them according to how far they deviated from the mean value. A lower Z-score indicates that the cluster outperformed the average. The lowest binding energy values were computed for ADK-CP, ADK-βC1, GLO1-Rep, ADK-REn, ADK-TrAP, GLO1-Pre-CP, and ADK-C4. HADDOCK scores were highest with the SGS3 host protein for the Rep, C4, CP, and Pre-CP proteins, with SKP1 for TrAP, and with GLO1 for REn (Table 5). Furthermore, van der Waals and electrostatic energies were found to play a significant role in protein–protein interaction and recognition [57,58,59,60]. Electrostatic analysis revealed that SGS3-Rep, SKP1-TrAP, GLO1-REn, ADK-C4, GLO1-CP, SGS3-Pre-CP, and SKP1-βC1 performed best. Similarly, the best van der Waals results were for GLO1-Rep, ADK-TrAP, SGS3-REn, SGS3-C4, ADK-CP, and GLO1-Pre-CP (Table 5). The negative sign of these energies implies that an attractive force between the two molecules drives the interaction. The desolvation energy, the energy needed to separate solvent molecules from the solute, was also calculated. Negative desolvation energy is commonly reported for hydrophobic interactions and was found in all viral protein complexes, particularly those involving SGS3. Restraints are used to specify how molecules should interact during docking simulations (such as in HADDOCK). Essentially, the closer the restraint violation energy is to zero, the better the structure satisfies the imposed restraints, as demonstrated by SKP1-TrAP (Table 5).
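The cluster Z-score used for ranking expresses how many standard deviations a cluster’s score lies from the mean score of all clusters. The following sketch shows the calculation under the assumption that the population standard deviation over all clusters is used; the exact convention of the HADDOCK server is not restated here, and the cluster names and scores are hypothetical.

```python
from statistics import mean, pstdev


def cluster_z_scores(scores):
    """Map each cluster name to (score - mean) / standard deviation.

    Assumption: population standard deviation across all clusters.
    """
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all clusters have the same score; Z-score undefined")
    return {name: (score - mu) / sigma for name, score in scores.items()}


if __name__ == "__main__":
    # Hypothetical HADDOCK scores for three clusters of one complex.
    example = {"cluster_1": -100.0, "cluster_2": -80.0, "cluster_3": -60.0}
    z = cluster_z_scores(example)
    best = min(z, key=z.get)  # most negative Z-score ranks first
    print(z, best)
```

Because lower HADDOCK scores are better, the cluster with the most negative Z-score is the one that outperforms the average by the widest margin.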

Table 5 Binding energy, HADDOCK score, energy terms, and Z-score for the host-viral protein complexes

Using PyRx, we observed that the lowest binding energies for ADK were with TrAP (− 14.8 kcal/mol) and Pre-CP (− 13.3 kcal/mol). For GLO1, REn showed the strongest binding, at − 13.7 kcal/mol, while SGS3 had a value of − 12.5 kcal/mol with C4. Similarly, SKP1 showed favourable results for both REn and C4, with binding energies of − 14 kcal/mol and − 14.2 kcal/mol. The ChLCuB-βC1 interactions with ADK, GLO1, and SKP1 had binding energies of − 15.5 kcal/mol, − 14.4 kcal/mol, and − 16.5 kcal/mol, respectively, which were lower than those of the ChiLCV-A proteins. The lower the binding energy, the higher the binding efficiency, resulting in increased inhibition (Table 6).
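Selecting the strongest partner for each host protein is a matter of taking the most negative binding energy per host. The short sketch below does this for the PyRx values quoted in the paragraph above; it covers only those quoted values, not the full content of Table 6, and the function name is ours.

```python
# (host, viral protein, binding energy in kcal/mol) as quoted in the text.
PYRX_ENERGIES = [
    ("ADK", "TrAP", -14.8), ("ADK", "Pre-CP", -13.3), ("ADK", "betaC1", -15.5),
    ("GLO1", "REn", -13.7), ("GLO1", "betaC1", -14.4),
    ("SGS3", "C4", -12.5),
    ("SKP1", "REn", -14.0), ("SKP1", "C4", -14.2), ("SKP1", "betaC1", -16.5),
]


def strongest_partner(records):
    """Return {host: (viral protein, energy)} with the lowest energy per host."""
    best = {}
    for host, viral, energy in records:
        if host not in best or energy < best[host][1]:
            best[host] = (viral, energy)
    return best


if __name__ == "__main__":
    for host, (viral, energy) in strongest_partner(PYRX_ENERGIES).items():
        print(f"{host}: {viral} ({energy} kcal/mol)")
```

For the quoted values, betaC1 is the strongest partner of ADK, GLO1, and SKP1, which matches the statement that the ChLCuB-βC1 energies were lower than those of the ChiLCV-A proteins.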

Table 6 Binding energy between proteins encoded by ChiLCV–ChLCuB and chilli plants, calculated using the PyRx tool

3.6 In-silico interface prediction

The computational analysis produced residue-level binding hotspot predictions at the individual and collective levels (Fig. 6). The colours of the residues show which amino acids are positive, negative, neutral, and so on. The results suggest that ADK (34 res.)–CP (27 res.) and GLO1 (28 res.)–Rep (31 res.) had the highest numbers of residues involved in the interface. ADK (1602 Å²)–CP (1660 Å²) had the largest interface area involved in complex formation (Fig. 6). Several types of bond stabilized the PPI complexes; we examined hydrogen bonds, disulphide bonds, and salt bridges. The complexes with the highest numbers of hydrogen bonds between residues were GLO1-Rep, ADK-C4, and ADK-betaC1, with 19, 19, and 18, respectively. In contrast, the ADK-CP, GLO1-Rep, and ADK-betaC1 complexes displayed the greatest number of salt bridges (i.e., 5). To validate the PDBsum interface predictions, we used the UCSF ChimeraX tool (AlphaFold) to examine the precise interactions in the top seven protein docking results (Fig. 7a, b). Combining the full PPI results from HADDOCK with the interface prediction findings, including the binding energy, the number of residues involved, the interface area, the total number of hydrogen bonds, and the non-bonded contacts, we obtained the best results for the ADK-CP, GLO1-Rep, and ADK-betaC1 interactions. According to our subsequent analysis of the PPI, the two host proteins most frequently involved were ADK and GLO1. Because the domain ranges of these two proteins span most of the protein length, residues from these domains were probably the ones primarily engaged in the interaction (Figs. 3a, b and 6).

Fig. 6

Interface prediction for the top seven PPI complexes: GLO1-Rep, ADK-TrAP, ADK-REn, ADK-C4, ADK-CP, GLO1-pre-CP, and ADK-betaC1. The purple circle represents the chilli protein, while the red circle represents the viral protein. The residues involved at the interface are shown along with salt bridges, hydrogen bonds, and other non-bonded interactions. The different colours of residues and bonds indicate the nature of the residues and the type of bonds

Fig. 7

Protein–protein interaction in the seven best complexes, displaying the interface residues: green for chilli proteins and pink for viral proteins. a The top three PPI complexes of the study: ADK-CP, GLO1-Rep, and ADK-betaC1. b The interface residues of the PPI complexes GLO1-Pre-CP, ADK-TrAP, ADK-REn, and ADK-C4

3.7 Site-specific mutagenesis

Three PPI complexes (ADK-CP, GLO1-Rep, and ADK-βC1) were chosen based on the docking and interface outcomes. On examining the docking results, we observed that all of the viral proteins had the strongest binding affinity with ADK, except Rep and pre-CP, which exhibited the best interaction with GLO1. We therefore performed substitution/deletion mutagenesis on both the ADK and GLO1 proteins. Interface alanine scanning showed that substitution/deletion of interface residues weakened the interactions between ADK and CP, GLO1 and Rep, and ADK and βC1. In this computational mutagenesis study, the host protein is indicated by chain “A”, while the viral protein is represented by chain “B”. The amino acids from the domains of ADK and GLO1 are primarily represented by blue-coloured residues in pdb# (Tables S1, S2 and S3), while the yellow-highlighted positions were not used for further mutation because they contained serine or had a negative ΔΔG value. For some of these residues the int-id value is zero, signifying amino acids that are not in contact with the interaction partners CP, Rep, and βC1 (Tables S1, S2 and S3). In addition, the ΔΔG values were favourable, reflecting low binding affinity for the residues of the viral protein. To validate this outcome, PROVEAN was applied to predict the effect of the amino acid substitutions on function (Tables S4 and S5), and the DDMut tool was employed to estimate the impact of the alterations on protein stability. After introducing alanine, we found that the ADK sites (T71A, D86A, K89A, K96A, N123A, and E168A) all showed positive ΔΔG for both single and multiple mutations (i.e., 1.07) in DDMut, indicating that the protein remained stable after mutation (Tables S6 and S7). A PROVEAN score above − 2.5 indicated a neutral mutation. Similarly, the PROVEAN and DDMut tools yielded positive results for the alanine substitutions E12A, K16A, K17A, and N41A in GLO1.
Moreover, the alanine substitutions caused no change in the conformation of the GLO1 and ADK proteins (Fig. 8 and S5). We then used PDBsum and the PRODIGY web server [61] to examine the binding energy and the interacting interface residues of the substituted proteins. For the mutant ADK-CP, mutant ADK-βC1, and mutant GLO1-Rep complexes, we identified declines in binding energy of − 1.2 kcal/mol, − 2 kcal/mol, and − 3 kcal/mol, respectively (Table 7). Additionally, in the ADK interface prediction, none of the substituted residues showed any contact except D86A (with CP) and T71A (with βC1). Further, upon substitution, the number of disulphide bonds between ADK and βC1 dropped to zero. In contrast, the GLO1 mutations did not produce an acceptable outcome, since the viral protein continued to interact with the substituted alanine residues, although the number of hydrogen bonds was reduced by five. Overall, the stable ADK structures upon substitution, together with positive ΔΔG values and decreased binding energy, make ADK the better candidate for mutagenesis research.
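Substitution codes such as T71A follow the convention wild-type residue, 1-based position, new residue. The sketch below shows one way to apply a list of such codes to a sequence while checking that the stated wild-type residue is really present at that position. The helper name and the toy sequence are hypothetical; applying the ADK or GLO1 substitutions listed above would require the actual protein sequences from GenBank.

```python
import re

# Wild-type residue, 1-based position, substituted residue (e.g. "T71A").
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence, mutations):
    """Return the sequence with each point substitution applied in turn."""
    residues = list(sequence.upper())
    for code in mutations:
        match = _MUTATION.match(code)
        if match is None:
            raise ValueError(f"cannot parse mutation code: {code!r}")
        wild, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{code}: position outside the sequence")
        if residues[position - 1] != wild:
            raise ValueError(
                f"{code}: expected {wild} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


if __name__ == "__main__":
    toy = "MTDKKN"  # hypothetical sequence, not ADK or GLO1
    print(apply_substitutions(toy, ["T2A", "D3A", "K4A"]))
```

The wild-type check guards against numbering offsets between the modelled structure and the full-length sequence, a common source of error when mutation lists are transferred between tools.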

Fig. 8

Comparing the wild-type ADK and GLO1 protein models to the mutants, along with the alanine substitution site in the amino acid sequences

Table 7 Details of site-directed mutagenesis at different sites of the ADK and GLO1 proteins

4 Discussion

Viral pathogens affect the physiological features of infected plants, allowing for a variety of host-virus interactions that can lead to host infection and disease onset. Crop losses caused by viral diseases cost more than $30 billion per year [62]. Currently, over 166 viruses, including begomoviruses, are known to infect chillies, making them the most serious threat to worldwide chilli production [9]. Chilli contains antioxidants that reduce the risk of cardiovascular disease, cancer, cataracts, and macular degeneration [63], and ChiLCD caused by begomoviruses has been shown to destroy these commercially valuable and nutritious fruits on a large scale [64,65,66]. Modelling and analysis of multifunctional protein–protein interactions through bioinformatics approaches are now being used to gather knowledge about the interactions between the virus and the host plant. This comprehensive computational approach attempts to shed fresh light on the binding processes involved in viral disease. Interface prediction can be used to anticipate the interactions between chilli proteins and begomoviral proteins. Furthermore, it helps to identify the interaction domain residues responsible for binding efficiency [18, 19].

The plant proteins included in this study are critical components of a plant’s biological system in the following pathways: adenylate kinase phosphotransferase, the glycolytic pathway, phytohormone signalling pathways via protein degradation, and post-transcriptional gene silencing (PTGS) (Table 2). The virus normally relies on the host protein machinery to survive. This work suggests how this interplay contributes to the development of ChiLCD and increased viral infection. The study combined physicochemical analysis, homology modelling, site-directed mutagenesis, and the prediction of protein interactions between the begomovirus (ChiLCV–ChLCuB) and the chilli proteins (ADK, GLO1, SKP1, and SGS3). According to Waterhouse et al. [25], homology modeling has emerged as a critical structural biology tool that has substantially contributed to bridging the gap between available protein sequences and experimentally determined structures. Before applying homology modeling to these proteins, a thorough understanding of the essential physicochemical properties of the amino acid sequences was required to properly explain allosteric regulation, enzyme catalysis, protein folding, and protein recognition [67, 68]. Protein stability is crucial in a variety of biological processes and biotechnological applications. A protein’s thermodynamic stability affects its folding, assembly, and function; thus, alterations in stability can have a significant impact on protein properties and are frequently related to disease. As a result, protein stability analysis is an important aspect of protein physiology and structural biology [22]. The physicochemical results of the study indicated the thermodynamic stability of all viral and host proteins.

In silico protein structure prediction can help in exploring protein–protein interactions, for example through ligand binding site determination, which would otherwise take a long time to investigate experimentally. Combining other computational approaches with protein structure prediction can also improve understanding of the effect of amino acid substitution on the final model. Using three bioinformatic tools (Phyre2, ChimeraX, and SWISS-MODEL), 3D homology modeling was performed using the best template-target alignment for viral and host proteins. When comparing the ProSA Z-score, QMEAN4 score, ERRAT%, and G-factor of the host and virus-modeled proteins, ChimeraX outperformed SWISS-MODEL and Phyre2 [52]. However, in PDBsum, the chilli proteins predicted in Phyre2 had > 90% of residues in the most favoured region. The proportion of residues in the “core” regions is one of the better indicators of a protein structure’s stereochemical quality [54], with a value over 90% being excellent. According to Nahak et al. [55], the model is effective when the disallowed area is less than 2%. In the case of viral proteins, the disallowed area exceeded 1%.

Several investigations have shown that the host ADK and SKP1 (a component of the SCF complex) proteins interact with the viral TrAP protein [69,70,71], and that SGS3 interacts with Pre-CP [72]. Recent molecular and bioinformatic research suggests that the Cotton leaf curl Multan betasatellite (CLCuMB)-βC1 overexpresses the Gossypium hirsutum calmodulin-like protein 11 (Gh-CML11) protein, which provides calcium for virus spread [19]. Similarly, the CLCuMB-βC1 interaction involves the autoinhibitory sequence (AIS) and ubiquitin-associated domain (UBA) of G. hirsutum’s SnRK1 [18]. This study focused on the best-performing protein–protein complexes, examining aspects such as interface prediction and binding energy using PyRx and HADDOCK. The docking analysis revealed that the ADK-CP, GLO1-Rep, and ADK-βC1 complexes had the highest binding energy, interface residues, and hydrogen bonds among the seven viral proteins. These were followed by the ADK-TrAP, ADK-REn, GLO1-Pre-CP, and ADK-C4 PPI complexes. Domain assessments revealed that the residues implicated in binding came from the conserved regions of the viral proteins as well as of ADK and GLO1. This highlights the relevance of the PPI complex between the virus and the host in several transcriptional pathways. As a result, viral titers rise, facilitating virus encapsidation and proliferation within the host cell and establishing pathogenicity [15]. The top three PPI complexes (ADK-CP, GLO1-Rep, and ADK-βC1) were used in a site-directed mutagenesis investigation, substituting alanine for predicted residues at the interface. The results revealed a favorable decrease in binding energy for ADK and almost no interactions involving the substituted alanine residues at the interface.
This gives researchers a broader range of options for confirming site-directed mutagenesis through molecular testing.

Viral disease may severely harm the novel crop varieties grown in India, as well as their promising yields [73, 74]. According to this computational work, structural biology may be able to predict the binding sites of the ChiLCV–ChLCuB protein and the protein found in chillies. This would be an effective and efficient way to acquire preliminary insights into all potential protein interactions and mutation studies connected to the complex of diseases known as chilli leaf curl disease.

5 Conclusions

Our findings open up new paths for structure-function research and provide fresh insights into begomovirus–chilli interactions at the molecular level. The study demonstrates that a bioinformatics approach can be used to predict potential protein binding sites in both host-encoded and viral proteins. Furthermore, the results indicate that ADK binds ChiLCV-CP and ChLCuB-βC1 via their domain regions. This study suggests that begomoviral proteins interact with satellite proteins, increasing their pathogenicity. To weaken the predicted binding with ChiLCV-CP, ChiLCV-Rep, and ChLCuB-βC1, residues in the ADK and GLO1 domains can be deleted or mutated at specified sites. This may be a superior alternative for developing viral resistance while maintaining the stability of the host protein both structurally and functionally.