Introduction

The outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in 2019 was soon declared a pandemic by the World Health Organization (WHO), and the infection is commonly referred to as coronavirus disease 2019 (COVID-19). The genome of SARS-CoV-2, a positive-sense single-stranded RNA virus of length 29.8–29.9 kb, encodes four structural and 16 non-structural proteins. Among the four structural proteins, envelope (E), membrane (M), nucleocapsid (N), and spike (S), the infectivity of SARS-CoV-2 relies profoundly on the S protein. The S protein comprises 1273 amino acids and has two distinct subunits, S1 and S2. The S1 subunit (residues 14–685) contains the receptor-binding domain (RBD), and the S2 subunit (residues 686–1272) acts as a linker between the S1 subunit and the viral envelope [1]. During viral entry, the RBD interacts with the angiotensin-converting enzyme 2 receptor (ACE2R) expressed on the surface of the host cells, followed by the fusion of the S2 subunit with the cell membrane, which allows the virus to enter the cell and replicate [2]. The affinity between the S protein and ACE2R is very high in the case of SARS-CoV-2, which enhances the strain's infectivity. Thus, SARS-CoV-2 has become the most contagious of the coronaviruses known so far.

Although COVID-19 is highly contagious and has a low fatality rate of 1.9%, the greater concern is the post-COVID complications. In COVID-19, the host immune response, rather than the viral particle itself, contributes hugely to the disease severity and post-recovery complications [3]. Therefore, adopting appropriate strategies to prevent infection is an effective way to tackle the menace and prevent disease-associated complications. Accordingly, vaccination against SARS-CoV-2 has been adopted worldwide to contain the spread of the virus. However, the rapidly evolving nature of SARS-CoV-2 raises concern over the effectiveness of such measures.

Since its outbreak, SARS-CoV-2 has continuously evolved, acquiring alterations in its amino acid sequences that sustain its infectivity and contagious nature. The steady changes in its genetic material have resulted in the emergence of variants with enhanced transmissibility, severity, and aggressiveness. The ability to escape diagnostics and resist vaccines has also increased during the evolution of variants. Because of their ability to initiate a new pandemic wave, WHO has listed some of these as 'variants of concern' (VOCs). As of November 2022, five variants, alpha, beta, gamma, delta, and omicron, were categorized as VOCs. These strains have predominantly acquired mutations in the S protein, which enhanced their infectivity and lethality [4].

Alpha was the first variant, reported from the United Kingdom, with 17 mutations in its genome. Of these 17, eight are located in the S protein. The RBD contains only one mutation, namely N501Y. These mutations have made the alpha variant more infectious and around 30 to 50% more contagious than the native strain. So far, the alpha variant has been identified in at least 114 countries around the globe [5, 6]. The beta variant of SARS-CoV-2 was reported from South Africa with 21 mutations in its genome. The S protein of the beta variant contains nine mutations. Of these, three are in the RBD, namely K417N, E484K, and N501Y. These mutations have made the beta variant 50% more contagious than the native strain. Further, the modifications have enhanced the affinity of the S protein towards ACE2R. The E484K mutation within the RBD is thought to mediate the viral immune evasion property. So far, the variant has been detected in at least 48 countries [7, 8]. The emergence of the gamma variant was reported in Brazil. This variant harbours 21 mutations in its genome, with 10 mutations in the S protein. Among them, three are located within the RBD and are similar to those of the beta variant. Though no substantial evidence has been available to support its enhanced lethality, studies have found that the gamma variant is 2.5 times more contagious than the native strain [9, 10]. The delta variant was first reported from India with completely new mutations that do not match the ones present in the previously catalogued variants. The genome contains about 19 mutations, among which 10 are located in the S protein. The RBD has only two mutations, namely L452R and T478K. These mutations significantly enhanced the rate of viral replication, transmission, and immune evasion [11]. From July 2021 until the detection of omicron, the delta variant was the dominant strain of infection, reported in more than 130 countries. Omicron is the most recent variant, first reported in South Africa. It carries the highest number of mutations among the VOCs, with 60 mutations in its genome. The S protein of the omicron variant contains 34 mutations, of which 15 are located in the RBD. Omicron dominates other variants, with the highest reported immune evasion property and transmissibility [12].

The usage of medicinal plants and formulations in COVID-19 prevention and management has gained exponential attention. Due to the continued demand for treatments of COVID-19 symptoms, the use of traditional medicinal plants was ascertained as an alternative solution. Phytochemicals constitute a significant reservoir for treating most viral infections, as documented in ancient literature on Ayurveda and other traditional medicinal systems. Investigations to date have proposed the following phytoconstituents as potential SARS-CoV-2 S protein RBD binders: diacetyl curcumin, dicaffeoylquinic acid, epigallocatechin gallate, theaflavin gallate, bis-demethoxy curcumin, tellimarganolin-II, o-Demethyl-demethoxy-curcumin, racemoside A, ashwagandhanolide, withanoside, racemoside C, tinocordiside, glycyrrhizic acid, limonin, 7-deacetyl-7-benzoylgedunin, maslinic acid, corosolic acid, obacunone, ursolic acid, camostat, favipiravir, tenofovir, raltegravir, and stavudine [13,14,15,16,17,18,19]. The primary disadvantage of these studies is that the evaluation had been carried out for a single plant or phytoconstituent(s) against the RBD of native SARS-CoV-2. Since the RBD in different SARS-CoV-2 variants varies in their affinity towards ACE2R, it becomes necessary to test the efficacy of molecules against all identified variants.

The availability of structural information on SARS-CoV-2 proteins has maximized the possibility of employing structure-based drug designing to develop COVID-19 therapeutics. The pressing priority is to exploit the accessible information to identify chemical moieties that could prevent the pathogenesis of SARS-CoV-2 at an earlier stage, regardless of the variants. Alternative to SARS-CoV-2 main protease, the favoured target, this study considers the RBD of S protein as an attractive target, as the interaction between the RBD and ACE2R initiates viral pathogenesis. The primary aim is to identify potential compounds in a pool of phytochemicals that could influence viral entry and help to treat/manage COVID-19. This study employs comprehensive computational methods comprising multistep molecular docking, pharmacokinetics profile assessment, network pharmacology analysis, molecular dynamics, and MM/GBSA-based binding free energy estimation to identify potent inhibitors of RBD. The discovery of potent inhibitors for SARS-CoV-2 proteins has been successfully demonstrated by employing such computational techniques [20,21,22]. This study also provides valuable insights into the interactions of potential compounds with RBD that can direct future drug-developing research against COVID-19.

Methods and materials

Structural modelling of VOCs

The Cartesian coordinates for "SARS-CoV-2 Spike protein RBD complexed with human ACE2R" were obtained from the Protein Data Bank (PDB ID: 6LZG). The structure of RBD in the complex was used as a template for mutational modelling. Initially, 15 model structures of RBD of S protein of different VOCs were generated using Modeller10v1 [23, 24]. The quality of the generated models was evaluated using the DOPE (Discrete Optimized Protein Energy) score [25] and GA341 score [26]. Further stereochemical validation was also performed using PROCHECK [27], Verify3D [28], and SWISS-MODEL structure assessment server [29, 30] (https://swissmodel.expasy.org/assess). The best score-fitting model was selected for each VOC. Furthermore, the best possible side chain conformation of mutated residues structures was obtained using the energy minimization in GROMACS v5.1 [31]. Finally, the refined models were superimposed on their corresponding highly identical template structure to crosscheck.

Dataset preparation

A total of 9596 phytochemical compounds were retrieved from Indian Medicinal Plants, Phytochemistry, and Therapeutics (IMPPAT), the largest curated database of phytochemicals of Indian medicinal plants to date [32]. The compounds were obtained from the database as PDB coordinates and converted to PDBQT format with Open Babel release 2.4.0 [33].

In silico pharmacokinetic, ADME/T, and drug-likeness analyses

The compounds were subjected to ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) filters, and the 1576 compounds that strictly satisfied them were considered for virtual screening. The chemical druggability filters used were as follows: Lipinski's Rule of Five (RO5) [34], Veber rule [35], and Egan rule [36] for oral bioavailability; and GlaxoSmithKline 4/400 rule (GSK 4/400) [37] and Pfizer 3/75 rule [38] for safety profile. The successful compounds do not violate Lipinski's RO5 and satisfy the other rules mentioned above. The identified potential compounds were considered for pharmacokinetic assessment through the SwissADME tool (http://www.swissadme.ch) [39]. Their physicochemical properties were evaluated via the Molinspiration server [40] (https://www.molinspiration.com), and their toxicity was analysed using the ProTox-II server [41] (https://tox-new.charite.de/protox_II/). Parameters such as lipophilicity, water solubility, oral bioavailability, GI absorption, lethal dosage, and synthetic accessibility were recorded.
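The rule-based filtering described above can be illustrated with a minimal sketch. It assumes that the descriptors for each compound have already been computed elsewhere, and it uses the commonly quoted cut-offs for each rule; the exact thresholds and software settings applied in this study are not stated, so the `Descriptors` class, the function names, and the example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    mol_weight: float   # Da
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen bond donors
    h_acceptors: int    # hydrogen bond acceptors
    rot_bonds: int      # rotatable bonds
    tpsa: float         # topological polar surface area, in square angstroms


def lipinski_ok(d: Descriptors) -> bool:
    # Rule of Five, applied strictly (no violations allowed).
    return (d.mol_weight <= 500 and d.logp <= 5
            and d.h_donors <= 5 and d.h_acceptors <= 10)


def veber_ok(d: Descriptors) -> bool:
    return d.rot_bonds <= 10 and d.tpsa <= 140


def egan_ok(d: Descriptors) -> bool:
    return d.logp <= 5.88 and d.tpsa <= 131.6


def gsk_4_400_ok(d: Descriptors) -> bool:
    return d.logp <= 4 and d.mol_weight <= 400


def pfizer_3_75_ok(d: Descriptors) -> bool:
    # Compounds with high logP AND low TPSA are flagged as higher risk.
    return not (d.logp > 3 and d.tpsa < 75)


RULES = (lipinski_ok, veber_ok, egan_ok, gsk_4_400_ok, pfizer_3_75_ok)


def passes_all(d: Descriptors) -> bool:
    """True only when every rule is satisfied."""
    return all(rule(d) for rule in RULES)


# Hypothetical descriptor values, for illustration only.
example = Descriptors(350.0, 2.5, 2, 5, 4, 90.0)
greasy = Descriptors(380.0, 4.5, 1, 3, 3, 40.0)
print(passes_all(example), passes_all(greasy))
```

Requiring all rules at once is the strictest reading of the text; a looser scheme that tolerates one violation would retain more compounds.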

High throughput virtual screening (HTVS)

The HTVS was performed using the PyRx tool, an open-source software for docking ligand libraries against a protein to find lead compounds with desired biological function [42]. PyRx utilizes AutoDockTools to generate input files; AutoDock4 and Vina are used as docking software. Screening experiments were carried out per prior recommendations on data preparation, docking, and data analysis.

The MM/GBSA analysis

The MM/GBSA (Molecular Mechanics/Generalized Born Surface Area) free energy decomposition analysis was performed in the HawkDock server [43] to calculate per-residue free energy contributions of the SARS-CoV-2 RBD and ACE2R complex. The HawkDock server integrates ATTRACT docking algorithm, HawkRank scoring function, and MM/GBSA free energy decomposition analysis into a single multi-functional platform to highlight the critical residues for protein–protein interactions.

Molecular docking

Based on the ligand binding energy from docking-based virtual screening (PyRx tool), the 30 top-scoring compounds were re-docked with more ligand conformations using AutoDock4 [44] to explore their binding mode and the key interactions involved. The AutoGrid program was used for the preparation of grid maps. The grid spacing was 1.0 Å, and the box size was 40 Å in the x, y, and z dimensions. The grid centre was placed (x: -11.993, y: 15.425, and z: 65.951 Å) at the centre of the RBD interface bound to ACE2R in the crystal structure. The protein and the ligands were assigned Kollman and Gasteiger-Huckel charges, respectively. The Lamarckian Genetic Algorithm (LGA) with 200 ligand conformations and other default settings was adopted to search for the best conformers. For each ligand, a representative conformer was taken from the cluster with the lowest binding energy and the largest number of conformers.
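As a rough illustration of the grid geometry, the sketch below derives the box corners and the number of grid intervals from the reported centre, edge length, and spacing. It assumes that 40 Å is the full edge length of a cubic box centred on the stated coordinates; the function names are hypothetical and are not part of AutoGrid.

```python
def grid_box(center, edge, spacing):
    """Return (lower corner, upper corner, intervals per axis) of a cubic box."""
    half = edge / 2.0
    lower = tuple(c - half for c in center)
    upper = tuple(c + half for c in center)
    intervals = round(edge / spacing)
    return lower, upper, intervals


def inside(point, lower, upper):
    """True when the point lies within the box on every axis."""
    return all(lo <= p <= hi for p, lo, hi in zip(point, lower, upper))


# Values reported in the text above.
center = (-11.993, 15.425, 65.951)
lower, upper, intervals = grid_box(center, edge=40.0, spacing=1.0)
print(lower, upper, intervals)
```

A check such as `inside(atom_xyz, lower, upper)` can be used to confirm that every interface residue of interest falls within the search space before docking.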

Molecular dynamics simulations

Molecular dynamics (MD) simulations were performed to analyse the stability and dynamic properties of the RBD bound to the most promising compounds. The best-scored models of the best leads in complex with the RBD of S protein were chosen as starting coordinates for 100 ns all-atom MD simulations using Desmond, as implemented in the Schrödinger package (Schrödinger Release 2021-1: Desmond MD System, D. E. Shaw Research, New York, NY, 2021). The ligand–protein complexes were enclosed in a TIP3P water box extending 10 Å beyond the complex's atoms, and counter ions (Na+) were added to neutralize charges. The MD was performed in the NPT ensemble at a temperature of 300 K and a pressure of 1 bar. Simulations were run with the OPLS4 force field. After MD simulations, the root-mean-square deviation (RMSD), the root-mean-square fluctuation (RMSF), and the radius of gyration (Rg) were evaluated to understand the relative stabilities of the complexes using the Desmond simulation interaction diagram tool of Maestro. In addition, the protein–ligand interactions were monitored and recorded during the entire period of the simulation. Snapshots at 0, 20, 40, 60, 80, and 100 ns were used to determine the binding free energy of the protein–ligand complexes with the Prime application available in the Schrödinger software package. The MM/GBSA method was employed to estimate the free energy of ligand binding to the protein.
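The end-point bookkeeping behind an MM/GBSA estimate can be sketched as follows. The snippet assumes the usual definition, in which the binding free energy is the energy of the complex minus the energies of the free receptor and the free ligand, and averages it over the six snapshot times listed above. It does not reproduce what Prime computes internally, and the function names are hypothetical.

```python
from statistics import mean, stdev

SNAPSHOT_TIMES_NS = (0, 20, 40, 60, 80, 100)


def binding_free_energy(g_complex, g_receptor, g_ligand):
    """dG_bind = G_complex - (G_receptor + G_ligand), all in kcal/mol."""
    return g_complex - (g_receptor + g_ligand)


def summarize(per_snapshot):
    """Mean and sample standard deviation of dG_bind over the snapshots.

    per_snapshot maps a time in ns to (G_complex, G_receptor, G_ligand).
    """
    values = [binding_free_energy(*per_snapshot[t]) for t in SNAPSHOT_TIMES_NS]
    return mean(values), stdev(values)
```

Reporting the spread together with the mean helps to show whether a favourable average is driven by a single snapshot.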

Network pharmacology analysis

The identification of protein targets for four phytochemicals and their network pharmacology analysis was executed using the STITCH platform [45] (http://stitch.embl.de/). Moreover, these compounds' biological processes and molecular functions were also analysed using STITCH.

Results and discussion

MM/GBSA analysis of SARS-CoV-2 RBD and ACE2R complex

Per-residue energy decomposition of the binding free energy of the SARS-CoV-2 RBD and ACE2R complex via MM/GBSA analysis was employed to identify the key residues involved in the interaction. The overall binding free energy of the complex was found to be -59.07 kcal/mol. The per-residue free energy contribution of the RBD amino acids to the overall binding free energy of the complex is shown in Fig. 1. Among the 209 amino acid residues, 21 were observed to have a binding free energy of ≤ − 1.00 kcal/mol and were found to be essential in forming the RBD and ACE2R complex. In particular, Tyr 505, Gln 493, Lys 417, Phe 486, and Leu 455 were the key residues essential for the interaction, with comparatively lower binding free energies of − 4.76, − 3.91, − 3.90, − 3.34, and − 3.13 kcal/mol, respectively. Similar to our observations, an earlier study by Lan et al. identified Tyr 449, Tyr 453, Leu 455, Phe 456, Phe 486, Asn 487, Tyr 489, Gln 493, Gly 496, Gln 498, Thr 500, Asn 501, Gly 502, and Tyr 505 as essential for the formation of a stable RBD–ACE2R complex [46]. Recently, the binding affinity of the omicron variant spike protein for the human ACE2R was studied using the MM/GBSA method [47]. It was reported that the omicron RBD residues Gly 476, Arg 493, and Tyr 501 contributed to the stable interaction between RBD and ACE2R with binding energies of − 2.61, − 4.38, and − 5.49 kcal/mol, respectively. In addition, two critical electrostatic interaction pairs (Arg 493-Asp 30/Glu 35 and Arg 498-Asp 38) were found between omicron RBD and human ACE2R. The residues identified and reported in these earlier studies are in good agreement with our MM/GBSA per-residue decomposition of the free energy. Hence, it is hypothesized that compounds capable of establishing strong interactions with these residues might serve as better inhibitors of RBD binding to ACE2R.

Fig. 1

Per-residue free energy contributions of amino acids in the interface of SARS-CoV-2 RBD–ACE2R complex. The figure represents the amino acids with a binding free energy value below -1.00 kcal/mol as calculated by the HawkDock server
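The selection of key residues by a per-residue cut-off can be sketched as below. The five energies are those quoted in the text for the native RBD; the sixth entry is a hypothetical weak contributor added only to show the filter at work, and the function name is illustrative.

```python
CUTOFF_KCAL_PER_MOL = -1.00

# Per-residue contributions in kcal/mol.
contributions = {
    "Tyr 505": -4.76,
    "Gln 493": -3.91,
    "Lys 417": -3.90,
    "Phe 486": -3.34,
    "Leu 455": -3.13,
    "Hypothetical residue": -0.20,  # illustrative value, not from the study
}


def key_residues(per_residue, cutoff=CUTOFF_KCAL_PER_MOL):
    """Residues at or below the cut-off, most favourable first."""
    selected = [(name, e) for name, e in per_residue.items() if e <= cutoff]
    return sorted(selected, key=lambda item: item[1])


for name, energy in key_residues(contributions):
    print(f"{name}: {energy:.2f} kcal/mol")
```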

Identification of potential blockers of RBD through HTVS

The HTVS of the phytochemicals listed in the IMPPAT database was performed using the PyRx tool to identify prospective SARS-CoV-2 RBD binders. The docking protocol and grid setup were validated by docking the reference compounds Ceftazidime and Erythrosine B, which were already reported to inhibit the SARS-CoV-2 RBD–ACE2R interaction [48, 49]. Ceftazidime and Erythrosine B had binding energies of − 6.82 and − 7.53 kcal/mol, respectively. Ceftazidime forms hydrogen bonds with Glu 406, Lys 417, Tyr 453, and Gln 493 of the spike RBD, whereas Erythrosine B bonds with Arg 403 and Gly 496. The interaction details of these reference compounds are presented in supplementary Fig. 1. Since a significant percentage of the screened phytochemicals had greater binding energies than the reference compounds, a threshold value was chosen to filter potential compounds. Sequential steps involved in the VS workflow are represented in Fig. 2. Out of 9596 phytochemicals available in the database, 1578 compounds that passed the druggability filters were selected for HTVS. Based on the binding energy, the top 30 compounds were chosen for re-docking analysis against the RBD in native and VOCs: alpha, beta, gamma, delta, and omicron. The binding energy and ligand efficiency of the top 30 compounds obtained from the re-docking analysis are given in supplementary Table 1. To prioritize the compounds further, threshold values of ≤ -6.5 kcal/mol and ≤ -0.23 were set for the binding energy and ligand efficiency, respectively. More details of the compounds shortlisted based on the threshold values are listed in Tables 1 and 2. From Table 1, compounds CID 44562999, CID 44558930, CID 23266161, CID 101281365, CID 25090669, CID 10665247, CID 12877769, CID 443458, CID 3086653, CS 10308017, and CID 11725426 were listed as potential binders to RBD in native and VOCs based on the binding energy.
Likewise, compounds CID 44562999, CID 443458, CID 10665247, CID 443464, CID 104860, CID 16745513, CASID 35214-68-7, CID 190821, and CID 11725426 were found to be effective against RBD in native and VOCs based on the ligand efficiency (Table 2). Upon comparing the compounds prioritized based on the binding energy and ligand efficiency, only four compounds, namely CID 10665247, CID 11725426, CID 443458, and CID 44562999, were found to overlap (Fig. 3) and were designated as prospective compounds. Their corresponding 2D structures are given in Fig. 4. The 3D localization of the four compounds docked onto the RBD of native and VOCs shows that these compounds bind preferentially at the interface of the SARS-CoV-2 RBD–ACE2R complex (Fig. 5).
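The two-criterion prioritization can be sketched as a pair of threshold filters followed by a set intersection. The values for CID 44562999 are the ones reported in this article for the six RBD forms; the second compound, `hypothetical-A`, and its scores are invented purely to show a compound that passes one criterion but not the other. The sketch assumes a compound must meet the threshold against every RBD form.

```python
BINDING_ENERGY_CUTOFF = -6.5      # kcal/mol
LIGAND_EFFICIENCY_CUTOFF = -0.23

VARIANTS = ("native", "alpha", "beta", "gamma", "delta", "omicron")


def passes_everywhere(scores, cutoff):
    """True when the score is at or below the cut-off for every RBD form."""
    return all(scores[v] <= cutoff for v in VARIANTS)


binding_energy = {
    "CID 44562999": dict(zip(VARIANTS, (-10.84, -8.56, -10.09, -10.22, -9.39, -9.59))),
    "hypothetical-A": dict(zip(VARIANTS, (-7.9, -7.1, -7.4, -7.0, -6.8, -6.9))),
}
ligand_efficiency = {
    "CID 44562999": dict(zip(VARIANTS, (-0.32, -0.25, -0.30, -0.30, -0.28, -0.28))),
    "hypothetical-A": dict(zip(VARIANTS, (-0.21, -0.19, -0.20, -0.19, -0.18, -0.18))),
}

by_energy = {c for c, s in binding_energy.items()
             if passes_everywhere(s, BINDING_ENERGY_CUTOFF)}
by_efficiency = {c for c, s in ligand_efficiency.items()
                 if passes_everywhere(s, LIGAND_EFFICIENCY_CUTOFF)}

# Compounds that satisfy both criteria against all RBD forms.
prospective = by_energy & by_efficiency
print(sorted(prospective))
```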

Fig. 2

Overview of sequential steps in the virtual screening workflow. The flowchart explains the sequential filtering of ligands in each step leading to the successful identification of four promising compounds as SARS-CoV-2 RBD blockers

Table 1 List of compounds shortlisted based on the binding energy as potential binders against native and different SARS-CoV-2 variants
Table 2 List of compounds shortlisted based on the ligand efficiency as potential binders against native and different SARS-CoV-2 variants
Fig. 3

Prioritization of prospective candidate molecules. Venn diagram showing the four overlapping compounds (purple area) that could be promising candidates to block the RBD of S protein in native and all variants of SARS-CoV-2. Compounds shared between all forms of RBD while ranking based on binding energy and ligand efficiency are illustrated in pink and blue areas, respectively.

Fig. 4

A 2D representation of the chemical structures of the four promising candidates. The common name of the phytochemical is given in parentheses

Fig. 5

Location of four potential compounds in the SARS-CoV-2 Spike RBD–ACE2R complex interface. Phytochemicals identified as a result of virtual screening are seen to bind at the interface region of the Spike RBD–ACE2R complex. The figure represents the binding conformation of each ligand in the RBD of native and five VOCs

Interaction analysis of four overlapping compounds with RBD in native and VOCs

The sequences and structures of RBD in native and VOCs were analysed and presented in supplementary Fig. 2. The backbone RMSD analysis revealed that the deviation between the native and VOCs RBD was < 0.5 Å. This, in turn, implies that the mutations within the RBD do not induce any significant structural alterations. However, changes were observed in the accessible surface area of mutated amino acids along the interface, which might increase the strength of the interaction network between ACE2R and RBD.
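For reference, the backbone RMSD used in such a comparison can be sketched as below. The snippet assumes the two structures have already been superimposed and that their backbone atoms are listed in the same order; it does not perform the optimal superposition itself, and the coordinates in the usage line are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-aligned coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates in angstroms: every atom shifted by 0.3 along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 1.0, 0.0)]
shifted = [(x + 0.3, y, z) for x, y, z in reference]
print(round(rmsd(reference, shifted), 3))
```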

The intermolecular interaction analysis of RBD protein complexed with the four prospective compounds was assessed for their ability to block the binding of SARS-CoV-2 RBD to ACE2R. Table 3 shows the amino acid residues in native and VOCs interacting with the four compounds through hydrogen bonding and hydrophobic interactions. The residues highlighted in Table 3 are the amino acids that majorly contribute to the binding free energy of the SARS-CoV-2 RBD–ACE2R complex determined through MM/GBSA analysis (Fig. 1). Interestingly, our docking analysis identified at least two or more such residues interacting with the compounds in each complex. The most prominent residues involved in ligand interaction were Arg 403, Tyr 489, Gln 493, Gly 496, Gln 498, Thr 500, Asn 501, Gly 502, and Tyr 505. In some complexes, the interactions were contributed by Lys 417, Leu 455, and Phe 456. Other critical amino acids involved in protein–ligand interactions were Tyr 453, Ser 494, Tyr 495, and Phe 497. The individual interaction map of all four phytochemicals with the RBD of native and VOCs is given in supplementary Fig. 3. Supplementary Fig. 4 shows the binding orientation of the four ligands in the RBD. The observations emphasize that the identified molecules can potentially disrupt SARS-CoV-2 RBD binding to ACE2R. Among the four molecules, CID 44562999 exhibited lower binding energy with better ligand efficiency against the RBD of native and VOCs. The binding energies (kcal/mol) of CID 44562999 docked against the RBD of native, alpha, beta, gamma, delta, and omicron were found to be − 10.84, − 8.56, − 10.09, − 10.22, − 9.39, and − 9.59, respectively (Table 1), and the ligand efficiency of CID 44562999 was − 0.32, − 0.25, − 0.30, − 0.30, − 0.28, and − 0.28 respectively (Table 2). Hence, the stability and residual interaction of CID 44562999 with the RBDs were probed over a 100-ns MD simulation.

Table 3 Interaction analysis of four potential compounds with the RBD of native and SARS-CoV-2 variants

Dynamics of RBD-CID 44562999 complexes

The stability of RBD-CID 44562999 complexes was evaluated through MD simulation. Parameters such as protein backbone RMSD, ligand RMSD, amino acid Cα RMSF, and residual interaction associated with the protein–ligand complex were analysed. Figures 6, 7, 8, 9, 10, and 11 illustrate the RMSD, RMSF, and various intermolecular interactions formed by CID 44562999, including hydrogen bonds, hydrophobic contacts, and water bridges, with the various RBD forms. A timeline representation of the interactions and contacts is also included in Figs. 6, 7, 8, 9, 10, and 11. Detailed 2D plots displaying the ligand atom interactions with the protein residues are provided in supplementary Fig. 5. Representative conformations of CID 44562999 bound to the RBD extracted from MD simulation trajectories are presented in supplementary Fig. 6.
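The idea of an interaction being maintained for a given percentage of the simulation time can be sketched as a contact occupancy: the fraction of trajectory frames in which a residue contacts the ligand. The snippet assumes the contacts per frame have already been extracted into sets of residue names; the toy trajectory and function names are hypothetical and do not reproduce the Desmond analysis.

```python
from collections import Counter


def contact_occupancy(frames):
    """Fraction of frames in which each residue contacts the ligand."""
    if not frames:
        raise ValueError("at least one frame is required")
    counts = Counter(residue for frame in frames for residue in frame)
    return {residue: n / len(frames) for residue, n in counts.items()}


def persistent_contacts(frames, min_fraction=0.5):
    """Residues whose occupancy exceeds min_fraction, sorted by name."""
    occupancy = contact_occupancy(frames)
    return sorted(r for r, f in occupancy.items() if f > min_fraction)


# Hypothetical 10-frame trajectory, for illustration only.
toy_frames = [{"Gln 498", "Asn 501"}] * 8 + [{"Asn 501", "Asp 405"}] * 2
print(contact_occupancy(toy_frames))
print(persistent_contacts(toy_frames))
```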

Fig. 6

Various measures of MD simulation of native RBD-CID 44562999 complex. The plots show RMSD (A), RMSF (B), the fraction of amino acid interactions (C), and the preservation of intermolecular interactions in the complex during MD simulation (D). Data were generated by analysing the trajectories after simulating the protein–ligand complex for 100 ns

Stability and preservation of contacts in the native RBD-CID 44562999 complex

The RMSD plot (Fig. 6A) shows that the protein and ligand RMSDs were consistent throughout the simulation, without much deviation, signifying the integrity of the native RBD-CID 44562999 complex. The Cα RMSF plot (Fig. 6B) shows higher fluctuations in the loop regions (440–450 and 474–490) involved in the binding of RBD with ACE2R. The intermolecular interactions include Gly 496, Asn 501, Gly 502, Tyr 505, Gln 498, Tyr 495, Phe 497, Arg 403, and Thr 500 (Table 3). Figure 6C shows that Gly 496, Phe 497, Gln 498, Asn 501, and Gly 502 contribute majorly to the hydrogen bonding, with Phe 497 and Tyr 505 contributing to the hydrophobic contacts. The interactions with Gln 498, Asn 501, and Gly 502 are maintained for about 80% of the simulation time. Further, from Fig. 6D, it is evident that most of the interactions mentioned above, especially the contacts involving the residues Gln 498, Asn 501, Gly 502, and Tyr 505, existed throughout the entire simulation time, rendering stability to the protein–ligand complex. A switch in the residual interaction was observed at 90 ns. The ligand's contact with Phe 497 was broken, and contacts with Asp 405 and Gly 504 were formed. However, this does not impart significant changes in the protein–ligand complex, as observed from the other plots.

Stability and preservation of contacts in the alpha RBD-CID 44562999 complex

The RMSD plot (Fig. 7A) shows that the protein was structurally stable throughout the simulation with no significant deviations, while a large shift in the ligand RMSD was observed around 30 ns. The Cα RMSF analysis shows that the residues (465–490) associated with the loop region had higher fluctuation than in native (Fig. 7B), while the changes observed in the other areas were similar. The interactions observed while analysing the docked complex (Table 3) were Glu 484, Phe 490, Gln 493, Ser 494, Gly 485, Tyr 489, Leu 455, and Leu 492. Of these interactions, hydrogen bonding by Phe 490, Gln 493, and Ser 494, hydrophobic contact by Tyr 489, and water bridges by Glu 484, Tyr 489, Phe 490, Gln 493, and Ser 494 appear to contribute most to the complex formation (Fig. 7C). As mentioned earlier, the ligand RMSD shifted around 30 ns during the simulation (Fig. 7A). In concordance with this, a shift in the residual interaction was also noted around the same time, as clearly reflected in Fig. 7D. Notably, Gly 485, Phe 486, Cys 488, and Ser 494 contributed to the interactions before 30 ns, and later, the residues involved switched to Phe 456, Tyr 473, and Glu 484. This indicates that the ligand reorients itself to a better position in the protein to enable the formation of a stable complex. Besides this, interactions with residues Glu 484, Tyr 489, Phe 490, and Gln 493 were maintained for more than 50% of the simulation time.

Fig. 7

Various measures of MD simulation of alpha RBD-CID 44562999 complex. The plots show RMSD (A), RMSF (B), the fraction of amino acid interactions (C), and the preservation of intermolecular interactions in the complex during MD simulation (D). Data were generated by analysing the trajectories after simulating the protein–ligand complex for 100 ns

Stability and preservation of contacts in the beta RBD-CID 44562999 complex

The protein–ligand RMSD plot (Fig. 8A) shows the overall structural integrity of the beta RBD-CID 44562999 complex. Similar to the Cα RMSF in alpha RBD, only the residues (470–490) in one loop region had a higher fluctuation than the rest of the RBD structure (Fig. 8B). About 11 residues of beta RBD were implicated in interactions with the ligand: hydrogen bonding involved Arg 403, Tyr 453, Gln 493, and Ser 494 residues, and hydrophobic contact was contributed by Tyr 449, Tyr 495, Gly 496, Phe 497, Gln 498, Tyr 501, and Tyr 505 residues (Table 3). Among these, contacts involving Arg 403, Tyr 453, Ser 494, Gly 496, Tyr 501, and Tyr 505 residues persisted for more than 50% of the simulation time (Fig. 8C). At this point, it is essential to highlight that Tyr 501 is the mutated version of Asn 501 in the native RBD, and it is seen to contribute to all three types of interactions with the ligand. The same was observed in Fig. 8D, except that Tyr 453 was involved in the interactions mainly in the earlier phase of the simulation (before 40 ns). It is noteworthy that interactions with Arg 403, Leu 455, and Tyr 495 set in only in the later phase of the simulation, i.e. after the weakening of the Tyr 453 contact. This cannot be considered a ligand shift as in the case of the alpha complex, as the ligand RMSD in the beta complex showed no significant deviations. Rather, the new interactions may impart better stabilization of the complex.

Fig. 8

Various measures of MD simulation of beta RBD-CID 44562999 complex. The plots show RMSD (A), RMSF (B), the fraction of amino acid interactions (C), and the preservation of intermolecular interactions in the complex during MD simulation (D). Data were generated by analysing the trajectories after simulating the protein–ligand complex for 100 ns

Dynamics and preservation of contacts in the gamma RBD-CID 44562999 complex

The RMSD plot (Fig. 9A) demonstrates the structural stability of the gamma RBD-CID 44562999 complex throughout the MD simulation. In the Cα RMSF graph in Fig. 9B, the loop regions (470–490) were found to have more significant fluctuations. Table 3 shows that Ser 494, Gly 496, and Tyr 501 were involved in hydrogen bonding, and Arg 403, Tyr 449, Gln 493, Tyr 495, Phe 497, and Tyr 505 were involved in hydrophobic contacts. Figures 9C and 9D show that Ser 494 has multiple interactions with the same ligand atom, which are strong and evident throughout the simulation. Apart from this, ligand contacts of Tyr 453 and Tyr 501 residues were consistent. Interactions of the ligand with Arg 403, Gln 493, Phe 497, and Tyr 505 were considered weaker, as they appeared only at specific time points during the simulation. As in beta, the Asn to Tyr 501 mutation is also present in the gamma variant and contributes to different ligand interactions.

Fig. 9

Various measures of MD simulation of gamma RBD-CID 44562999 complex. The plots show RMSD (A), RMSF (B), the fraction of amino acid interactions (C), and the preservation of intermolecular interactions in the complex during MD simulation (D). Data were generated by analysing the trajectories after simulating the protein–ligand complex for 100 ns

Dynamics and preservation of contacts in the delta RBD-CID 44562999 complex

The retention of the structural integrity of the delta RBD-CID 44562999 complex is evident from the RMSD plot in Fig. 10A. The RMSF plot establishes the regions with higher fluctuation as loop residues between 495 and 490 (Fig. 10B). The RBD residues involved in ligand interaction, as obtained from analysing the docked complex, are as follows: hydrogen bonding of Ser 494 and hydrophobic contacts imparted by Arg 403, Tyr 449, Tyr 453, Phe 490, Leu 492, Gln 493, Tyr 495, Gly 496, and Tyr 505 (Table 3). In contrast to this, Fig. 10C demonstrates the involvement of several residues in hydrogen bonding, particularly Lys 417, Tyr 453, Gln 493, and Ser 494. However, the contacts were prominent only at specific frames of the simulation (Fig. 10D). None of the interactions existed throughout the simulation, yet ligand contacts with residues Tyr 473, Gln 493, and Ser 494 were maintained for around 40–50% of the simulation time. This suggests that the stability of the delta RBD–ligand complex is a collective influence of various residues.

Fig. 10

Various measures of MD simulation of delta RBD-CID 44562999 complex. The plots show RMSD (A), RMSF (B), the fraction of amino acid interactions (C), and the preservation of intermolecular interactions in the complex during MD simulation (D). Data were generated by analysing the trajectories after simulating the protein–ligand complex for 100 ns

Dynamics and preservation of contacts in the omicron RBD-CID 44562999 complex

The protein backbone RMSD of the omicron RBD-CID 44562999 complex showed a higher deviation of up to ~ 4 Å between 60 and 80 ns of the simulation, after which the RMSD returned to ~ 3 Å and remained stable for the rest of the simulation (Fig. 11A). Likewise, a significant shift in the ligand RMSD was observed around 85 ns. This behaviour is best understood by comparing the RMSF plots of omicron (Fig. 11B) and native (Fig. 6B). In the omicron–ligand complex, the region containing residues 455–490 shows higher fluctuations than in native. Notably, this region lies in the SARS-CoV-2 RBD–ACE2R interface, which could be the reason behind the higher fluctuations. We suppose this could also be one explanation for the faster spread of the omicron variant compared with its counterparts [50], which needs to be confirmed by deeper investigations. Leu 492 and Arg 493 were involved in hydrogen bonding; Leu 452, Leu 455, Phe 456, Tyr 489, Phe 490, and Ser 494 contributed to hydrophobic interactions in the docked omicron RBD–ligand complex (Table 3). Figure 11C demonstrates the involvement of Ala 484, Tyr 489, Phe 490, Arg 493, and Ser 494 in forming the protein–ligand complex. Notably, Arg 493 in the omicron variant is a mutated version of Gln 493 in native. However, only Ala 484 and Ser 494 established consistent ligand contacts throughout the simulation (Fig. 11D). Ligand interactions with Tyr 489, Phe 490, and Arg 493 were almost lost at 85 ns, and new contacts with Tyr 449 appeared. This might well explain the shift in the ligand RMSD plot at the same time point (Fig. 11A). Another reason for the switch of interactions might be the larger fluctuation of residues in the ligand-binding region, as discussed earlier (Fig. 11B).

Fig. 11

Various measures of MD simulation of omicron RBD-CID 44562999 complex. The plots show RMSD (A), RMSF (B), the fraction of amino acid interactions (C), and the preservation of intermolecular interactions in the complex during MD simulation (D). Data were generated by analysing the trajectories after simulating the protein–ligand complex for 100 ns

Binding free energy using MM/GBSA approach

To investigate the binding affinity of CID 44562999, the MM/GBSA approach was used to calculate the binding free energy (∆Gbind) from the entire MD simulation trajectory. Figure 12 shows the binding free energy of CID 44562999 for each RBD form plotted against the frame number. The average MM/GBSA-based binding free energy (kcal/mol) of CID 44562999 was calculated to be −40.8 ± 5.90 (native), −49.6 ± 4.72 (alpha), −49.1 ± 6.67 (beta), −44.9 ± 7.47 (gamma), −46.1 ± 5.94 (delta), and −42.3 ± 14.27 (omicron). The average ∆Gbind of CID 44562999 thus fell within approximately −40 to −50 kcal/mol in all RBD forms, suggesting the effectiveness of the ligand against the RBD in the native form as well as in the variants of SARS-CoV-2. However, as seen in the figure, the ∆Gbind becomes drastically less negative after 80 ns in the case of omicron and reaches −14.72 kcal/mol at the end of the simulation, indicating that the ligand has migrated away from the protein, which corroborates the loss of protein contacts during that period of the simulation (Fig. 11D).
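
The averages and spreads reported above, and the late weakening seen for omicron, can be illustrated with a short sketch. It assumes that per-frame ∆Gbind values are available as a plain list and that the reported spread is a sample standard deviation; neither assumption is stated in the text. The function names and the toy values are hypothetical and do not reproduce the data in Fig. 12.

```python
from statistics import mean, stdev

def summarize_dg(dg_values):
    """Mean and sample standard deviation of per-frame binding energies."""
    return mean(dg_values), stdev(dg_values)

def late_drift(dg_values, split=0.8):
    """Mean of the final part of the trajectory minus the mean of the rest.

    split: fraction of frames treated as the 'early' part (assumed 0.8,
    mirroring the 80 ns mark of a 100 ns run). A large positive value
    means binding became less favourable late in the simulation.
    """
    cut = int(len(dg_values) * split)
    return mean(dg_values[cut:]) - mean(dg_values[:cut])

# Hypothetical per-frame values in kcal/mol (illustration only)
toy_dg = [-45.0, -44.0, -46.0, -45.0, -44.0, -46.0, -45.0, -45.0, -25.0, -15.0]

avg, sd = summarize_dg(toy_dg)
drift = late_drift(toy_dg)
print(f"mean = {avg:.1f}, sd = {sd:.2f}, late drift = {drift:.1f} kcal/mol")
```

In this toy series, a stable early phase followed by a sharp loss of binding energy yields an average that still looks favourable but carries a large standard deviation, which is the same pattern as the −42.3 ± 14.27 kcal/mol reported for omicron.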

Fig. 12

MM/GBSA binding free energy analysis. The plot shows MM/GBSA binding free energy fluctuations as a function of simulation time for each complex of CID 44562999 with the RBD of the spike protein in all forms of SARS-CoV-2, plotted at 20-ns intervals

Critical residues in SARS-CoV-2 RBD contributing to the complex formation

By analysing the binding conformations, we found that the binding sites of these four phytochemicals were strictly confined in the native, beta, and gamma forms and distributed over two subregions in the alpha, delta, and omicron variants. The common residues involved in hydrogen bonding and hydrophobic interaction were Arg 403, Tyr 453, Gln 493, Ser 494, Tyr 495, Gly 496, Phe 497, Tyr 501, and Tyr 505. In addition, a series of hydrophobic contacts were noted with residues such as Lys 417, Tyr 449, Leu 455, Phe 490, and Leu 492. Furthermore, the reference molecules also showed interactions with Arg 403, Glu 406, Lys 417, Tyr 453, Leu 455, Phe 456, Gln 493, Ser 494, Tyr 495, Gly 496, Phe 497, Asn 501, and Tyr 505. Various research groups have analysed the interaction of the S protein RBD from SARS-CoV-2 (native and variants) with ACE2 to identify the hotspot residues and to study the effect of polymorphism on the complex interaction [46, 51,52,53]. A study by Jawad et al. in 2021 confirmed that Lys 417, Phe 456, Phe 486, Gln 493, Gln 498, and Asn 501 in the RBD are the critical residues involved in the binding of SARS-CoV-2 to ACE2R [54]. The study also revealed that Gly 446, Tyr 449, Asn 487, Gln 493, Gly 496, Thr 500, and Gly 502 in the RBD pair with residues in ACE2R via specific hydrogen bonds. Another study by Yi et al. in 2020 demonstrated that six single amino acid substitutions in the SARS-CoV-2 RBD, namely at Arg 439, Lys 452, Thr 470, Glu 484, Gln 498, and Asn 501, resulted in a loss of favourable interactions with ACE2, designating them as critical residues that interact with ACE2 [55]. Likewise, in a simulation analysis by Veeramachaneni et al., residues Gln 493 and Gln 498 showed hydrogen bonding for > 90% of the simulation time [56]. Other residues such as Lys 417, Tyr 453, Ala 475, Asn 487, Tyr 489, Thr 500, Asn 501, and Tyr 505 were reported to contribute to over 50% of the hydrogen bonding interaction. 
Table 4 shows the critical residues of the RBD that interact with ACE2R and with the identified potential phytochemicals. The table shows that RBD residues such as Tyr 449, Tyr 453, Gln/Arg 493, Gln 498, Thr 500, Tyr 501, and Tyr 505, in native and all variants, commonly interact with both ACE2R and the four phytochemicals. These residues overlap with the significant residues already established in the studies mentioned above. Hence, based on our predictions and considering the previously reported panel of critical residues, these phytochemicals are predicted to interact competitively with the hotspot residues required for the formation of the SARS-CoV-2 RBD–ACE2R complex. Thus, these molecules are expected to restrict the interaction between RBD and ACE2R.
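
The overlap argument above amounts to a set intersection over residue positions. The sketch below shows this under stated assumptions: it uses only the common ligand-contact residues quoted in this section and the ACE2R-interacting residues attributed to Jawad et al. [54], and it compares positions rather than residue names so that substitutions such as Asn/Tyr 501 still match. It is therefore not a reproduction of Table 4, which draws on more studies and variant-specific contacts, and the variable names are illustrative.

```python
# Positions of RBD residues contacted by the phytochemicals (from the text):
# common hydrogen-bonding/hydrophobic residues plus extra hydrophobic contacts
ligand_positions = {
    403, 453, 493, 494, 495, 496, 497, 501, 505,  # common residues
    417, 449, 455, 490, 492,                      # additional hydrophobic contacts
}

# Positions reported by Jawad et al. [54] as critical for ACE2R binding
# or as forming specific hydrogen bonds with ACE2R (from the text)
ace2_positions = {
    417, 456, 486, 493, 498, 501,        # critical residues
    446, 449, 487, 496, 500, 502,        # hydrogen-bonding residues (493 repeats)
}

# Positions contacted by both the ligands and ACE2R
shared_positions = sorted(ligand_positions & ace2_positions)
print(shared_positions)  # [417, 449, 493, 496, 501]
```

Even with this restricted input, five positions are shared between the ligand contacts and the ACE2R interface, which is the kind of overlap that motivates the competitive-binding interpretation.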

Table 4 Key interacting residues of SARS-CoV-2 RBD with ACE2R and four potential compounds in native and SARS-CoV-2 variants

The potential of phytochemicals in COVID-19 therapeutics

Table 5 shows the name, class, source, and other properties of each phytochemical. Table 6 presents the physicochemical properties and predicted toxicity profiles of the four identified phytochemicals. The table shows that the compounds were predicted to be safe and can be taken forward as therapeutic molecules or potential leads. The compound CID 44562999 is withanolide F, isolated from Ashwagandha (Withania somnifera L.); CID 10665247 is Orobanchol, a constituent of Flax (Linum usitatissimum L.); CID 11725426 is Serotobenine (Moschamindole), identified from Safflower (Carthamus tinctorius L.); and CID 443458 is Gibberellin A51, obtained from the Pea plant (Pisum sativum L.). For the first time, this study reports the predicted anti-viral activity of Serotobenine, Orobanchol, and Gibberellin A51 against SARS-CoV-2 through targeting of the S protein. Network pharmacology analysis (Fig. 13) revealed that Orobanchol interacts with seven proteins, mainly cytochrome P450 family proteins, which may account for its additional pharmacological effects. For the other phytochemicals, no reported targets existed in the databases.

Table 5 Synonyms and properties of four prospective phytochemicals identified as RBD blockers
Table 6 Physicochemical property and toxicity profile of four prospective phytochemicals identified as RBD blockers
Fig. 13

Network pharmacology presentation of CID 10665247 (Orobanchol) and targets. The network view summarizes the predicted associations for the compound with a particular group of proteins. In the network, nodes represent proteins, and edges are predicted functional associations. CYP3A4 cytochrome P450 (family 3A3/3A4/3A7/3A43); DAD1 defender against cell death 1; SLC29A2 Solute carrier family 29 member 2; TBXAS1 thromboxane A synthase 1

The therapeutic and medicinal potential of the plants mentioned above is well established. In particular, Ashwagandha is renowned for its anti-viral, immunomodulatory, anti-inflammatory, and anti-oxidant properties. The phytoconstituents of Flax were described to inhibit COVID-19 pathogenesis via SARS-3CL protease inhibition [57]. Besides, dietary flaxseed was recommended to strengthen the immune system during the COVID-19 pandemic and prevent comorbidity-related health risks [58]. Similarly, Gibberellin A51 was reported to possess inhibitory activity against SARS-CoV-2 helicase [59]. The phytoconstituents of Ashwagandha, chiefly withanolides, have displayed promising potential in managing COVID-19 by ameliorating its severity. Withanolides are a class of steroidal lactones with a broad range of reported therapeutic activities, including immunomodulatory, anti-inflammatory, antitumor, anti-arthritic, anti-angiogenic, anti-oxidant, anti-cholinesterase, and anti-bacterial activities [57, 60, 61]. Recently, withanolide-based phytochemicals have been tested for their inhibitory activity against SARS-CoV-2. The ability of withanolides to bind to the spike protein has already been demonstrated for withaferin A, withanoside V, withanoside X, and withanolide M [60,61,62,63]. Withanone was shown to act at the ACE2–RBD interface and to energetically destabilize the complex [62]. Balkrishna et al. (2020) demonstrated that withanone efficiently inhibited the interaction between ACE2R and RBD in vitro, with an IC50 of 0.33 ng/mL [62]. Two other compounds structurally close to withanolide F, peimine and artemisone, also exhibited similar activities [64, 65]. In a study by Wang et al., peimine inhibited infection by SARS-CoV-2 variants in furin-overexpressing cells, with an IC50 of 0.4 μM against the wild-type, alpha, and beta variants. 
In addition, a time-resolved FRET assay showed that peimine blocks the binding between human ACE2R and RBD [64]. An ELISA-based test showed that artemisone inhibited the binding of RBD to ACE2R in a dose-dependent manner and presented a KD value of 0.36 μM in bio-layer interferometry experiments in vitro [65]. Table 7 shows the structures of compounds that are similar to withanolide F and their related activities. Other studies found that, among all withanolides, withaferin A, withanone, withanoside IV, and withanoside V significantly inhibited ACE2 expression. Besides this, withanolides were also shown to inhibit ACE2R and SARS-CoV-2 proteases [61, 66, 67]. Therefore, withanolide F, a new withanolide identified in the current study, would be a promising candidate to inhibit the interaction between the S protein RBD and ACE2R. The highlight of this study is that all four identified compounds were shown in silico to bind the S protein RBD in native and VOCs with better ligand efficiency. However, the identified compounds might need further optimization and evaluation in vitro and in vivo to ascertain their precise therapeutic value. Accordingly, this study recommends prescribing phytochemicals, especially withanolide F, along with other drugs currently in practice for treating or managing COVID-19. Deeper investigations are warranted to understand the exact mechanism of action of these phytochemicals. Nevertheless, considering their current therapeutic profile, these phytochemicals could be advocated safely as a dietary supplement or as phytotherapy to alleviate the symptoms of COVID-19.

Table 7 Structure and activity details of withanolide F and similar compounds

Conclusion

The severe COVID-19 pandemic continues unabated, causing significant loss of life worldwide owing to the unavailability of effective drugs. It is worrisome that the infection remains active at an alarming rate, even almost 2 years after its outbreak. It is even more distressing that, despite the availability of sophisticated technologies, no effective chemical entity is available to prevent disease pathogenesis. With the emergence of VOCs, identifying molecules to manage and treat COVID-19, irrespective of the variant, is an absolute necessity. In this study, an extensive computational screening protocol was adopted to filter the phytochemical database of Indian medicinal plants, followed by 100 ns MD simulations of the docked complexes to corroborate the structural integrity of the protein–ligand complexes.

Consequently, this study prioritizes withanolide F, Orobanchol, Serotobenine, and Gibberellin A51 as having a robust binding affinity profile with all currently identified forms of the RBD. Therefore, further experimental validation of these molecules is recommended to establish their therapeutic efficacy and assess their suitability for COVID-19 therapy. Phytochemicals from Indian medicinal plants are not new to therapeutics and have been used in traditional medicine since prehistoric times, even before the establishment of modern medicine. It is becoming increasingly apparent, and not surprising, that phytochemicals are potential agents for COVID-19 treatment. Moreover, this study adds to the existing knowledge on the therapeutic value of phytochemicals, which might help fight the rapidly evolving SARS-CoV-2 virus by targeting the S protein RBD.