Introduction

The first case of coronavirus disease 2019 (COVID-19) occurred in December 2019 in the city of Wuhan, China. The causative novel coronavirus was found to be a severe acute respiratory syndrome virus, identified as SARS-CoV-2 [1,2,3]. The epidemic outbreak soon turned into a pandemic that had infected over 110 million people worldwide and caused over 2.4 million deaths as of February 2021 [4]. While several mRNA, subunit and vector vaccines are in large-scale clinical studies (Phase 3 and 4) or have been marketed worldwide with a significant success rate, it is important to understand that vaccines pose numerous challenges in production, distribution and administration. The majority of the proposed COVID-19 vaccines require follow-up doses. Additionally, SARS-CoV-2 has shown the capacity to mutate and render certain vaccines ineffective. These challenges may be overcome by the discovery of a potent antiviral compound. As a result, the past year has seen a surge in computer-aided drug design and discovery studies on COVID-19 antivirals using several docking strategies.

SARS-CoV-2 presents a wide variety of target proteins for ligand docking; one of the important targets with the potential to be inhibited by an antiviral molecule is the main protease (Mpro). Mpro, also called 3C-like protease (3CLpro), plays an essential role in the post-translational modification of the replicase polyproteins [5,6,7,8]. The replicase protein further catalyzes the processing of the viral proteins. The SARS-CoV-2 Mpro is 306 amino acids long and is highly similar in structure and sequence to the SARS-CoV 3CLpro [9]. A single monomer of Mpro comprises three domains, namely domain I, domain II and domain III [10]. The catalytic dyad of Cys145 and His41 forms the active site of the enzyme [11, 12]. Since the outbreak, several established drugs, such as the HIV drugs lopinavir and ritonavir, peptidomimetic α-ketoamides and other modified α-ketoamide inhibitors, have been docked and studied for their inhibitory activity towards Mpro [13,14,15,16,17]. The docking is often performed on the Mpro of α-coronaviruses and β-coronaviruses as well as the 3CLpro of enteroviruses. Among the drugs in trial, several antiviral phytochemicals are also under consideration, while numerous other flavonoids, glucosides, alkaloids and polyphenolic compounds are being docked on the SARS-CoV-2 Mpro for possible inhibitory activity, which might suggest new designs for therapeutic drugs [18,19,20].

Boerhavia diffusa Linn. is a medicinal plant of the Nyctaginaceae family. Its common English names are red spiderling and spreading hogweed. In India, its Sanskrit name is Varshabhu, although it is more commonly known as Punarnava. B. diffusa is a typical rainy-season weed found in India, North and South America and South East Africa. In the Ayurvedic system of medicine it is classified as a Rasayana herb. It is said to possess numerous health-promoting therapeutic properties: it is considered to be anti-aging, to strengthen life, to enhance brain power, to prevent disease and to re-establish youth. These properties point to a role in hepatoprotection and immunomodulation [21,22,23,24]. Recent studies, including clinical trials, have also reported its role as an anticancer [25,26,27,28], antidiabetic, antioxidant [29,30,31], anti-inflammatory [32,33,34] and antifibrinolytic agent and in diuresis [32, 35]. Moreover, B. diffusa is an essential component of numerous therapeutic formulations for conditions such as jaundice, rheumatism, nephrological diseases, asthma, inflammation, anemia, ascites and many gynecological disorders. In traditional medicine systems, it is mostly reported to treat kidney ailments, jaundice, dermatological conditions, eye ailments, wounds and inflammation. Various ethnopharmacological reports have also mentioned the role of B. diffusa in treating cancer and diseases of the reproductive, urinary, cardiovascular, hepatic, respiratory and gastrointestinal systems [36].

The phytochemicals extracted from B. diffusa belong to a novel class of isoflavonoids known as rotenoids, as well as to the flavonoids, flavonoid glycosides, xanthones, purine nucleosides, lignans, ecdysteroids and steroids. Rotenone, a mitochondrial inhibitor, is the prototype compound of the rotenoid class of isoflavonoid derivatives. The identification, isolation and characterization of these compounds became possible only after the recent development of rapid quantitative estimation methods for the boeravinones of B. diffusa [37]. The roots, and in some tribes the entire plant, are used as a culinary ingredient owing to the vitamin C, vitamin B3, vitamin B2 and calcium content of the roots alone. B. diffusa has also been reported to contain 15 amino acids in the entire plant, 6 of which are essential, and 14 amino acids in the roots alone, 7 of which are essential. The roots are also known to contain isopalmitate acetate, behenic acid, arachidic acid and saturated fatty acids [38]. The present study involves the selection of 9 major phytochemicals of B. diffusa, namely 2-3-4 beta-Ecdysone, Bioquercetin (Quercetin-3-O-robinobioside), Biorobin (Kaempferol-3-O-robinobioside), Boeravinone J, Boerhavisterol, kaempferol, Liriodendrin, quercetin and trans-caftaric acid (Fig. 1). These molecules were docked with the main protease of SARS-CoV-2 to discover novel SARS-CoV-2 inhibitors from B. diffusa that could be potential drugs to treat COVID-19.

Fig. 1 Chemical structures of the ligands selected from B. diffusa (Source: PubChem). a Bioquercetin, b Boeravinone J, c 2-3-4 beta-Ecdysone, d Biorobin, e Trans-caftaric acid, f Liriodendrin, g Boerhavisterol, h Quercetin, i Kaempferol

Materials and methods

Obtaining ligand spatial data

The ligand molecules, namely 2-3-4 beta-Ecdysone, Bioquercetin, Biorobin, Boeravinone J, Boerhavisterol, kaempferol, Liriodendrin, quercetin and trans-caftaric acid, were identified as potential hits from the literature. Their structures were obtained from the PubChem database (https://pubchem.ncbi.nlm.nih.gov/), and their spatial coordinates were downloaded as spatial data files in .SDF format.
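The retrieval step can be scripted. The following is a minimal sketch, assuming PubChem's PUG REST interface is used to fetch each record by name. The helper names `pubchem_sdf_url` and `fetch_sdf` are hypothetical, the choice of 3D records is an assumption (the text does not state whether 2D or 3D coordinates were downloaded), and not every ligand name used in this paper is guaranteed to resolve through a name lookup.

```python
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pubchem_sdf_url(name: str, record_type: str = "3d") -> str:
    """Build the PUG REST URL for a compound's SDF record, looked up by name."""
    encoded = quote(name, safe="")  # e.g. spaces become %20
    return f"{PUG_REST}/compound/name/{encoded}/SDF?record_type={record_type}"


def fetch_sdf(name: str, record_type: str = "3d") -> str:
    """Download the SDF text for one compound (requires network access)."""
    with urlopen(pubchem_sdf_url(name, record_type), timeout=30) as response:
        return response.read().decode("utf-8")


# A few ligand names as written in the text; others may need a PubChem CID.
LIGANDS = ["kaempferol", "quercetin", "trans-caftaric acid"]
for ligand in LIGANDS:
    print(pubchem_sdf_url(ligand))
```

Only the URLs are printed here; calling `fetch_sdf` would perform the actual download.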

Conversion of ligand data to PDB format

The ligands in .SDF format were converted to Protein Data Bank (.PDB) format using the online structure file generator tool of the National Cancer Institute (https://cactus.nci.nih.gov/translate/). During conversion, the parameters were left at their defaults, and the structures were generated in 3D using the Kekulé representation.

Obtaining protein structure

The crystal structure of the target protein, the SARS-CoV-2 (COVID-19) main protease, was obtained from the RCSB Protein Data Bank (PDB ID 6LU7) in .PDB format. Similarly, the crystal structure of the main protease of MERS-CoV was obtained from the RCSB Protein Data Bank (PDB ID 5C3N) in .PDB format for comparative docking.

Uploading target protein and ligands to docking server

The target protein was uploaded to the protein library and all the ligands were uploaded to the ligand library. During the initial cleaning steps, the pH was set to 7 and the other parameters were left at their default values.

Upon successful cleaning and upload, docking was initiated for individual ligands with the target protein Mpro.

Molecular Docking

Docking Server was used to calculate the docking results [39]. Energy minimization of the ligand molecules, namely 2-3-4 beta-Ecdysone, Bioquercetin, Biorobin, Boeravinone J, Boerhavisterol, kaempferol, Liriodendrin, quercetin and trans-caftaric acid, was performed using the MMFF94 force field [40] in Docking Server. Gasteiger partial charges were added to the ligand atoms. Non-polar hydrogen atoms were merged, and rotatable bonds were defined.

Docking of these ligands was calculated against the crystal structures of the SARS-CoV-2 main protease (6LU7) and the MERS-CoV main protease (5C3N) obtained from the RCSB Protein Data Bank. AutoDock tools were used to add essential hydrogen atoms, Kollman united atom type charges and solvation parameters [41]. The AutoGrid program was used to generate affinity (grid) maps of 20 × 20 × 20 Å with 0.375 Å spacing [41]. The van der Waals and electrostatic terms were calculated using the AutoDock parameter set and distance-dependent dielectric functions, respectively.

The Lamarckian genetic algorithm (LGA) and the Solis & Wets local search method were used to perform the docking simulations [42]. The initial orientation, position and torsions of the ligand molecules were set randomly. The results of each docking experiment were derived from 10 different runs, each set to terminate after a maximum of 250,000 energy evaluations. The population size was set to 150. During the search, quaternion and torsion steps of 5 and a translational step of 0.2 Å were applied.

The docking parameters were set with the values 0.2 for tstep, 5.0 for qstep, 5.0 for dstep, 2.0 for rmstol, 150 for ga_pop_size, 250,000 for ga_num_evals, 540,000 for ga_num_generations and 10 for ga_run.
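For reference, these settings correspond to keywords of an AutoDock 4 docking parameter file (DPF). The sketch below, with the hypothetical helper `to_dpf_lines`, renders only the values listed above; a complete DPF needs further entries (grid maps, ligand file and so on) that are not specified in the text, so this is an illustration rather than the file actually used.

```python
# Search settings exactly as stated in the text, keyed by DPF keyword.
DOCKING_PARAMS = {
    "tstep": 0.2,                  # translational step (Å)
    "qstep": 5.0,                  # quaternion step
    "dstep": 5.0,                  # torsion step
    "rmstol": 2.0,                 # RMSD tolerance used for clustering
    "ga_pop_size": 150,            # individuals per generation
    "ga_num_evals": 250000,        # maximum energy evaluations per run
    "ga_num_generations": 540000,  # maximum generations per run
    "ga_run": 10,                  # number of independent LGA runs
}


def to_dpf_lines(params):
    """Render each setting as a 'keyword value' line."""
    return [f"{keyword} {value}" for keyword, value in params.items()]


print("\n".join(to_dpf_lines(DOCKING_PARAMS)))
```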

ADME studies and druglikeness prediction

Absorption, distribution, metabolism and excretion along with toxicity (ADME + T) characteristics were predicted for the top 3 molecules with the lowest binding energies (Biorobin, Bioquercetin and Boerhavisterol) using the pkCSM pharmacokinetics tool (http://biosig.unimelb.edu.au/pkcsm/). The input files for the selected ligands were in .SDF format.

The druglikeness of the top 3 ligands with the lowest binding energies was predicted by screening each molecule’s physical properties against Lipinski’s rule, which requires no more than 5 hydrogen bond donors, no more than 10 hydrogen bond acceptors, a molecular mass not exceeding 500 Da and an octanol–water partition coefficient (log P) of less than 5.
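The screening criteria can be expressed as a short check. The following is a minimal sketch using the thresholds as stated above and the descriptor values reported for the three ligands in the Results section; the names `Descriptors` and `lipinski_violations` are hypothetical and are not part of pkCSM.

```python
from typing import List, NamedTuple


class Descriptors(NamedTuple):
    mol_weight: float   # Da
    log_p: float        # octanol-water partition coefficient
    h_donors: int       # hydrogen bond donors
    h_acceptors: int    # hydrogen bond acceptors


def lipinski_violations(d: Descriptors) -> List[str]:
    """Return the list of Lipinski criteria that the molecule fails."""
    violations = []
    if d.mol_weight > 500:
        violations.append("molecular weight above 500 Da")
    if d.log_p >= 5:  # the text requires log P below 5
        violations.append("log P not below 5")
    if d.h_donors > 5:
        violations.append("more than 5 hydrogen bond donors")
    if d.h_acceptors > 10:
        violations.append("more than 10 hydrogen bond acceptors")
    return violations


# Descriptor values as reported in the Results section of this paper.
LIGANDS = {
    "Biorobin": Descriptors(594.522, -1.392, 9, 15),
    "Bioquercetin": Descriptors(610.521, -1.682, 10, 16),
    "Boerhavisterol": Descriptors(414.718, 8.335, 1, 1),
}

for name, descriptors in LIGANDS.items():
    failed = lipinski_violations(descriptors)
    print(f"{name}: {len(failed)} violation(s): {'; '.join(failed) or 'none'}")
```

Returning the failed criteria, rather than a single pass/fail flag, makes it easy to see which property is responsible when a glycoside exceeds the limits.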

Results and discussions

Lowest binding energies and decomposed energies of all the major interactions

The binding energies (in kcal/mol) of the ligands docked to the target proteins, along with the decomposed energies of each amino acid interacting with the ligand, are given in Table 1.

Table 1 Interaction energies of all the ligands docked with Mpro in ascending order

Visualization of protein–ligand interaction

While this paper targets the ligands to the main protease, recent studies have carried out similar work on other SARS-CoV-2 target proteins such as RNA-dependent RNA polymerase, the viral spike protein [43], angiotensin-converting enzyme 2, endoribonuclease and fusion proteins, among others [44].

A graphical representation of the ligand–protein interaction is depicted in Supplementary Table 1. In the geometric representation, the protein is shown in cartoon form, colored according to its tertiary and quaternary structure. The peptide binding the ligand is illustrated as a cylindrical chain and the ligand itself is visualized in ball-and-stick form. Each carbon-amino acid interaction is numbered and labeled. Moreover, the entire docking is visualized and illustrated in a separate column for each docking. The graphical visualization was performed in PyMOL and Swiss-PdbViewer, and images were recorded at the optimal viewing angle to best show the location and configuration of the protein–ligand interaction.

Analysis of molecular interactions at amino acid level and determination of protein contact HB plots

Supplementary Table 2 depicts the two-dimensional protein–ligand interaction plots, in which the interactions of amino acids with the ligand are illustrated in a 2-D plane showing the location of each interaction with reference to the ligand molecule. The table also contains the hydrogen bond interactions as HB plots, shown in a separate column for each docking.

From the observed protein contact HB plots, it is clear that docking of all the ligands to Mpro occurs either on the alpha helix or on the anti-parallel beta sheets.

Elaborated interaction analytics of Biorobin (lowest binding energy observed) with Mpro

The most efficient docking, with the lowest binding energy, was shown by Biorobin (Kaempferol-3-O-robinobioside). The lowest binding energy for this docking was −8.17 kcal/mol, with an estimated inhibition constant of 1.02 µM. The binding energies at other locations are given in Supplementary Table 3.
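The estimated inhibition constant is linked to the binding energy through Ki = exp(ΔG / (R·T)). The text does not state how the constant was derived, so the sketch below assumes this standard relationship with T = 298.15 K; the function name `inhibition_constant_um` is hypothetical. Under these assumptions, the binding energies reported in this paper for the three top ligands reproduce the reported constants to within roughly one percent, the remaining difference being attributable to rounding of the binding energy.

```python
import math

GAS_CONSTANT = 1.98719e-3  # kcal/(mol*K)
TEMPERATURE = 298.15       # K (assumed; not stated in the text)


def inhibition_constant_um(binding_energy_kcal_mol: float) -> float:
    """Estimate Ki in micromolar from a binding free energy in kcal/mol."""
    ki_molar = math.exp(binding_energy_kcal_mol / (GAS_CONSTANT * TEMPERATURE))
    return ki_molar * 1e6


# (binding energy in kcal/mol, reported Ki in µM) as given in this paper.
REPORTED = {
    "Biorobin": (-8.17, 1.02),
    "Bioquercetin": (-7.97, 1.44),
    "Boerhavisterol": (-6.77, 10.98),
}

for name, (energy, reported_ki) in REPORTED.items():
    computed = inhibition_constant_um(energy)
    print(f"{name}: computed {computed:.2f} µM, reported {reported_ki:.2f} µM")
```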

The total intermolecular energy was found to be −6.34 kcal/mol, with a vdW + Hbond + desolv energy of −6.35 kcal/mol and an electrostatic energy of +0.01 kcal/mol. Biorobin also showed the largest interaction surface of all the ligands docked with Mpro, with a value of 718.884; the key interactions were primarily with the amino acids GLN189 (−1.9235), PRO168 (−1.8902), ALA191 (−0.9144), GLN192 (−0.4619), LEU50 (−0.4883) and MET165 (−0.5752). The interactions are illustrated in Supplementary Table 4.

The ADME + T characteristics of Biorobin are described in Table 2. Biorobin was found to possess 15 hydrogen bond acceptors and 9 hydrogen bond donors, with a molecular weight of 594.522 Da and a log P value of −1.392. It is important to note that although Biorobin fails to obey Lipinski’s rule, it is still a candidate molecule, since Lipinski’s rule is not the sole determinant of the viability of phytochemicals. Additionally, the excess molecular weight and hydrogen bond counts of Biorobin are due to its additional side chains and glycoside substituent. Since Biorobin is essentially a derivative of kaempferol, the ADME characteristics of kaempferol were also tested, and kaempferol was found to obey all of Lipinski’s rules.

Table 2 ADME + T analysis of the top 3 ligands

Elaborated interaction analytics of Bioquercetin with Mpro

The lowest binding energy shown by the Bioquercetin (Quercetin-3-O-robinobioside) docking was −7.97 kcal/mol, with an estimated inhibition constant of 1.44 µM, making it the second most efficient ligand. The binding energies at other locations are given in Supplementary Table 5.

The total intermolecular energy was found to be −3.76 kcal/mol, with a vdW + Hbond + desolv energy of −3.59 kcal/mol and an electrostatic energy of −0.17 kcal/mol. Bioquercetin showed an interaction surface of 528.666 with Mpro, the key interactions being primarily with the amino acids GLN189 (−1.6072) and PRO168 (−0.9041). The interactions are illustrated in Supplementary Table 6.

Table 2 describes the ADME + T data for Bioquercetin. Like Biorobin, Bioquercetin was found to disobey Lipinski’s rule, with a molecular weight of 610.521 Da, 10 hydrogen bond donors, 16 hydrogen bond acceptors and a log P value of −1.682. However, the reasoning applied to Biorobin also holds for Bioquercetin: the deviating values can be attributed to the additional side chains and large substituents in Bioquercetin. Since Bioquercetin is a derivative of quercetin, ADME + T studies were also performed on quercetin; they gave a molecular weight of 302.238 Da and a log P value of 1.988, with 5 hydrogen bond donors and 7 hydrogen bond acceptors, which obeys Lipinski’s rule.

Elaborated interaction analytics of Boerhavisterol with Mpro

The lowest binding energy shown by the Boerhavisterol docking was −6.77 kcal/mol, with an estimated inhibition constant of 10.98 µM, making it the third most efficient ligand. The binding energies at other locations are given in Supplementary Table 7.

The total intermolecular energy of Boerhavisterol was the lowest among all the ligands, at −8.40 kcal/mol; its vdW + Hbond + desolv energy was also the lowest of all ligands, at −8.39 kcal/mol, and its electrostatic energy was −0.01 kcal/mol. Boerhavisterol showed an interaction surface of 664.246 with Mpro, the key interactions being primarily with the amino acids PRO168 (−1.0723), GLN189 (−0.9152), MET165 (−0.7712), GLU166 (−0.5393), LEU167 (−0.6695) and ALA191 (−0.4075). The interactions are illustrated in Supplementary Table 8.
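The energy terms reported for the three top ligands can be cross-checked: in each case the total intermolecular energy should equal the sum of the vdW + Hbond + desolv term and the electrostatic term. The following minimal sketch uses only the values quoted in this section; the helper name `components_consistent` is hypothetical, and the tolerance of 0.005 kcal/mol is an assumption chosen to allow for two-decimal rounding.

```python
# (vdW + Hbond + desolv, electrostatic, total intermolecular), all in kcal/mol,
# as reported for each ligand docked with Mpro.
ENERGY_TERMS = {
    "Biorobin": (-6.35, 0.01, -6.34),
    "Bioquercetin": (-3.59, -0.17, -3.76),
    "Boerhavisterol": (-8.39, -0.01, -8.40),
}


def components_consistent(vdw_hb_desolv, electrostatic, total, tol=0.005):
    """True when the two components add up to the reported total within tol."""
    return abs((vdw_hb_desolv + electrostatic) - total) <= tol


for name, terms in ENERGY_TERMS.items():
    status = "consistent" if components_consistent(*terms) else "inconsistent"
    print(f"{name}: {status}")
```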

The ADME + T analysis of Boerhavisterol may also be found in Table 2. Boerhavisterol has a molecular weight of 414.718 Da, 1 hydrogen bond donor and 1 hydrogen bond acceptor, all within Lipinski’s limits, although its log P value of 8.335 exceeds the limit of 5. With only this single violation, Boerhavisterol remains a suitable candidate drug molecule.

The remaining ligand–Mpro interactions are elaborated in the supplementary section of this paper.

In addition to the above-mentioned target, the top 3 ligands with the lowest binding energies were also docked to the main protease of MERS-CoV, to account for the structural similarity of this protein to the former target and to explore the possibility of antiviral compounds that can inhibit both target proteins. However, the binding energies were found to be positive and too high to favor spontaneous binding to the MERS-CoV Mpro. Biorobin showed a binding energy of +3000 kcal/mol, while Bioquercetin and Boerhavisterol showed binding energies of +103.59 kcal/mol and +40.77 kcal/mol, respectively. Since the binding is not spontaneous, post-docking inhibition and ADME studies for this target would be irrelevant. A BlastP alignment of the two target protein sequences revealed a percentage identity of only 50.65% with 100% query coverage. A score of 322 and an E-value of 5e-115 showed that this alignment is reliable. It may therefore be inferred that although the main proteases of SARS-CoV-2 and MERS-CoV show structural similarity, they differ significantly in sequence. As a result, ligands that dock efficiently with one may not show similar binding energies with the other.
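To make the percentage identity figure concrete, the sketch below shows how the metric is commonly computed for a pairwise alignment: identical aligned positions divided by the alignment length, gaps included. The function name `percent_identity` and the two short aligned sequences are hypothetical illustrations, not the Mpro sequences, so the output does not reproduce the 50.65% reported above.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns in which both sequences share a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment (hypothetical): 4 identical columns out of 7.
toy_a = "MKT-AYI"
toy_b = "MKSQAYL"
print(f"{percent_identity(toy_a, toy_b):.2f}% identity")
```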

The docking results indicated that all the compounds under consideration, namely 2-3-4 beta-Ecdysone, Bioquercetin, Biorobin, Boeravinone J, Boerhavisterol, kaempferol, Liriodendrin, quercetin and trans-caftaric acid, can bind spontaneously to the main protease of SARS-CoV-2, as shown by their negative binding energies. However, most of the molecules showed low interaction surfaces, with the exception of Biorobin (binding energy −8.17 kcal/mol), Bioquercetin (−7.97 kcal/mol) and Boerhavisterol (−6.77 kcal/mol), which were also the compounds with the lowest binding energies among the 9 compounds tested. The high interaction surfaces of these compounds (718.884, 528.666 and 664.246, respectively) contribute to lowering the binding energies by enhancing the van der Waals attraction between the ligand and the target protein. It has also been proposed that filling the dewetted region of the protein increases the entropy.

These binding energies were found to be favorable for efficient docking and resultant inhibition of the viral main protease. Graphical illustrations and visualizations of the docking were obtained, along with the inhibition constant, intermolecular energy (total and decomposed), interaction surface and HB plot for every successful docking of the 9 ligands. ADME + T studies were conducted to assess the druglikeness of the top 3 ligands. Additionally, the binding characteristics of these ligands were analyzed against the structurally similar MERS-CoV Mpro. However, the unfavorable binding energies indicated that ligands that docked efficiently with the SARS-CoV-2 Mpro may not be effective against the Mpro of MERS-CoV. This counterintuitive result emphasizes the need to adapt this docking-based in-silico drug screening and discovery approach to other target proteins of pharmacological importance. From these results, it was concluded that Boerhavia diffusa possesses potential therapeutic properties against COVID-19. However, this conclusion requires further wet-lab investigation, including animal trials, drug formulation and human trials.