1 Introduction

Severe acute respiratory syndrome coronavirus (SARS-CoV) causes a deadly pneumonia in humans (Drosten et al. 2003; Ksiazek et al. 2003). The name “coronavirus” derives from the ‘corona’-like or ‘crown’-like morphology of these viruses in electron-microscopy images (Gui et al. 2017). SARS-CoV emerged in the Guangdong province of China in 2002 and spread to five continents through air travel routes, infecting about 8000 people and causing 774 deaths. Another deadly coronavirus, SARS-CoV-2, emerged in December 2019 in Wuhan, Hubei province of China, and is associated with the ongoing pandemic of atypical pneumonia, COVID-19. The World Health Organization (WHO) has declared the SARS-CoV-2 epidemic a public health emergency of international concern. After China, outbreaks have occurred worldwide, including in Italy, Iran, France, Spain, Germany, the UK, the US, and India. The WHO reported that, as of 22 July 2020, there were about 14.7 million confirmed cases, including 612,000 deaths, in more than 70 countries. In India, there were more than 1.2 million confirmed cases, including 29,890 deaths. The current situation clearly shows that the disease spreads massively within a short period, with thousands of new patients diagnosed daily.

Common symptoms of coronaviral infection include respiratory problems, fever, dry cough, shortness of breath, nasal congestion, sore throat, and diarrhea (Rothan and Byrareddy 2020). In severe cases, the infection causes pneumonia, severe acute respiratory syndrome, kidney failure, and eventually death. The cause of death is respiratory failure, shock, or multiple organ failure. There is no specific treatment for COVID-19 to date. Hence, discovering pharmaceutically active antivirals and/or vaccines specific to SARS-CoV-2 is urgent under the present worldwide crisis.

SARS-CoV-2, the causative agent of COVID-19, is highly homologous to SARS-CoV. The SARS-CoV-2 genome has ten open reading frames (ORFs) (Wu et al. 2020). Based on the sequence alignment of SARS-CoV-2 with SARS-CoV, multiple functional proteins were predicted (Sardar et al. 2020; Wu et al. 2020). ORF1ab encodes polyprotein 1ab (pp1ab). Two proteases, PLPro and 3CLPro, cleave pp1ab at different sites to yield multiple proteins involved in the transcription and replication of viral RNA. The proteolytic processing of pp1ab gives 15 non-structural proteins. ORF2-10 encode viral structural proteins, including the S, M, N, and E proteins, and other auxiliary proteins. The S, M, and E proteins form the viral coat, while the N-protein is essential for packaging the RNA genome.

The transmembrane spike glycoprotein (S-protein) is essential for coronavirus entry into a host cell (Tortorici and Veesler 2019). The S-protein forms a homotrimer that protrudes from the viral surface and comprises two functional subunits, S1 and S2. The S1 domain directly interacts with the host cell receptor, while the S2 domain is responsible for the fusion of the viral and cellular membranes (Li et al. 2003, 2005; Xiao et al. 2003). The S1 domain contains the receptor-binding domain (RBD) and contributes to the stabilization of the prefusion state of the membrane-anchored S2 subunit, which contains the fusion machinery (Gui et al. 2017; Walls et al. 2017). The S1 domain binds angiotensin-converting enzyme 2 (ACE2) of the host to enter the host cell (Song et al. 2018; Lan et al. 2020; Shang et al. 2020; Yan et al. 2020). The S-protein of SARS-CoV is highly homologous to that of the recent pandemic SARS-CoV-2 (Wan et al. 2020). In March 2020, the structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein were reported (Walls et al. 2020). The structure of the full-length spike protein, except the membrane-binding region, was determined by cryo-EM. It revealed that the spike protein of SARS-CoV-2 possesses a few additional features compared with the earlier SARS-CoV strain. The substitution of some RBD residues from SARS-CoV to SARS-CoV-2 (Arg426→Asn439, Tyr484→Gln498, Thr487→Asn501, Tyr442→Leu445, Leu443→Phe456, Phe460→Tyr473, Asn479→Gln493, and Val404→Lys417) was observed to increase the binding affinity of the spike RBD for the ACE2 receptor and to provide a more compact conformation (Shang et al. 2020; Yan et al. 2020). Superposition of ACE2 onto the RBD of the full-length spike protein showed that ACE2 binding is compatible with the open conformation of the RBD, whereas the closed conformation results in steric clashes between the two proteins.

Strategies for targeting SARS-CoV-2 can be broadly divided into two stages: (1) the pre-fusion stage; and (2) the post-fusion stage. Pre-fusion strategies target the recognition of the ACE2 receptor by the RBD of the spike protein (S1 domain) and the subsequent membrane fusion (S2 domain), leading to the development of prophylactic vaccines, antibodies, and antiviral drug compounds (Xia et al. 2020a, b). Post-fusion strategies target viral envelope shedding and replication after viral entry into the host cells. Although SARS-CoV-2 isolates of Indian origin are highly homologous to those from other countries (China, Italy, and the USA), a few mutations were observed in ORF1ab, Nsp2, Nsp3, helicase, ORF8 protein, and S-protein of Indian SARS-CoV-2 (Sardar et al. 2020). As the glycosylated S-protein is exposed on the viral surface and is essential for entry into the host, it can be considered a first-line target for antiviral therapy and vaccine development. While we were preparing this manuscript, a few computational studies identifying small-molecule inhibitors of the S-protein were reported. For instance, chloroquine and its derivatives, including hydroxychloroquine, were shown to bind the S-protein–ACE2 interface (Beura and Chetti 2020). Sandeep and McGregor (2020) also predicted that hydroxychloroquine and azithromycin bind to the S-protein and inhibit the ACE2 interaction. From a library comprising 9091 FDA approved drugs, ivermectin was shown to bind to the S-protein interface (de Oliveira et al. 2020).

To find repurposable drug molecules that inhibit the interaction of the spike protein with the host receptor ACE2, we performed virtual screening of the DrugBank and PubChem libraries against the receptor-binding domain (RBD) of the S-protein and subsequently carried out molecular dynamics simulations on selected compounds. Here, we discuss Bisoxatin (DB09219), a laxative drug, which binds consistently at the S-protein–ACE2 interface and may serve as a repurposable lead for developing new chemical libraries that inhibit SARS-CoV-2 entry into the host.

2 Materials and methods

2.1 Molecular docking

2.1.1 Ligand preparation

For docking against the spike protein, the DrugBank (Wishart et al. 2018) and PubChem (Kim et al. 2019) chemical libraries were used. Both are public online repositories containing information on compounds and their biological activities. Lipinski’s Rule of Five was applied to both libraries to filter the compounds. In addition, only FDA approved drug molecules (1407 molecules) were selected from DrugBank, and chemical and drug molecules annotated with pharmacological actions (6942 molecules) were selected from PubChem. The 3D structures were downloaded in SDF format from the databases and converted to individual PDB files using OpenBabel version 2.4.1 (O’Boyle et al. 2011). The converted compounds were then preprocessed for docking using the Python script prepare_ligand4.py from AutoDock MGLTools 1.5.6 (Morris et al. 2009).
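The Rule of Five filter described above can be sketched in a few lines. This is a minimal illustration, not the script used in the study: the `Descriptors` container, the `max_violations` parameter, and the example compounds are hypothetical, and the descriptor values are assumed to have been computed beforehand by a cheminformatics toolkit. The study does not state whether a single violation was tolerated, so strict filtering is assumed by default.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    name: str
    mol_weight: float       # molecular weight in Da
    logp: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def passes_rule_of_five(d: Descriptors, max_violations: int = 0) -> bool:
    """Return True if the compound breaks at most `max_violations` Lipinski criteria."""
    violations = sum([
        d.mol_weight > 500,        # molecular weight no greater than 500 Da
        d.logp > 5,                # logP no greater than 5
        d.h_bond_donors > 5,       # no more than 5 hydrogen-bond donors
        d.h_bond_acceptors > 10,   # no more than 10 hydrogen-bond acceptors
    ])
    return violations <= max_violations


# Illustrative, made-up descriptor values (not taken from DrugBank or PubChem).
library = [
    Descriptors("compound_a", 350.0, 2.1, 2, 5),
    Descriptors("compound_b", 720.0, 6.3, 6, 12),
]
selected = [d.name for d in library if passes_rule_of_five(d)]
print(selected)
```

Setting `max_violations=1` gives the more permissive reading of the rule, in which one criterion may be exceeded.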

2.1.2 Protein preparation

The crystallographic structure 6LZG (Resolution: 2.50 Å) of the SARS-CoV-2 spike protein receptor-binding domain complexed with the human ACE2 receptor was retrieved for the study (Wang et al. 2020). The structure was selected based on the structure quality and completeness of the RBD domain. In order to prepare the protein for docking, the ACE2 receptor was removed from the complex, and the S-protein was minimized and converted to pdbqt format using AutoDockTools.

2.1.3 Docking parameters

Molecular docking helps predict the predominant binding pose of a ligand with a protein and analyze the interactions between them. Here, molecular docking studies were carried out using AutoDock Vina version 1.1.2 (Trott and Olson 2010). Blind docking was carried out by scanning the entire protein surface to identify ligands that bind specifically at the ACE2–S-protein interaction site. The docking grid was set to size_x = 60.00 Å, size_y = 60.00 Å, and size_z = 60.00 Å and centered at center_x = −32.22, center_y = 25.80, and center_z = 21.19. The exhaustiveness was set to 9, with all other parameters kept at their default values.
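As an illustration of how these settings map onto an AutoDock Vina configuration file, the sketch below writes the grid centre, grid size, and exhaustiveness as `key = value` lines. The helper name `vina_config` and the file names are hypothetical placeholders; only the numerical values come from the text above.

```python
def vina_config(receptor: str, ligand: str, out: str,
                center=(-32.22, 25.80, 21.19),
                size=(60.0, 60.0, 60.0),
                exhaustiveness: int = 9) -> str:
    """Build the text of an AutoDock Vina configuration file."""
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"out = {out}",
        f"center_x = {center[0]:.2f}",
        f"center_y = {center[1]:.2f}",
        f"center_z = {center[2]:.2f}",
        f"size_x = {size[0]:.2f}",
        f"size_y = {size[1]:.2f}",
        f"size_z = {size[2]:.2f}",
        f"exhaustiveness = {exhaustiveness}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical file names, used only for illustration.
config_text = vina_config("rbd_6lzg.pdbqt", "DB09219.pdbqt", "DB09219_out.pdbqt")
print(config_text)
```

The resulting text can be saved to a file and passed to Vina with its `--config` option, one file per ligand.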

2.1.4 Post docking analysis

The docking results were processed using the Python script process_VinaResults.py available in MGLTools. The ligands were evaluated based on their binding free energy and binding site. Only the ligands that bind at the S-protein–ACE2 interface were selected for further evaluation. Interactions between the selected best hits and the macromolecule were analyzed using PyMOL version 1.3 (Schrödinger 2017).
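The ranking step can be sketched as follows. This is not the process_VinaResults.py script itself but a minimal stand-in that reads the `REMARK VINA RESULT:` lines Vina writes into its output PDBQT files and keeps the most favourable affinity per ligand. The function names and the sample text are illustrative assumptions; checking that a pose actually sits at the interface would be a separate step.

```python
from typing import Dict, List, Optional, Tuple


def best_vina_affinity(pdbqt_text: str) -> Optional[float]:
    """Return the lowest (most favourable) affinity in kcal/mol, or None if absent."""
    affinities = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields after the tag: affinity, RMSD lower bound, RMSD upper bound.
            affinities.append(float(line.split()[3]))
    return min(affinities) if affinities else None


def rank_ligands(results: Dict[str, str]) -> List[Tuple[str, float]]:
    """Sort ligands by their best affinity, most favourable first."""
    scored = []
    for name, text in results.items():
        affinity = best_vina_affinity(text)
        if affinity is not None:
            scored.append((name, affinity))
    return sorted(scored, key=lambda item: item[1])


# Made-up output fragments for two hypothetical ligands.
sample = {
    "ligand_A": "MODEL 1\nREMARK VINA RESULT:      -7.5      0.000      0.000\nENDMDL\n",
    "ligand_B": "MODEL 1\nREMARK VINA RESULT:      -6.2      0.000      0.000\nENDMDL\n",
}
print(rank_ligands(sample))
```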

2.2 Molecular dynamic simulations

The docked poses of the selected ligands in complex with the RBD of the S-protein were prepared for atomistic molecular dynamics (MD) simulations (Bowers et al. 2006). The Desmond module from the D.E. Shaw group was used for the MD setup (Bowers et al. 2006). Hydrogen atoms were added to the ligand complexes and optimized wherever required, and the complexes were checked for other atomic penalties and minimized before the preparation of the solvation box. The RBD in complex with each ligand was solvated in a cubic box of water molecules at minimized volume. The solvated water box was generated and minimized using the steepest descent method until a gradient threshold of 1.0 kcal/mol/Å was reached, with a minimum step size of 10. The Coulombic interactions were cut off at 9.0 Å. The default force constant was applied, and no other restraints on the protein and solvent molecules were used during minimization. The production run consisted of 8 stages, including the pre-relaxation of the protein–ligand complexes. The simulations were performed at 300 K and 1.01325 bar pressure. The production run was carried out for a simulation time of 100 ns, and trajectory frames were written every 50 ps. The resulting 2000-frame trajectory was analyzed using both Desmond (Bowers et al. 2006) and Maestro (Release 2017) from the Schrödinger Suite. PyMOL (Schrödinger 2017) was also used for visual inspection of the cluster analysis results.
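The frame count follows directly from the simulation length and the recording interval. The short sketch below reproduces that arithmetic; the helper names are illustrative, and frame indices are assumed to start at zero.

```python
def n_frames(sim_time_ns: float, interval_ps: float) -> int:
    """Number of trajectory frames written for a given run length and interval."""
    return round(sim_time_ns * 1000.0 / interval_ps)


def frame_time_ns(frame_index: int, interval_ps: float) -> float:
    """Simulation time (ns) at which a zero-based frame index was recorded."""
    return frame_index * interval_ps / 1000.0


print(n_frames(100, 50))         # 100 ns written every 50 ps
print(frame_time_ns(1500, 50))   # time point of frame 1500
```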

2.3 Free energy binding calculations

The simulated complexes were subjected to MM-GBSA (molecular mechanics combined with generalized Born and solvent-accessible surface area solvation) calculations to estimate the protein-ligand binding free energies (Massova and Kollman 2000; Genheden and Ryde 2015). The MM-GBSA scoring was performed using the Prime MM-GBSA script available in Prime module v5.6 of the Schrödinger Suite 2019-2 (Lyne et al. 2006; Du et al. 2011). The 100 ns MD trajectories were first stripped of the explicit solvent molecules. The trajectories were then split into a total of 100 snapshots with a step size of 1000 ps. The continuum solvent model VSGB 2.1 (variable-dielectric generalized Born model) and the OPLS3e force field were used for energy evaluations (Li et al. 2011; Roos et al. 2019). To account for the induced-fit effect of the ligand, binding site optimization comprising Prime side-chain predictions and minimizations was also performed. For each frame, the binding free energy of ligand and receptor is estimated using the following equation (1) (Kollman et al. 2000):

$$ \Delta G_{bind} = G_{complex} - G_{protein} - G_{ligand} $$
(1)

where $G_{complex}$, $G_{protein}$, and $G_{ligand}$ are the Prime energies of the optimized complex, free receptor, and free ligand, respectively. The separate free energy terms for complex, protein, and ligand are calculated for each snapshot using equation (2):

$$ \Delta G =\Delta E_{MM} +\Delta G_{solv} - T\Delta S $$
(2)

where $\Delta E_{MM}$ is the average molecular mechanical energy, comprising the electrostatic and van der Waals terms of the molecular mechanical force field as given in equation (3), and $\Delta G_{solv}$ is the solvation free energy obtained by summing the polar and non-polar contributions as in equation (4). The polar solvation free energy is calculated using the generalized Born (GB) model, whereas the non-polar solvation free energy is obtained from a linear relation to the solvent-accessible surface area. $T\Delta S$ is the entropic contribution at absolute temperature $T$, estimated using normal mode analysis of vibrational frequencies.

$$ \Delta E_{MM} =\Delta E_{electrostatic} +\Delta E_{vdw} $$
(3)
$$ \Delta G_{solv} =\Delta G_{polar} +\Delta G_{nonpolar} $$
(4)

The final binding free energy reported is the average over the 100 snapshots. Additionally, to relate the stability of the ligands at the binding site to their binding affinity, the MM-GBSA calculations were performed on representative frames from each cluster.
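The snapshot selection and averaging can be illustrated with a short sketch. It applies equation (1) to each snapshot and averages the result; it does not reproduce the Prime MM-GBSA energy evaluation itself. The function names are hypothetical, and the per-snapshot energies are assumed to be supplied as (complex, protein, ligand) tuples in kcal/mol.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def snapshot_indices(total_frames: int = 2000, frame_interval_ps: int = 50,
                     step_ps: int = 1000) -> List[int]:
    """Frame indices sampled every `step_ps` from a trajectory written every `frame_interval_ps`."""
    stride = step_ps // frame_interval_ps   # 1000 ps / 50 ps = every 20th frame
    return list(range(0, total_frames, stride))


def binding_free_energy(g_complex: float, g_protein: float, g_ligand: float) -> float:
    """Equation (1): dG_bind = G_complex - G_protein - G_ligand."""
    return g_complex - g_protein - g_ligand


def average_binding_energy(snapshots: Sequence[Tuple[float, float, float]]) -> Tuple[float, float]:
    """Mean and sample standard deviation of dG_bind over all snapshots."""
    values = [binding_free_energy(*s) for s in snapshots]
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread


print(len(snapshot_indices()))   # number of snapshots taken from the 2000-frame trajectory
```

Reporting the standard deviation alongside the mean is an addition made here for illustration; the text above reports only the average.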

3 Results and discussion

The receptor-binding domain (RBD) of the SARS-CoV-2 spike protein (S-protein) was selected as the drug target for screening antiviral compounds against the pandemic SARS-CoV-2 (figure 1A). Superposition of the cryo-EM structure of the spike protein in the open state (PDB ID: 6VYB) with the crystal structure of the RBD in complex with the ACE2 receptor (PDB ID: 6LZG) provided a model of the interaction between the two proteins (figure 1B, C). The DrugBank and PubChem databases were used for virtual screening studies (Bolton et al. 2008; Wishart et al. 2018). The compounds considered from both databases fulfilled the Rule of Five criteria (Lipinski 2004). The compounds selected from the docking studies were visually inspected for critical interactions that could disrupt the protein-protein interactions (PPI) between the S-protein of SARS-CoV-2 and the human ACE2 receptor, thus potentially inhibiting the entry of the viral particle into human cells and thereby preventing viral replication. The intermolecular interface region was examined carefully for the docking studies. Blind docking was performed to avoid biased interactions that would be significantly weak in actual binding studies. The binding site was divided into three regions based on the electrostatic surface at the S-protein–ACE2 interface, namely, Site 1, Site 2, and Site 3 (figure 2A). The hydrophilic ‘Site 1’ region comprises the residues Gly446, Tyr449, Gly496, Gln498, Thr500, and Asn501 on the S-protein, which interact with the residues Asp38, Tyr41, Gln42, Lys353, and Asp355 on the ACE2 receptor surface (figure 2B). The moderately hydrophilic ‘Site 2’ region comprises the residues Lys417 and Gln493, which interact with the residues Asp30 and Glu35 of the ACE2 receptor (figure 2C).
The ‘Site 3’ region, which is also moderately hydrophilic, consists of the residues Ala475 and Asn487 on the S-protein, which interact with the residues Ser19, Glu24, and Tyr83 of ACE2 (figure 2D). On further analysis, the topology of the binding site indicated a substantial hydrophilic region at the head end of the ‘Site 1’ region. ‘Site 1’ also houses a prominent hydrophobic cleft formed by the residues Tyr495, Phe497, and Tyr505 (figure 2E). This cleft does not participate in interactions with the ACE2 receptor surface, but targeting it appears to be a viable strategy for designing drugs that bind the S-protein and inhibit the receptor interaction. The cleft is followed by the ‘Site 2’ residues, which provide a good hook position for candidate drug compounds. The ‘Site 2’ region also contains several hydrophilic residues that do not interact with ACE2, such as Arg403, Glu406, and Tyr453, which again offer a possible drug design strategy for inhibiting the ACE2 interaction (figure 2E).
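The site definitions above can be written down as a small lookup table, which makes it easy to see which regions a docked pose touches. The sketch below is only an illustrative bookkeeping aid: the region labels for the cleft and for the Site 2 residues that do not contact ACE2 are names chosen here, and the contact list passed in is assumed to come from a separate interaction analysis.

```python
from typing import Dict, Iterable, List

# S-protein residues per region, as listed in the text.
BINDING_REGIONS: Dict[str, set] = {
    "Site 1": {"Gly446", "Tyr449", "Gly496", "Gln498", "Thr500", "Asn501"},
    "Site 2": {"Lys417", "Gln493"},
    "Site 3": {"Ala475", "Asn487"},
    "Hydrophobic cleft": {"Tyr495", "Phe497", "Tyr505"},
    "Site 2 (no ACE2 contact)": {"Arg403", "Glu406", "Tyr453"},
}


def classify_contacts(residues: Iterable[str]) -> Dict[str, List[str]]:
    """Group a ligand's contact residues by binding-site region."""
    contacts = set(residues)
    grouped: Dict[str, List[str]] = {}
    for region, members in BINDING_REGIONS.items():
        hits = sorted(contacts & members)
        if hits:
            grouped[region] = hits
    known = set().union(*BINDING_REGIONS.values())
    unassigned = sorted(contacts - known)
    if unassigned:
        grouped["Unassigned"] = unassigned
    return grouped


# Example with a hypothetical contact list.
print(classify_contacts(["Gly496", "Asn501", "Tyr505", "Tyr453"]))
```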

Figure 1

Structure of SARS-CoV-2 spike protein. (A) A cartoon representation of the receptor-binding domain (RBD domain) of the spike protein (PDB ID: 6LZG). (B) A cartoon representation of the superposed cryoEM structure of trimeric spike protein (PDB ID: 6VYB) (in green, cyan, and pink ribbons) with the crystal structure of the ACE2 receptor (blue ribbon) in complex with the RBD domain of the spike protein (green ribbon) (PDB ID: 6LZG). (C) Surface representation of the superposed structures shown in (B).

Figure 2

The interface between the viral spike protein and ACE2. (A) Cartoon representation of the Spike-ACE2 protein interactions (in green and light blue ribbons, respectively). The interaction surface is divided into three regions – ‘Site 1’, ‘Site 2’, and ‘Site 3’. The interacting residues are shown as sticks (pink – ACE2 receptor; green – Spike protein). (B), (C) and (D) The interactions between the Spike protein and the ACE2 receptor are shown for the regions ‘Site 1’, ‘Site 2’, and ‘Site 3’. (E) The residues in and around ‘Site 1’ and ‘Site 2’ form a good binding site for ligand interactions. The topology of the binding site can be divided into the hydrophobic cleft, ‘hook 1’, and ‘hook 2’ regions. The hydrophobic cleft is a hotspot for the binding of compounds with strong aromatic cores. The ‘hook 1’ and ‘hook 2’ regions, which contribute hydrophilic interactions, provide good support to increase interactions.

3.1 DrugBank library

For the DrugBank library, the docking studies revealed that the top-binding compounds (seven ligands) have docking scores ranging between −7.5 and −7.0 kcal/mol (table 1). The compounds showing site-specific interactions were visually inspected to determine their potential to hamper the ACE2 receptor interactions. Four compounds were selected for MD simulation studies based on the interactions described below. The top seven compounds, including those selected for the MD study, are described below.

Table 1 Top seven compounds from the docking studies of the DrugBank database

3.1.1 Mefloquine (DB00358)

Mefloquine, an antimalarial drug, is highly active against Plasmodium falciparum, including malarial parasites resistant to chloroquine, which makes it a highly efficient drug against malaria (Palmer et al. 1993; Nosten et al. 2000). The compound shows good hydrophilic interactions with the S-protein, spanning part of the ‘Site 1’ region and extending into the ‘Site 2’ region (figure 3A). The two trifluoromethyl moieties on the quinolinyl core interact significantly with the ‘Site 1’ residues (Gly496, Asn501, and Tyr505) and the ‘Site 2’ residues (Tyr453 and Ser494). The piperidyl methanol moiety interacts with the residue Asn501. Additionally, the quinolinyl core and piperidyl ring stack on the residue Tyr505 through edge-to-face and face-to-face π-π interactions. A substantial loss of the intermolecular interaction with Lys353 of ACE2 may occur when Mefloquine binds to the S-protein at the predicted binding site.

Figure 3

Docking interactions of top compounds from the DrugBank database. The intermolecular interactions of the best docking poses of compounds (A) Mefloquine (DB00358), (B) Hetacillin (DB00739), (C) Ketoprofen (DB01009), (D) Phenolphthalein (DB04824) and Bisoxatin (DB09219), (E) Riociguat (DB08931) and (F) Doravirine (DB12301). The interacting residues are shown as sticks.

3.1.2 Hetacillin (DB00739)

Hetacillin is a β-lactam antibiotic that was used to treat bacterial infections (Hardcastle Jr et al. 1966; Smith and Hamilton-Miller 1970). It was later withdrawn as it offered a lesser therapeutic value than the ampicillin derivatives. The compound is positioned in the hydrophobic cleft and the hook region of ‘Site 2’ (figure 3B). The imidazolidine moiety interacts with Tyr453, and the azabicyclo part of hetacillin interacts extensively with Glu406 of the ‘Site 2’ region. The phenyl moiety contributes to hydrophobic interaction with Tyr505. The intermolecular interaction of Lys353 of ACE2 may also be inhibited when Hetacillin binds to S-protein at the predicted binding site.

3.1.3 Ketoprofen (DB01009)

Ketoprofen is a drug used to treat rheumatoid arthritis and osteoarthritis (Sardana et al. 2017). The hydratropate moiety of the compound interacts with the ‘Site 2’ residues Arg403 and Tyr453 (figure 3C). The benzoyl and hydratropate moieties wrap around the residue Tyr505 through π-π stacking interactions. The hydroxyl group from the benzoyl moiety interacts with Gly496. Although the compound occupies the binding site, it does not disrupt any essential interactions at the spike-ACE2 interface, except the Lys353 interaction, which may be disrupted by the compound.

3.1.4 Phenolphthalein (DB04824)

Phenolphthalein is a laxative drug used for alimentary tract clearance before the surgical procedure (Gaginella et al. 1994). Though it has been discontinued in Canada on the basis of being carcinogenic in nature, many countries still continue the usage of the drug (Coogan et al. 2000). The compound consists of twin hydroxylphenyl moieties, one of which, along with the benzofuranone portion, encloses the Tyr505 through π–π stacking interactions (figure 3D). The other hydroxylphenyl group substantially occupies the region where ACE2 interacts in the ‘hook 1’ region. The predicted docking pose of the compound wards off many important hydrophilic interactions that occur at the spike-ACE2 interface.

3.1.5 Bisoxatin (DB09219)

Bisoxatin, also a laxative drug, is used as a stimulant of intestinal peristalsis to treat constipation and to prepare the colon for surgical procedures (Rider 1971). Overlaying the docked poses in the binding site of the S-protein reveals a binding mode similar to that of Phenolphthalein (figure 3D). The benzoxazinone moiety of Bisoxatin and the benzofuranone portion of Phenolphthalein overlap each other in the binding site. A weak hydrogen bond is observed between the ligand and Thr500. As described above for Phenolphthalein, the hydroxylphenyl groups are positioned to make π-π interactions with Tyr505 and to block the site where the ACE2 residues interact at the ‘hook 1’ region.

3.1.6 Riociguat (DB08931)

Riociguat is used to treat pulmonary arterial hypertension and chronic thromboembolic pulmonary hypertension to improve exercise capacity and delay clinical worsening (Ghofrani et al. 2013a, b). The pyrimidinyl and methyl carbamate groups interact with Tyr453, whereas the pyrazolopyridine and fluorobenzyl groups stack against Tyr505 through π-π interactions (figure 3E). The fluorine atom of the fluorobenzyl component of the drug interacts with the residue Asn501 of the ‘Site 1’ region.

3.1.7 Doravirine (DB12301)

Doravirine is an HIV-1 non-nucleoside reverse transcriptase inhibitor used for the management of HIV-1 infection (Colombier and Molina 2018). The benzonitrile group, along with the trifluoromethyl pyridine group, forms hydrophobic interactions with Tyr505. The trifluoromethyl group interacts with the residues Gly496, Asn501, and Tyr505. The triazole moiety establishes hydrophilic interactions with Tyr453 and Gln493 (figure 3F).

3.2 PubChem library

For the PubChem library, docking studies revealed the best hits with binding free energies ranging from −8.9 to −7.6 kcal/mol, which are listed along with their functions in table 2. The interaction analysis of the top seven hits that demonstrated site-specific binding, including the three compounds selected for the molecular dynamics study, is given below.

Table 2 Top seven compounds from the docking studies of the PubChem Database

3.2.1 CID135565082

Talazoparib is an antineoplastic agent that selectively binds to poly (ADP-ribose) polymerase (PARP) enzymes and inhibits PARP-mediated DNA repair (Hoy 2018). The fluorobenzene rings wrap around Tyr505, forming π–π and π–π T-shaped aromatic interactions (figure 4A). The piperidine amine engages in hydrogen bonding with the ‘Site 1’ residues Gly496 and Asn501.

Figure 4

Docking interactions of top compounds from PubChem database. The intermolecular interactions of the best docking poses of compounds (A) Talazoparib (CID135565082), (B) CX-659S (CID9869053), (C) Naphthalenenitrile 9ac/L-708,780 (CID9890128), (D) Lensiprazine (CID9954003), (E) Bisindolylmaleimide VIII (CID2403), (F) Bisindolylmaleimide X (CID2404), and (G) Silymarin (CID7073228). The interacting residues are shown as sticks.

3.2.2 CID9869053

CX-659S, a diaminouracil derivative, possesses potential anti-oxidative and anti-inflammatory activities (Goto et al. 2002). The trimethylphenol moiety of the compound is involved in multiple interactions with the ‘hook 1’ residues, Asn501 and Gly496, and the residues of the hydrophobic cleft (figure 4B). The formation of π-π T-shaped interactions and hydrogen bonding with the carbonyl group of Tyr505 anchors the compound at the binding site. The amine groups form hydrogen bonds with Tyr453 in ‘Site 2’. Although the compound forms intermolecular interactions with the binding site residues, it does not hinder the ACE2–S-protein interactions.

3.2.3 CID9890128

Naphthalenenitrile 9ac (L-708,780) is an inhibitor of lipoxygenase enzyme activity that prevents the oxidation of arachidonic acid to 5-hydroperoxyeicosatetraenoic acid by 5-lipoxygenase (Delorme et al. 1996). It also reduces leukotriene B4 biosynthesis. The naphthonitrile and benzene moieties display π-π stacked and π-π T-shaped interactions with Tyr505. The dioxabicyclooctanol moiety interacts with Gly496 in ‘Site 1’ and the ‘hook 2’ residue, Arg403 (figure 4C). It also forms a hydrogen bond with Ser494. Moreover, the compound interacts with Asn501 of ‘Site 1’ and completely occupies the position of Lys353 of ACE2. The docked pose of the ligand at ‘Site 1’ suggests that it may significantly disrupt the interaction of the S-protein with ACE2.

3.2.4 CID9954003

Lensiprazine is an antipsychotic agent with the bifunctional activity of dopamine D2 receptor antagonism and serotonin reuptake inhibition (Smid et al. 2005). The docked pose of the compound, which overlaps with the ACE2 interacting region, suggests that the compound may inhibit the ACE2 interaction with the S-protein. The methylbenzoxazinone moiety interacts with the ‘Site 1’ residues Gly496 and Asn501 and displays π–π T-shaped interactions with Tyr505 (figure 4D). The fluoroindole moiety displays significant interactions with the ‘Site 2’ residue Tyr453 and π-alkyl interactions with Lys417. The amine group of fluoroindole forms a hydrogen bond with Glu406.

3.2.5 CID2403

Bisindolylmaleimide VIII has lower selectivity and potency towards protein kinase C than CID2404 due to the presence of an amine side chain (Muid et al. 1991). It is listed in a patent on nucleic acids and proteins from the severe acute respiratory syndrome coronavirus (US2006257852) (Rino et al. 2004). The propane amine group attached to the indole ring interacts with Tyr453 of ‘Site 2’ (figure 4E). The indole rings stack on Tyr505 through π-π interactions.

3.2.6 CID2404

Bisindolylmaleimide X is an inhibitor of protein kinase C with potential anti-inflammatory and anti-asthmatic activities (Muid et al. 1991). The compound belongs to the class of organic compounds known as n-alkylindoles. The indole rings stack around Tyr505, forming strong edge-to-face and face-to-face aromatic interactions (figure 4F). It is also listed in patent US2006257852 (Rino et al. 2004). The indole ring and pyrroledione moiety form hydrogen bonds with Tyr505 and Gly496, respectively. The position of the compound and its interactions at ‘Site 1’ of the S-protein suggest significant disruption of the ACE2–spike protein interactions. Both CID2404 and CID2403 interrupt the ACE2 Lys353 interactions, and their binding poses indicate that they might be able to break the ‘Site 1’ interactions with the ACE2 receptor.

3.2.7 CID7073228

Silymarin is a phytomedicine extracted from milk thistle seeds. Along with its hepatoprotective effects, it is also known for anti-oxidant, antiviral, anti-inflammatory, and anti-fibrotic activities (Soleimani et al. 2019; Xie et al. 2019). It exhibits interactions with both ‘Site 1’ and ‘Site 2’ residues, thereby blocking the interactions of His34 and Lys353 of ACE2 (figure 4G). The methoxyphenol moiety stacks with Tyr505 through an edge-to-face aromatic interaction and forms several hydrogen bonds with the ‘Site 1’ residues (Gly496, Asn501, and Tyr505). The dihydrobenzodioxinylmethanol moiety forms interactions with Gly496 and Ser495 in ‘Site 1’ and with the ‘Site 2’ residues (Arg403 and Tyr453). Additionally, the resorcinol group engages in hydrophobic interactions with the aliphatic side chain of Lys417 in ‘Site 2’.

3.3 Analysis of MD simulations

3.3.1 DrugBank molecules

Further, MD studies were performed on the selected DrugBank molecules – Mefloquine (DB00358), Phenolphthalein (DB04824), Bisoxatin (DB09219), and Doravirine (DB12301) – to evaluate the stability of the protein-ligand complexes. The selected protein-ligand complexes were solvated in a cubic box of water with minimized volume and set up for a simulation time of 100 ns. The basic parameters and the detailed interactions with the ligands were inspected to identify the best interacting ligand. Analysis of the RMSD of both the protein backbone and the ligand atoms revealed pronounced changes in the backbone of the spike protein during the MD run of the DB12301 complex (figure 5A). The protein backbone RMSD in this complex shows a stepwise increase. Also, the ligand RMSD of DB12301 is higher and fluctuates considerably compared to the other ligands. The other ligand complexes exhibit a stable RMSD pattern with minor deviations in the protein backbone RMSD throughout the simulation time. The root mean square fluctuation (RMSF) calculation revealed that the DB12301 complex showed very high fluctuation in most regions, in accordance with the RMSD pattern (figure 5B). Subsequently, it was observed that the unstructured loop near the ‘Site 3’ region, comprising Gln474–Asn487, contributed to the elevated RMSD values not only in DB12301 but also, to varying degrees, in the other ligand complexes (figure 5C). Further, the ligand RMSD with respect to the protein structure revealed instability in the binding interactions of the ligands DB00358, DB04824, and DB12301 (figure 5D). Intriguingly, the drug Bisoxatin (DB09219) exhibits consistent interactions in the targeted binding pocket. Trajectory cluster analysis revealed the percentage of simulation time that the ligands spent at different binding positions throughout the run.
The cluster analysis shows that, unlike DB09219, the ligands DB00358, DB04824, and DB12301 do not remain consistently at the binding site (figure 6). The cluster ratios for the molecules are given in table 3. Phenolphthalein (DB04824) and Doravirine (DB12301) show poor ratios of 62:38 and 15:85, respectively. Mefloquine (DB00358) does not bind to the targeted site at all. However, Bisoxatin (DB09219) occupies the ‘Site 1’ and ‘Site 2’ binding regions of the S-protein consistently throughout the simulation time.
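A cluster ratio of this kind can be computed from per-frame cluster assignments, as sketched below. The sketch assumes that the ratio is the percentage of trajectory frames falling in each cluster, and the frame counts in the example are constructed only to reproduce a 62:38 split; they are not the study's data.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def cluster_ratio(labels: Sequence[Hashable]) -> Dict[Hashable, int]:
    """Percentage of frames assigned to each cluster, rounded to whole numbers."""
    if not labels:
        raise ValueError("at least one frame label is required")
    counts = Counter(labels)
    total = len(labels)
    return {cluster: round(100 * n / total) for cluster, n in counts.most_common()}


# Constructed example: 2000 frames, 1240 in cluster 0 and 760 in cluster 1.
example_labels = [0] * 1240 + [1] * 760
print(cluster_ratio(example_labels))
```

A ligand that stays in one pose throughout the run would yield a single cluster with a ratio of 100.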

Figure 5

Analysis of the molecular dynamics (MD) simulations for the selected molecules from the DrugBank database. (A) Root mean square deviation (RMSD) analysis of the protein backbone for the complexes – Apo (grey), DB00358 (dark blue), DB04824 (red), DB09219 (purple) and DB12301 (dark green). RMSD analysis of the ligands for the complexes – DB00358 (light blue), DB04824 (yellow), DB09219 (pink), and DB12301 (light green). (B) Root mean square fluctuation (RMSF) plot for the protein backbone atoms in the selected ligand complexes. (C) A cartoon representation of the RBD domain of S-protein (green ribbon), indicating the unstructured loop (Gln474-Asn487), shown in red ribbon, that reveals the highest patch of fluctuation in the protein backbone. (D) RMSD plot of the ligand with respect to the protein position, showing the consistency of ligand binding to the targeted site.
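For reference, the two quantities plotted in figure 5 can be computed as sketched below. This is a plain-Python illustration of the standard definitions, not the Desmond analysis itself; it assumes that every frame has already been superposed on the reference structure and that coordinates are given in Å as (x, y, z) tuples.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(reference: Sequence[Coord], frame: Sequence[Coord]) -> float:
    """Root mean square deviation of one frame from the reference coordinates."""
    if not reference or len(reference) != len(frame):
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = sum((a - b) ** 2 for p, q in zip(reference, frame) for a, b in zip(p, q))
    return math.sqrt(total / len(reference))


def rmsf(trajectory: Sequence[Sequence[Coord]]) -> List[float]:
    """Per-atom root mean square fluctuation about the time-averaged position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    fluctuations = []
    for i in range(n_atoms):
        mean_pos = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        msd = sum(
            sum((frame[i][k] - mean_pos[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        fluctuations.append(math.sqrt(msd))
    return fluctuations


# Two-atom toy system displaced by 1 Å along z in the second frame.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
moved = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(ref, moved), rmsf([ref, moved]))
```

RMSD gives one value per frame and tracks overall drift over time, whereas RMSF gives one value per atom and highlights flexible regions such as the Gln474-Asn487 loop.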

Figure 6

Trajectory cluster analysis of the ligand complexes. The trajectories were clustered and analyzed to validate the interaction throughout the MD timeline. The clustered positions of the ligands are shown: (A) Mefloquine (DB00358), (B) Phenolphthalein (DB04824), (C) Bisoxatin (DB09219) and (D) Doravirine (DB12301). The time span of existence for the clusters is also given for each ligand. The cluster sizes are given in table 3.

Table 3 Trajectory cluster analysis – DrugBank Compounds

A comprehensive analysis of the polar and non-polar interactions established by the molecule DB09219 was performed over the whole simulation. The analysis revealed important electrostatic interactions around the ‘Site 1’ region. The majority of the interactions were formed by the residues Arg403, Ser494, Gly496, and Asn501 through direct and water-mediated hydrogen bond contacts (figure 7A, B), similar to the RBD-ACE2 contact (Malik et al. 2020). The residues Tyr453, Gln494, Gly495, and Tyr505 contribute to the water-mediated hydrogen bonding through most of the simulation. Based on the RMSD, cluster, contact and visual analyses, the simulation shows a slight change in the binding site around 75 ns, which gives rise to two different cluster conformations as well as differing contact profiles. We speculate that this flip establishes new contacts with residues Gly446, Gly447, and Tyr449. Additionally, the hydrophobic contact pattern found with the residues Tyr449 and Tyr505 is strongly complementary, suggesting that the ligand is under constant hydrophobic interaction (figure 7C). Taken together, the MD studies on these drug molecules indicate that Bisoxatin might be a possible lead towards inhibition of Spike-ACE2 interactions.

Figure 7

Contact analysis of DB09219. (A) Hydrogen bond analysis for DB09219 revealed significant interactions with the residues Arg403, Ser494, Gly496, and Asn501. (B) Water-mediated hydrogen bond analysis revealed the involvement of the residues Tyr453, Gln494, Gly495 and Tyr505 in addition to the residues in (A). (C) Hydrophobic contact analysis reveals the complementing nature of Tyr449 and Tyr505 in establishing continuous hydrophobic interaction with the molecule DB09219.

3.3.2 PubChem compounds

Although the PubChem compounds showed good binding energies, most of them tend to bind strongly in the hydrophobic cleft and do not interfere with the ACE2 – Spike protein interactions. Therefore, only two compounds (CID2404 and CID9890128) were selected for the simulation study.

The RMSD of the protein backbone and ligand atoms was calculated for the RBD of the SARS-CoV-2 Spike protein in complex with the PubChem compounds CID2404 and CID9890128 against their initial structures (figure 8A). Both complexes showed no significant fluctuations over the 100 ns simulation time and stabilized at ~2.5 Å. To check the stability of the compounds at the binding site predicted by docking, the ligand RMSD with respect to the protein was calculated and analyzed. The CID9890128 complex showed very high fluctuation from 20 to 55 ns, pointing to extreme changes in ligand binding (figure 8B). The CID2404 complex shows a stable RMSD after 20 ns of simulation (figure 8B). The higher RMSD values for both compounds compared to the initial frame indicate a change from the initial binding position. The RMSF gives the average mobility of each residue in a structure over the simulation. The RMSF of each complex was calculated and plotted against the residue number (figure 8C). Similar to the results for the DrugBank compounds, high fluctuation is observed in the loop region Gln474-Asn487. The ligand positioning was validated through trajectory cluster analysis (figure 8D). The number of clusters and the cluster ratios for each ligand complex are given in table 4. The ligands CID2404 and CID9890128 exhibit poor cluster ratios, as they do not remain bound to the target site; in the latter case, the protein interactions are disrupted completely.
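The two quantities used above can be illustrated with a minimal sketch. It assumes that every frame has already been superposed onto the reference structure, so no fitting step is included, and that coordinates are plain lists of (x, y, z) tuples in Å. The function names and toy coordinates are hypothetical and do not reproduce the simulation software used in this work.

```python
import math


def rmsd(coords, ref):
    """RMSD between two equally sized lists of (x, y, z) positions."""
    if not coords or len(coords) != len(ref):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum((a - b) ** 2
                  for p, q in zip(coords, ref)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(coords))


def rmsf(frames):
    """Per-atom RMSF about each atom's mean position over all frames."""
    if not frames:
        raise ValueError("frames must not be empty")
    n_frames = len(frames)
    result = []
    for i in range(len(frames[0])):
        # Mean position of atom i over the trajectory.
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared displacement of atom i from that mean position.
        msf = sum((f[i][k] - mean[k]) ** 2
                  for f in frames for k in range(3)) / n_frames
        result.append(math.sqrt(msf))
    return result


# Toy two-atom system: the first atom is static, the second moves along x.
frame0 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame1 = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
print(rmsd(frame1, frame0))    # deviation of frame1 from the initial frame
print(rmsf([frame0, frame1]))  # per-atom fluctuation: [0.0, 1.0]
```

RMSD is computed per frame against one reference and therefore tracks drift over time, while RMSF is computed per atom over all frames and therefore highlights mobile regions such as the Gln474-Asn487 loop.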

Figure 8

Analysis of the molecular dynamics (MD) simulations for the selected molecules from the PubChem database. (A) RMSD of the RBD S-protein backbone in the CID2404 complex (red), the CID9890128 complex (dark green), and the apo-protein (grey). RMSD of the ligands in the CID2404 (light red) and CID9890128 (light green) complexes is also included. (B) RMSD of the ligand position with respect to the protein in the complexes of CID2404 (red) and CID9890128 (green). (C) RMSF of the RBD S-protein backbone in the CID2404 complex (red), the CID9890128 complex (green) and the apo RBD of the spike protein (grey). (D) Trajectory cluster analysis of the CID2404 complex. (i) Clustered position of the ligand for the initial 25 ns of the simulation period. (ii) Clustered position of the ligand for 30-200 ns of the simulation period.

Table 4 Trajectory cluster analysis- PubChem Compounds

To determine the interactions formed by the compounds with the residues at the ACE2 – S-protein binding site, hydrogen bond profiles were calculated between the compounds and the selected residues. Overall, CID2404 forms three hydrogen bonds, and CID9890128 forms two hydrogen bonds. The compound CID2404 interacts with the ‘Site 1’ residues Thr500 and Gln498 and the ‘Site 2’ residue Ser494 during the simulation time (not shown). It was observed that the compound moves away from the initial binding site towards ‘Site 3’ (figure 8D). After 25 ns, it loses the initial interactions with Tyr505 and comes into close proximity to Leu452, Phe490, and Leu492, forming aromatic interactions. The compound CID9890128 initially shows interactions with Ser494 and Thr500. Its cluster analysis data show that CID9890128 moves away from the binding site (not shown). The initial interactions observed with Tyr505 were lost during the course of the simulation for both compounds.
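A hydrogen bond profile of this kind is commonly summarized as an occupancy, that is, the fraction of frames in which a given residue is hydrogen-bonded to the ligand. The sketch below assumes that the per-frame hydrogen bond partners have already been detected by an analysis tool; the function name `hbond_occupancy` and the four example frames are hypothetical and are not results from this study.

```python
from collections import Counter


def hbond_occupancy(frames):
    """Fraction of frames in which each residue hydrogen-bonds the ligand."""
    if not frames:
        raise ValueError("frames must not be empty")
    counts = Counter()
    for residues in frames:
        # A residue counts once per frame, however many bonds it forms.
        counts.update(set(residues))
    return {res: n / len(frames) for res, n in counts.items()}


# Hypothetical per-frame hydrogen bond partners of a ligand.
frames = [
    {"Ser494", "Thr500"},
    {"Ser494"},
    {"Gln498"},
    set(),  # no hydrogen bonds detected in this frame
]
for residue, fraction in sorted(hbond_occupancy(frames).items()):
    print(residue, fraction)
```

Computing the same occupancy over separate time windows, for instance before and after 25 ns, would show which contacts are lost as a ligand drifts from its initial site.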

3.4 MM-GBSA analysis

The MM-GBSA analysis was performed for the four selected ligands from the DrugBank database and the two selected ligands from the PubChem database. The trajectories of the selected protein-ligand complexes were sliced into 100 periodic snapshots, each of which was used for Prime MM-GBSA calculations in the Schrödinger suite. The total binding energy trend of each complex at an interval of 1 ns per snapshot is given in figure 9. The running average values of the different components of the calculated MM-GBSA binding free energies are shown in table 5. Analysis of the individual energy components indicates that the van der Waals energies and the electrostatic energies from coulombic interactions form the major contribution to the binding free energy. The molecules DB09219 and CID2404 showed the lowest binding free energy values, with highly favorable van der Waals energies indicating the establishment of hydrophobic interactions with the surrounding residues. This large energy component is consistent with the proximity of the ligand to the hydrophobic cleft, as discussed earlier. Additionally, the lipophilic energies appear to support the van der Waals interactions. These hydrophobic contacts play a crucial role in the stability of the ligands at the binding pocket.
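The snapshot selection and running average described above can be sketched as follows. This is an illustration under stated assumptions: the trajectory is taken to hold 1000 frames, the running average is taken to be cumulative rather than windowed, and the energy values are invented. The function names are hypothetical and are not part of the Prime MM-GBSA interface.

```python
def snapshot_indices(n_frames, n_snapshots):
    """Evenly spaced frame indices, ending on the last frame."""
    if not 0 < n_snapshots <= n_frames:
        raise ValueError("need 0 < n_snapshots <= n_frames")
    # Integer arithmetic keeps the spacing exact.
    return [((i + 1) * n_frames) // n_snapshots - 1
            for i in range(n_snapshots)]


def running_mean(values):
    """Cumulative mean of the values seen so far, one entry per value."""
    means, total = [], 0.0
    for count, value in enumerate(values, start=1):
        total += value
        means.append(total / count)
    return means


# Assumed 1000-frame trajectory covering 100 ns: one snapshot per 1 ns.
indices = snapshot_indices(1000, 100)
print(indices[:3], indices[-1])  # [9, 19, 29] 999

# Invented binding free energies (kcal/mol) for three snapshots.
print(running_mean([-40.0, -50.0, -60.0]))  # [-40.0, -45.0, -50.0]
```

A cumulative mean smooths snapshot-to-snapshot noise, so a sustained shift in one energy component still appears as a gradual change in the averaged curve.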

Figure 9

MM-GBSA analysis. The dG values for the total (black), coulomb (orange), lipophilic (green), and van der Waals (blue) energy for (A) DB09219 and (B) CID2404.

Table 5 Binding free energy (in kcal/mol) results obtained from MM-GBSA analysis

The electrostatic contribution to the binding energy is comparable for all compounds except CID9890128 and DB00358 (table 5). The coulombic trend in the best-binding compound, DB09219, revealed an energy dip after 70 ns of the trajectory, indicating the loss of electrostatic interactions. Interestingly, a comprehensive hydrogen bond analysis indicates that the hydrogen bond trend of Gly496 and Asn501 (figure 7A) matches the coulombic trend (figure 9A). Visual inspection revealed that the loss of this interaction is accompanied by a slight change in the binding site. Since the residues Gly496 and Asn501 are among the essential residues in the RBD-ACE2 interaction at ‘Site 1’, we speculate that they might be critical residues for competitive binding at the interface.

In addition, the binding energy analysis was performed for the clustered poses to distinguish the frames that contribute to the energy values from those that deplete them. For the DrugBank compounds DB00358 and DB04824, the total binding energy values indicate that the larger clusters contribute to the depletion of the overall average energy values (table 3). For the compound DB12301, Cluster #1, which has the largest cluster size, exhibits higher binding energy compared to the other clusters. However, the significant energy values of DB09219 are consistent across both clusters and better than those of the other ligands, suggesting that DB09219 indeed binds substantially to the S-protein. For the PubChem compounds, CID2404 shows consistent binding energy values between its two clusters, but these are moderately lower than those for DB09219 (table 4).
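The per-cluster comparison can be sketched as a grouping of snapshot energies by cluster label followed by an average within each group. The sketch assumes one cluster label and one binding free energy per snapshot; the function name `mean_energy_by_cluster` and the numbers are hypothetical and are not taken from tables 3 or 4.

```python
from collections import defaultdict


def mean_energy_by_cluster(labels, energies):
    """Average binding free energy of the snapshots in each cluster."""
    if len(labels) != len(energies):
        raise ValueError("labels and energies must have the same length")
    groups = defaultdict(list)
    for label, energy in zip(labels, energies):
        groups[label].append(energy)
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}


# Invented example: two clusters with two snapshots each (kcal/mol).
labels = [1, 1, 2, 2]
energies = [-50.0, -54.0, -30.0, -34.0]
print(mean_energy_by_cluster(labels, energies))  # {1: -52.0, 2: -32.0}
```

Similar averages across clusters would correspond to the consistent behavior reported for DB09219, whereas a large gap between clusters would indicate that one pose dominates the overall average.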

In conclusion, virtual screening, docking, molecular dynamics simulation and MM-GBSA studies of the DrugBank and PubChem libraries against the receptor-binding domain (RBD) of the S-protein yielded a robust repurposable drug molecule, Bisoxatin, which binds significantly to the RBD at the RBD – ACE2 interface and may thereby inhibit the binding of ACE2 to the Spike protein. Thus, we propose that the hit molecule, Bisoxatin, can be used as a lead molecule to develop new chemical libraries of inhibitors of the SARS-CoV-2 Spike protein that prevent its interaction with the host cell.