Introduction

In December 2019, a newly identified coronavirus disease (COVID-19) emerged in Wuhan, China, and rapidly developed into a global pandemic. Coronaviruses are a large group of viruses belonging to the family Coronaviridae. Based on genomic structures and phylogenetic relationships, the subfamily Coronavirinae includes four genera, namely, α-coronavirus, β-coronavirus, γ-coronavirus, and δ-coronavirus (Woo et al. 2012). The newly identified coronavirus is named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and is categorized into the genus β-coronavirus (Hui et al. 2020), members of which cause respiratory and intestinal infections in animals and humans (Vijay and Perlman 2016). The genome sequence of SARS-CoV-2 shares approximately 79% and 50% identity with those of severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV), respectively (Lu et al. 2020). However, there are significant differences in disease transmission and pathophysiology among these three infectious diseases (Cruz et al. 2020; Huang et al. 2020; Wang et al. 2020). Studies have revealed that the infectivity of SARS-CoV-2 is markedly higher than that of other members of the Coronaviridae family. It is now known that SARS-CoV-2 is closely related to the other two coronaviruses, MERS-CoV and SARS-CoV (World Health Organization 2020; Tai et al. 2020). However, no antiviral medications or vaccines have yet been approved for the treatment and prevention of SARS-CoV-2 infection. Coronavirus particles are mainly composed of four structural proteins: spike (S), envelope (E), membrane (M), and nucleocapsid (N) (Zhou et al. 2018; Cui et al. 2019). Angiotensin-converting enzyme 2 (ACE2) is a key enzyme to which SARS-CoV and several other coronaviruses bind in order to enter lung epithelial cells (Kirchdoerfer et al. 2018; Song et al. 2018).
The most recent findings suggest that SARS-CoV-2 binds ACE2, which is expressed on the surface of host cells, by means of the receptor-binding domain of its spike protein (S protein) (Goswami and Bagchi 2020; Walls et al. 2020; Li et al. 2005). Thus, if the binding interface between the S protein and ACE2 is blocked, the virus-receptor complex cannot form, and infection should be prevented.

The spike glycoprotein, which forms a homotrimer protruding from the outer surface of the virion, facilitates the entry of the virus into host cells (Walls et al. 2016). The spike glycoprotein contains approximately 1300 amino acids and is expressed as a single precursor polypeptide chain that is cleaved by host furin-like proteases into the amino (N)-terminal S1 subunit and the carboxyl (C)-terminal S2 subunit. The main roles of the spike glycoprotein are host cell binding, recognition of the host receptor, and stabilization of the fusion between the host cell membrane and the viral membrane during infection (Du et al. 2009; Millet and Whittaker 2015). Fig. 1 shows the homotrimer and a monomer of the S glycoprotein. The two conformations of the spike glycoprotein trimer are shown in Fig. 1a and b; the ectodomain trimer in the closed conformation (Fig. 1a) has 3 symmetrical chains with 3 binding sites for ACE2. These binding sites are crucial in the crystal structure of the SARS-CoV-2-ACE2 complex (Li et al. 2005). The accessible form of the SARS-CoV-2 spike glycoprotein is an asymmetric reconstruction of the trimer with a single B domain opened (Fig. 1b) (Walls et al. 2020). These observations suggest that spike glycoprotein trimers in the accessible form are found in the coronaviruses that cause severe infectious diseases, whereas the inaccessible conformation is mostly detected in those associated with the common cold (Guan et al. 2003; Li et al. 2004; Wan et al. 2020). Based on recent evidence, the binding affinity of SARS-CoV for human ACE2 is correlated with the viral transmission rate, viral replication in different species, and disease severity (Graham and Baric 2010; Hofmann and Pöhlmann 2004). It is believed that the spike glycoprotein trimers of the most pathogenic coronaviruses, such as SARS-CoV and MERS-CoV, spontaneously adopt both the inaccessible and accessible conformations (Walls et al. 2020). S1 and S2 are the two functional subunits of the spike glycoprotein, responsible for binding to the host cell receptor and for fusion of the viral and cellular membranes, respectively (Walls et al. 2016; Belouzard et al. 2009; Bosch et al. 2003; Kirchdoerfer et al. 2016). The S1 subunit mediates attachment of the virus to the cell membrane by recognizing specific receptors on the host cell surface (Li 2015; Li 2016; Lu et al. 2015; Graham and Baric 2010). The S2 subunit contains a hydrophobic fusion peptide and two heptad repeat regions (Song et al. 2018). Upon attachment of the spike receptor-binding domain to the cell receptor ACE2, conformational changes occur in the S1 and S2 subunits, leading to the exposure of the fusion loop and its insertion into the target cell membrane (Hofmann and Pöhlmann 2004; Lan et al. 2020).

Several groups of ligands were considered as potential blockers of the binding of the spike glycoprotein to ACE2, namely, antiviral agents, flavonoids, fluorophenyl compounds, phenylpropanoids, some drugs used for the treatment of SARS-CoV-2 infection, and compounds similar to fluorophenyl compounds. These groups were virtually screened using the PubChem database, and finally, 3 compounds containing propane groups were chosen. Antiviral compounds were included because of their antiviral properties and their reported effectiveness against SARS-CoV-2. Flavonoids, a category of natural substances with variable phenolic structures, are present in nearly all fruits and vegetables (Panche et al. 2016). These natural products are well known for their beneficial effects on human health, such as antimicrobial, antioxidant, anticancer, and antiviral activity (Cushnie and Lamb 2005; Pietta 2000; Ren et al. 2003; Zhou and Li 2007). Fluorophenyl compounds contain a phenyl group bearing a fluorine substituent. Studies have demonstrated that compounds with 2-fluorophenyl, 3-fluorophenyl, and 4-fluorophenyl groups have antibiotic and antifungal activity, so these compounds were included in the docking analyses in our study (Saleh et al. 2010).
Phenylpropanoids are a class of plant secondary metabolites derived from aromatic amino acids: phenylalanine in most plants, or tyrosine in some monocots (Deng and Lu 2017). These compounds are beneficial for human health and could be applied for therapeutic purposes as antioxidant, anticancer, antiviral, anti-inflammatory, wound-healing, and antibacterial agents (Korkina et al. 2011).

Blocking the interaction between the spike glycoprotein and ACE2 can conceivably prevent the entry of the virus into host cells. In this study, using molecular docking analysis, we therefore sought to identify new active and stable inhibitors of the SARS-CoV-2 spike glycoprotein S1 subunit from the six groups of compounds mentioned above. AutoDock Vina (http://autodock.scripps.edu) is a popular open-source application for molecular docking and the prediction of ligand-receptor interactions. In the drug discovery process, molecular docking is considered a computationally intensive and only partially reliable method.

Fig. 1 (a) Closed SARS-CoV-2 spike glycoprotein trimer. (b) Open SARS-CoV-2 spike glycoprotein trimer. (c) Monomer of the S glycoprotein with its different subunits

Methods

Protein preparation

As mentioned above, the B domain of the S1 subunit is responsible for the differing pathogenicity of SARS-CoV-2; hence, in this study, only the B domain was examined, in both the accessible and inaccessible conformations of the spike glycoprotein. Both conformations of the SARS-CoV-2 spike glycoprotein were downloaded from the Protein Data Bank (Table 1) (Berman et al. 2000). First, MODELLER 9.2 software was used to model the missing residues located in the S1 subunit for both selected B domains. Following the modeling of the chains, the residue numbering was altered in both conformations relative to the structures downloaded from the PDB: in the accessible form, 87 amino acids were deleted (amino acid 87 became amino acid 1 in the sequence order), while 102 amino acids were removed from the inaccessible form (Webb and Sali 2016; Fiser and Do 2000). The B domains and ligands were then converted into the PDBQT format for docking with the AutoDock Vina software (Trott and Olson 2010). Before docking, polar hydrogens and Gasteiger charges were added to the B domains and ligands. AutoDock Vina was used to examine ligand binding to the B domain. Additionally, blind docking of the ligands was performed to identify possible binding sites in the S1 subunit. To this end, the entire protein was covered with a grid box of dimensions 36.70 × 50 × 70.01 Å for the accessible form of the protein and 63.29 × 52.10 × 50.14 Å for the inaccessible form, with a grid spacing of 1 Å. Finally, the conformations with the most negative binding energies at the binding sites reported in recent studies were chosen (Fig. 2) (Walls et al. 2020; Lan et al. 2020; Yan et al. 2020).
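
As an illustration of the blind-docking setup described above, the following Python sketch writes an AutoDock Vina configuration file for one receptor-ligand pair. The box sizes are those reported for the accessible form; the box center, the file names, and the exhaustiveness value are hypothetical placeholders, since they are not given in the text, and the sketch is not the script used in this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridBox:
    """Search box in angstroms: center coordinates and edge lengths."""
    center: tuple[float, float, float]
    size: tuple[float, float, float]


def vina_config(receptor: str, ligand: str, box: GridBox,
                out: str, exhaustiveness: int = 8) -> str:
    """Return the text of an AutoDock Vina configuration file."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    for axis, value in zip("xyz", box.center):
        lines.append(f"center_{axis} = {value:.2f}")
    for axis, value in zip("xyz", box.size):
        lines.append(f"size_{axis} = {value:.2f}")
    lines.append(f"out = {out}")
    lines.append(f"exhaustiveness = {exhaustiveness}")
    return "\n".join(lines) + "\n"


# Box sizes reported for the accessible form; the center is a placeholder
# (assumption) and the file names are hypothetical.
open_box = GridBox(center=(0.0, 0.0, 0.0), size=(36.70, 50.00, 70.01))
config_text = vina_config("6vyb_B_domain.pdbqt", "ligand.pdbqt",
                          open_box, out="ligand_out.pdbqt")
print(config_text)
```

The resulting text can be saved to a file and passed to the program with `vina --config <file>`; in practice, the box center would be set to the geometric center of the prepared B domain.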

Table 1 Crystal structures obtained from the RCSB Protein Data Bank
Fig. 2 Steps of the molecular docking of the B domain of the S protein with the ligands

Ligand preparation

The 3D structures of the ligands were retrieved from the ChemSpider and PubChem databases, and the files were then converted into the PDB format using the molecular visualization package UCSF Chimera (Meng et al. 2006; Pettersen et al. 2004). To prepare and optimize the ligands for docking, polar hydrogen atoms were added, the number of torsional degrees of freedom (nTDOF) was determined, and Gasteiger charges were calculated for all ligands. All ligands were ranked based on their physicochemical properties, as shown in Table S1.

Ligand-receptor interaction analysis

To visualize intermolecular interactions (e.g., hydrophobic interactions, hydrogen bonds, halogen bonds, and π/aromatic interactions), Accelrys Discovery Studio Visualizer version 4.1 (ADSV) was used. In addition, intermolecular hydrogen bonds were examined using LigPlot+ v.2.2, PyMOL v.2.3.2, and UCSF Chimera 1.12 (Laskowski and Swindells n.d.; BIOVIA 2017; Studio 2008). Using UCSF Chimera and ADSV, all hydrogen bonds were included, and the necessary edits were made to the ligand topologies.

Drug-like characteristics

It is necessary to analyze the main parameters associated with absorption, distribution, metabolism, and excretion (ADME) properties, such as Lipinski’s rule of five, drug solubility, pharmacokinetic properties, molar refractivity, and drug-likeness, in order to produce efficient medicines with proper therapeutic indices (Bueno 2020; Lipinski 2004). Drug design requires ADME analysis early in the discovery process, at a stage when many compounds are potential candidates but access to physical samples is limited. Therefore, the ADME properties of candidate ligands are predicted computationally (Daina et al. 2017). The ADME analysis of all candidate ligands was carried out using the SwissADME web tool (http://www.swissadme.ch). Lipinski’s rule of five states that an orally active compound should not violate more than one of the following criteria: molecular weight (MWT) ≤ 500 Da, log P ≤ 5, H-bond donors ≤ 5, and H-bond acceptors ≤ 10 (Lipinski et al. 1997). Moreover, the pan-assay interference compounds (PAINS) filter identifies sub-structural features that help to recognize compounds appearing as frequent hitters (promiscuous compounds) in many high-throughput biochemical screens (Baell and Holloway 2010). A web server, FAF-Drugs3, was used for filtering the compound library before in silico screening and related modeling studies (Lagorce et al. 2015).
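
The rule-of-five check can be sketched in a few lines of Python. The sketch below assumes that the four descriptors have already been computed (for instance, by SwissADME) and uses the standard thresholds of Lipinski et al. (1997): molecular weight of at most 500 Da, log P of at most 5, at most 5 H-bond donors, and at most 10 H-bond acceptors. The `Descriptors` container and both example compounds are hypothetical, and the descriptor values are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Hypothetical container for precomputed descriptors of one compound."""
    mol_weight: float   # molecular weight in Da
    log_p: float        # calculated octanol-water partition coefficient
    h_donors: int       # number of H-bond donors
    h_acceptors: int    # number of H-bond acceptors


def lipinski_violations(d: Descriptors) -> list[str]:
    """List the rule-of-five criteria that the compound violates."""
    checks = [
        (d.mol_weight > 500, "molecular weight > 500 Da"),
        (d.log_p > 5, "log P > 5"),
        (d.h_donors > 5, "H-bond donors > 5"),
        (d.h_acceptors > 10, "H-bond acceptors > 10"),
    ]
    return [label for failed, label in checks if failed]


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """A compound passes if it violates at most `max_violations` criteria."""
    return len(lipinski_violations(d)) <= max_violations


# Invented descriptor values, for illustration only.
small_ligand = Descriptors(mol_weight=250.2, log_p=2.1, h_donors=2, h_acceptors=2)
large_glycoside = Descriptors(mol_weight=780.7, log_p=-1.5, h_donors=11, h_acceptors=19)

for name, d in [("small_ligand", small_ligand), ("large_glycoside", large_glycoside)]:
    print(name, passes_rule_of_five(d), lipinski_violations(d))
```

Returning the list of violated criteria, rather than a bare pass/fail flag, makes it easy to report why a compound was rejected.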

Results

Molecular docking

Ligands that bind to the ACE2-binding site of the spike glycoprotein were identified by molecular docking. In this experiment, 111 compounds downloaded from the ChemSpider and PubChem databases were submitted to the molecular docking software. All ligands, with their chemical formulas, binding affinities in the accessible conformation, and interactions with SB domain residues through hydrogen bonds and hydrophobic contacts, are shown in Table 2, in which the residues at the binding site of the spike glycoprotein-ACE2 complex are in bold (the data for the inaccessible conformation are available in Supplementary File S2). Based on the molecular docking results, seven molecules were selected (Fig. 3) and subjected to drug-likeness filtering. The hydrogen-bond and hydrophobic interactions at the binding site of the spike glycoprotein-ACE2 complex are in bold in Table 3 for both the accessible and inaccessible conformations of the spike protein. Rossicaside A forms a hydrophobic interaction with Tyr347 in the accessible state, with a binding energy of −7.4 kcal/mol. As shown in Fig. 4, 1,2-ethanediol,1,2-bis(4-fluorophenyl), with a binding energy of −6.6 kcal/mol in the accessible conformation, forms hydrogen bonds with Gly394 and hydrophobic interactions with three residues, of which Tyr393 and Tyr403 are located at the binding site of the spike glycoprotein-ACE2 complex. As depicted in Fig. 5, 1,2-propanediol,3,3,3-trifluoro-2-phenyl-(2R), with a binding energy of −6.7 kcal/mol, forms a hydrogen bond with Gly394 and hydrophobic interactions with Tyr393, Asn399, and Tyr403. Also, 1,1-bis(3-fluorophenyl)-2-methoxyethanol, with a binding energy of −6.6 kcal/mol in the accessible conformation, forms hydrogen bonds with Gly394, Gln396, Asn399, and Gly400, while its hydrophobic interactions involve Tyr393 and Tyr403 (Fig. 6).
In addition, 1,1-diphenyl propane-1,2-diol forms two hydrogen bonds, with Gly394 and Asn399, and two hydrophobic interactions, with Tyr393 and Tyr403 (Fig. 7). The seventh chosen ligand was (S)-1,1-diphenylpropane-1,2-diol, with a binding energy of −6.2 kcal/mol, which forms hydrogen bonds with Gly394, Gln396, and Asn399 and hydrophobic interactions with Tyr393 and Tyr403 (Fig. 8). For the inaccessible conformation, the hydrogen bonds and hydrophobic interactions are displayed in Table 1 (all hydrogen bonds in the closed state are shown in Supplementary File S2).
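
Binding energies such as those above are written by AutoDock Vina into the output PDBQT file, one `REMARK VINA RESULT:` line per pose. A minimal Python sketch of how these values could be collected and the ligands ranked is given below; the ligand names and the sample output text are hypothetical and do not reproduce the data of this study.

```python
def vina_affinities(pdbqt_text: str) -> list[float]:
    """Return the binding energies (kcal/mol) of all poses, in file order."""
    energies = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # The first number after the colon is the binding energy.
            energies.append(float(line.split(":", 1)[1].split()[0]))
    return energies


def rank_ligands(outputs: dict[str, str]) -> list[tuple[str, float]]:
    """Rank ligands by their best (most negative) pose energy."""
    best = {}
    for name, text in outputs.items():
        energies = vina_affinities(text)
        if energies:  # skip outputs that contain no poses
            best[name] = min(energies)
    return sorted(best.items(), key=lambda item: item[1])


# Hypothetical, shortened Vina outputs for two ligands (atom records omitted).
SAMPLE_A = """MODEL 1
REMARK VINA RESULT:      -7.4      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:      -6.9      1.532      2.104
ENDMDL
"""
SAMPLE_B = """MODEL 1
REMARK VINA RESULT:      -6.2      0.000      0.000
ENDMDL
"""

ranking = rank_ligands({"ligand_b": SAMPLE_B, "ligand_a": SAMPLE_A})
for name, energy in ranking:
    print(f"{name}: {energy:.1f} kcal/mol")
```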

Table 2 Results of molecular docking of 6vyb with all ligands studied in this work. The five ligand groups are ranked by binding affinity
Table 3 Summary of the top seven ranked ligands screened against the RBD of the 2019-nCoV spike protein, with their respective classifications, chemical formulas, binding affinities, and hydrogen-bond and hydrophobic interacting residues
Fig. 3 Chemical structures of the selected ligands. Ball-and-stick models show the optimized structures for molecular docking

Fig. 4 Binding-site amino acid residues of the SARS-CoV-2 S protein interacting with 1,2-ethanediol,1,2-bis(4-fluorophenyl), and LigPlot+ analysis of its binding conformation in the open state

Fig. 5 Binding-site amino acid residues of the SARS-CoV-2 S protein interacting with 1,2-propanediol,3,3,3-trifluoro-2-phenyl-(2R), and LigPlot+ analysis of its binding conformation in the open state

Fig. 6 Binding-site amino acid residues of the SARS-CoV-2 S protein interacting with 1,1-bis(3-fluorophenyl)-2-methoxyethanol, and LigPlot+ analysis of its binding conformation in the open state

Fig. 7 Binding-site amino acid residues of the SARS-CoV-2 S protein interacting with 1,1-diphenyl propane-1,2-diol, and LigPlot+ analysis of its binding conformation in the open state

Fig. 8 Binding-site amino acid residues of the SARS-CoV-2 S protein interacting with (S)-1,1-diphenylpropane-1,2-diol, and LigPlot+ analysis of its binding conformation in the open state

Drug-like characteristics of the chosen ligands

The ADME database contains the latest and most comprehensive information about the interactions of substances with drug-metabolizing enzymes and drug transporters that are specific to humans. It is designed for use in drug research and development, including the study of drug-drug interactions (Matter et al. 2001). In order to assess the pharmacokinetic characteristics of the chosen ligands, the drug-likeness of the 7 chosen ligands was evaluated based on Lipinski’s rule of five (Lipinski et al. 1997). Lipinski’s rule of five suggests that poor absorption is more probable when a compound has more than 5 H-bond donors, more than 10 H-bond acceptors, a molecular weight above 500 Da, and a calculated lipophilicity (LogP) above 5 (Lipinski et al. 1997). The qualifying range for molar refractivity was 40–130, with a mean value of 97 (Matter et al. 2001). As shown in Table 3, Rossicaside A is not suitable according to Lipinski’s rule of five, since it violates three rules and its molar refractivity exceeds 130. The remaining ligands met the required ADME criteria (Table 3). PAINS filtering was conducted to identify chemical groups belonging to the PAINS category. Six out of seven ligands were accepted as drug-like compounds and passed the physicochemical filter without any structural alert (Table 4). Rossicaside A was discarded because it contains a catechol group, one of the PAINS sub-structural moieties. FAF-Drugs3 filtering also rejected Rossicaside A, while the other ligands were accepted.

Table 4 FAF-Drugs3 and pan assay interference (PAINS) filtering of 7 identified ligands

Discussion

In computer-aided drug design, molecular docking is widely used to explore the binding interactions between prospective drugs and the domains, active sites, and binding sites of target molecules (Raj et al. 2019; Hughes et al. 2011). For a decade, molecular docking has been a valuable tool for the exploration of potential compounds, and it is used to model binding between proteins and small molecules at the atomic level. This helps to characterize the interactions of small molecules at the binding sites of target proteins (Meng et al. 2011). In viral infections, given the lack of successful antiviral therapies, there is an urgent need to speed up drug development to find new and effective drug candidates. The spike glycoprotein of SARS-CoV-2 plays significant roles in binding, fusion, and entry into host cells (Yan et al. 2020). The B domain of this protein gives rise to the open and closed forms of the coronavirus spike. The spike glycoprotein is a trimer of three polypeptide chains, namely, chains A, B, and C, each of which constitutes a monomer (Walls et al. 2020). In this study, the B chain of the spike glycoprotein in both the open and closed forms (PDB ID: 6vyb and 6vxx, respectively) was used for modeling of the missing residues and for molecular docking. For this purpose, 111 compounds obtained from the ChemSpider and PubChem databases (Table 1) were screened to find the optimal ligands to block the B-chain binding site that interacts with ACE2. The compound IDs (CIDs) of the selected ligands in the PubChem database are as follows: CID 13916145, CID 193962, CID 2755890, CID 11095754, CID 53722331, CID 555451, and CID 736300, which interact with the binding site of the spike glycoprotein-ACE2 complex with binding energies of −7.5 kcal/mol, −7.4 kcal/mol, −6.7 kcal/mol, −6.7 kcal/mol, −6.6 kcal/mol, −6.4 kcal/mol, and −6.2 kcal/mol, respectively.
Among the different types of interactions that are usually analyzed, such as H-bond, π-π, and amide-π interactions, the ligand binding energy attracts the most attention, and the characteristics of the amino acids involved in the binding site are assessed further (Raj et al. 2019; Hughes et al. 2011). The final proposed ligand was Rossicaside A, a phenylpropanoid; phenylpropanoids and their derivatives are commonly found in fruits, vegetables, cereal grains, beverages, spices, and herbs. They have antimicrobial, antioxidant, anti-inflammatory, anti-diabetic, and anti-cancer activities, as well as renoprotective, neuroprotective, cardioprotective, and hepatoprotective effects (Jia et al. 2018; Shyr et al. 2006). Etravirine is a non-nucleoside reverse transcriptase inhibitor that is orally administered and prescribed for the treatment of AIDS in patients who are resistant to other antiretrovirals (ARVs) (Croxtall 2012). Fluorophenyl groups occur in different combinations; for instance, 1,2-ethanediol,1,2-bis(4-fluorophenyl) and 1,1-bis(3-fluorophenyl)-2-methoxyethanol are two fluorophenyl compounds that form hydrogen bonds and hydrophobic interactions at the binding site of the spike glycoprotein-ACE2 complex. Therefore, three further ligands (1,2-propanediol,3,3,3-trifluoro-2-phenyl-(2R); 1,1-diphenyl propane-1,2-diol; and (S)-1,1-diphenylpropane-1,2-diol) were included in our study since they are structurally similar to fluorophenyl compounds. Given the pharmacological properties of the selected ligands, it is concluded that the six known ligands from the PubChem database possess many of the important pharmacophore properties required for adequate inhibition of the SB domain.
Moreover, their binding to the B chain in both conformations forms a stable complex with a robust network of hydrogen bonds and hydrophobic interactions involving critical residues, namely, Tyr347, Phe377, Tyr393, Gly394, Gln396, Asn399, Gly400, Tyr403, Tyr408, Gly409, Gln411, and Asn414, which were recently predicted to be in close contact with the human host cell receptor (Walls et al. 2020; Shang et al. 2020). Using ADMEtox filtering, all of the identified ligands were assessed in terms of pharmacokinetic properties. Lipinski’s rule of five is commonly used to estimate whether a compound is likely to be orally active in humans. Based on these rules, potential drugs should have (a) a molecular mass of no more than 500 Da, (b) limited lipophilicity (LogP no greater than 5), (c) no more than 5 hydrogen bond donors, (d) no more than 10 hydrogen bond acceptors, and (e) a molar refractivity between 40 and 130. Drug-likeness is another factor assessed in ADMEtox filtering. A compound that satisfies three or more of the parameters mentioned above may be a candidate drug (Table 3). PAINS and FAF-Drugs3 are two filters used in the drug screening process. FAF-Drugs3 is a filtering program for preparing large compound libraries used for in silico screening or modeling of drug-protein interactions. PAINS filtering can also screen thousands of compounds within a few seconds for substructures that tend to interact promiscuously with proteins, preventing further unnecessary analyses.

As displayed in Table 5, among the seven final candidate ligands, Rossicaside A was excluded by these filtering methods, while the others were accepted. Molecular docking was employed to reveal whether there was any close interaction between the potential ligands and the spike glycoprotein. Despite some drawbacks, such as not reproducing in vitro or in vivo conditions, molecular docking allows researchers to make better-informed decisions within a shorter timeframe. The results showed acceptable binding affinity of Etravirine, 1,2-ethanediol,1,2-bis(4-fluorophenyl), 1,2-propanediol,3,3,3-trifluoro-2-phenyl-(2R), 1,1-bis(3-fluorophenyl)-2-methoxyethanol, 1,1-diphenyl propane-1,2-diol, and (S)-1,1-diphenylpropane-1,2-diol to the binding site of the spike glycoprotein-ACE2 complex.
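
To give a sense of scale for binding energies of this magnitude, a docking score can be converted into an approximate dissociation constant through the standard thermodynamic relation ΔG = RT ln Kd. The Python sketch below assumes that the docking score can be read as a binding free energy at 298.15 K, which is only a rough approximation; the resulting values are order-of-magnitude estimates and not measured affinities.

```python
import math

GAS_CONSTANT = 1.987204e-3  # gas constant in kcal/(mol K)


def dissociation_constant(delta_g: float, temperature: float = 298.15) -> float:
    """Estimate Kd (mol/L) from a binding free energy in kcal/mol.

    Uses delta_g = R * T * ln(Kd), i.e. Kd = exp(delta_g / (R * T)).
    """
    return math.exp(delta_g / (GAS_CONSTANT * temperature))


# The two ends of the range of binding energies reported for the selected ligands.
for energy in (-7.4, -6.2):
    kd = dissociation_constant(energy)
    print(f"{energy:+.1f} kcal/mol -> Kd of roughly {kd * 1e6:.1f} micromolar")
```

Under these assumptions, binding energies between about −6 and −7.5 kcal/mol correspond to dissociation constants in the micromolar range.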

Table 5 ADME properties of selected ligands against SB domain

Conclusion

SARS-CoV-2 has emerged as a significant pandemic pathogen. It has been shown that the SARS-CoV-2 spike glycoprotein is a critical and promising target for the inhibition of COVID-19. In the present study, we used molecular docking to seek the optimal ligands that interact with the B chain of the SARS-CoV-2 spike glycoprotein at the binding site of the spike glycoprotein-ACE2 complex. Molecular docking selected six ligands (Etravirine [−7.4 kcal/mol], 1,2-ethanediol,1,2-bis(4-fluorophenyl) [−6.7 kcal/mol], 1,2-propanediol,3,3,3-trifluoro-2-phenyl-(2R) [−6.7 kcal/mol], 1,1-bis(3-fluorophenyl)-2-methoxyethanol [−6.6 kcal/mol], 1,1-diphenyl propane-1,2-diol [−6.4 kcal/mol], and (S)-1,1-diphenylpropane-1,2-diol [−6.2 kcal/mol]) from different groups with potential inhibitory activity and high affinity for the SARS-CoV-2 spike glycoprotein, which could prevent the formation of the spike glycoprotein-ACE2 complex. The selected compounds were subsequently submitted to ADME web servers to analyze their drug-likeness and potential toxicity to human cells. The compounds that met the required criteria could be tested in animal models to evaluate their efficacy in vivo.

Limitations

Due to the high risk posed by this virus, the experimental part of this study was omitted.